比如我有一个宽型的数据框 dfw, tradedate 为 id_vars, r_a, r_b, v_a, v_b 为 measure_vars
using DataFrames
using Dates
using StatsBase
dts = Date(2022, 1, 1):Day(1):Date(2022, 12, 31)
d = Dict(:tradedate => dts,
:r_a => randn(length(dts)),
:r_b => randn(length(dts)),
:v_a => sample(100:2000, length(dts)),
:v_b => sample(500:3000, length(dts)))
dfw = DataFrame(d)
我希望转换后的长型数据框包含如下列:asset (a 或 b), tradedate, r, v
可以使用 stack 分别对 r 和 v 进行转化,然后再将两个数据框合并来达到目的。
dfl_r = stack(dfw, [:r_a, :r_b], :tradedate, variable_name = :asset, value_name = :r)
dfl_r[:, :asset] = replace.(dfl_r[:, :asset], "r_" => "")
select!(dfl_r, :asset, :tradedate, :r)
dfl_v = stack(dfw, [:v_a, :v_b], :tradedate, variable_name = :asset, value_name = :v)
dfl_v[:, :asset] = replace.(dfl_v[:, :asset], "v_" => "")
select!(dfl_v, :asset, :tradedate, :v)
dfl = innerjoin(dfl_r, dfl_v, on = [:asset, :tradedate])
这样尽管实现了目标,但非常笨拙,有没有类似 R 中 pivot_longer() ,可以一步完成多变量的长宽转换?
谢谢!