Query.jl cannot handle CSV.Rows

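I have a dataset of roughly 6 GB (too large to read into memory in one go). I want to read it row by row with CSV.Rows and then process it with Query.jl. After @groupby, the @map step throws an error. How can I fix this, or is there another tool that can handle it?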

using DataFrames
using CSV
using Query
data_path = "D:\\chen\\费用明细数据\\nn2022费用明细.csv" 
data_itr = CSV.Rows(data_path, reusebuffer=true)
data_itr |> @groupby(_.项目分类) |> @map({Key=key(_),  Total=sum(_.总价)}) |> DataFrame

The error message is as follows:
ERROR: type Row2 has no field 总价
Stacktrace:
[1] getproperty(g::Grouping{Any, CSV.Row2}, name::Symbol)
@ QueryOperators C:\Users\gxjk009.julia\packages\QueryOperators\dF1vq\src\enumerable\enumerable_groupby.jl:29
[2] (::var"#94#99")(333::Grouping{Any, CSV.Row2})
@ Main C:\Users\gxjk009.julia\packages\Query\85Sw7\src\query_translation.jl:58
[3] iterate
@ C:\Users\gxjk009.julia\packages\QueryOperators\dF1vq\src\enumerable\enumerable_map.jl:25 [inlined]
[4] iterate
@ C:\Users\gxjk009.julia\packages\Tables\T7rHm\src\tofromdatavalues.jl:45 [inlined]
[5] buildcolumns
@ C:\Users\gxjk009.julia\packages\Tables\T7rHm\src\fallbacks.jl:202 [inlined]
[6] columns
@ C:\Users\gxjk009.julia\packages\Tables\T7rHm\src\fallbacks.jl:265 [inlined]
[7] DataFrame(x::QueryOperators.EnumerableMap{Union{}, QueryOperators.EnumerableIterable{Grouping{Any, CSV.Row2}, QueryOperators.EnumerableGroupBy{Grouping{Any, CSV.Row2}, Any, CSV.Row2, QueryOperators.EnumerableIterable{CSV.Row2, CSV.Rows{Vector{UInt8}, Tuple{}, PosLen, PosLenString}}, var"#91#96", var"#92#97"}}, var"#94#99"}; copycols::Nothing)
@ DataFrames C:\Users\gxjk009.julia\packages\DataFrames\MA4YO\src\other\tables.jl:58
[8] DataFrame(x::QueryOperators.EnumerableMap{Union{}, QueryOperators.EnumerableIterable{Grouping{Any, CSV.Row2}, QueryOperators.EnumerableGroupBy{Grouping{Any, CSV.Row2}, Any, CSV.Row2, QueryOperators.EnumerableIterable{CSV.Row2, CSV.Rows{Vector{UInt8}, Tuple{}, PosLen, PosLenString}}, var"#91#96", var"#92#97"}}, var"#94#99"})
@ DataFrames C:\Users\gxjk009.julia\packages\DataFrames\MA4YO\src\other\tables.jl:49
[9] |>(x::QueryOperators.EnumerableMap{Union{}, QueryOperators.EnumerableIterable{Grouping{Any, CSV.Row2}, QueryOperators.EnumerableGroupBy{Grouping{Any, CSV.Row2}, Any, CSV.Row2, QueryOperators.EnumerableIterable{CSV.Row2, CSV.Rows{Vector{UInt8}, Tuple{}, PosLen, PosLenString}}, var"#91#96", var"#92#97"}}, var"#94#99"}, f::Type{DataFrame})
@ Base .\operators.jl:966
[10] top-level scope
@ REPL[18]:1

6 GB of data isn't large. I'd suggest adding more RAM; it's cheap.

:joy: I only have a small company-issued computer, so adding RAM isn't an option.

Has this problem been solved? I've run into a similar issue.

Load and process the data in a streaming fashion.
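A minimal sketch of what a streaming group-by sum could look like for the 项目分类 / 总价 columns from the question. The function name `streaming_groupsum` and the `parse_value` keyword are made up for illustration. It assumes the values arrive as strings (which is what CSV.Rows yields when no column types are given) and keeps only one running total per group, so memory grows with the number of groups rather than the number of rows. The demo uses a small vector of NamedTuples as a stand-in for the real file.

```julia
# Hypothetical helper: one pass over the rows, one running total per group.
# `rows` can be any iterator whose elements support `getproperty`,
# e.g. a CSV.Rows object or a vector of NamedTuples.
function streaming_groupsum(rows, keycol::Symbol, valcol::Symbol; parse_value = identity)
    totals = Dict{String, Float64}()
    for row in rows
        k = getproperty(row, keycol)
        v = getproperty(row, valcol)
        # Skip rows where the key or the value is missing.
        (ismissing(k) || ismissing(v)) && continue
        # Copy the key into an owned String; this matters when the
        # row buffer is reused between iterations (reusebuffer=true).
        key = String(k)
        totals[key] = get(totals, key, 0.0) + parse_value(v)
    end
    return totals
end

# Stand-in data; values are strings to mimic untyped CSV.Rows output.
demo_rows = [
    (项目分类 = "A", 总价 = "10.5"),
    (项目分类 = "B", 总价 = "4.0"),
    (项目分类 = "A", 总价 = "2.5"),
]

totals = streaming_groupsum(demo_rows, :项目分类, :总价;
                            parse_value = s -> parse(Float64, s))

# With the real file, the call would presumably look like:
#   rows = CSV.Rows(data_path; reusebuffer = true)
#   totals = streaming_groupsum(rows, :项目分类, :总价;
#                               parse_value = s -> parse(Float64, s))
```

The resulting `Dict` can be turned into a table afterwards if needed, since it holds only one entry per group.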

I switched to DuckDB at the time. These days I write my own streaming groupby.
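For anyone who wants to try the DuckDB route, here is a rough sketch of the same aggregation using DuckDB.jl, assuming the DuckDB and DataFrames packages are installed. It is not the code that was actually used. The temporary CSV and the output column names `category` and `total` are made up for illustration; with the real data, `csv_path` would point at the large file instead. DuckDB reads the CSV itself, so the file never has to be loaded into Julia as a whole.

```julia
using DuckDB, DataFrames

# Hypothetical stand-in for the real file: a tiny CSV with the same two columns.
csv_path = tempname() * ".csv"
write(csv_path, "项目分类,总价\nA,10.5\nB,4.0\nA,2.5\n")

# In-memory DuckDB database; the CSV is scanned directly by the query.
con = DBInterface.connect(DuckDB.DB, ":memory:")

sql = """
    SELECT "项目分类" AS category, SUM("总价") AS total
    FROM read_csv_auto('$csv_path')
    GROUP BY "项目分类"
    ORDER BY category
    """

# Materialize the (small) aggregated result as a DataFrame, then close.
result = DataFrame(DBInterface.execute(con, sql))
DBInterface.close!(con)
```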