How to add a column to an empty DataFrame in Julia
I want to append a vector as a column to an empty DataFrame. Suppose I define an empty DataFrame like this:
import DataFrames
dataframe = DataFrames.DataFrame()
Then I want to append this vector as a column to dataframe:
vec = [1,2,3]
I tried push!(dataframe, vec), but got this error:
DimensionMismatch("Length of `row` does not match `DataFrame` column count.")
Stacktrace:
[1] push!(df::DataFrames.DataFrame, row::Vector{Int64}; promote::Bool)
@ DataFrames C:\Users\Shayan\.julia\packages\DataFrames\BM4OQ\src\dataframe\dataframe.jl:1691
[2] push!(df::DataFrames.DataFrame, row::Vector{Int64})
@ DataFrames C:\Users\Shayan\.julia\packages\DataFrames\BM4OQ\src\dataframe\dataframe.jl:1680
[3] top-level scope
@ c:\Users\Shayan\Documents\PyJul Scripts\Jul-test.ipynb:2
[4] eval
@ .\boot.jl:373 [inlined]
[5] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1196
[6] #invokelatest#2
@ .\essentials.jl:716 [inlined]
[7] invokelatest
@ .\essentials.jl:714 [inlined]
[8] (::VSCodeServer.var"#164#165"{VSCodeServer.NotebookRunCellArguments, String})()
@ VSCodeServer c:\Users\Shayan\.vscode\extensions\julialang.language-julia-1.6.17\scripts\packages\VSCodeServer\src\serve_notebook.jl:19
[9] withpath(f::VSCodeServer.var"#164#165"{VSCodeServer.NotebookRunCellArguments, String}, path::String)
@ VSCodeServer c:\Users\Shayan\.vscode\extensions\julialang.language-julia-1.6.17\scripts\packages\VSCodeServer\src\repl.jl:184
[10] notebook_runcell_request(conn::VSCodeServer.JSONRPC.JSONRPCEndpoint{Base.PipeEndpoint, Base.PipeEndpoint}, params::VSCodeServer.NotebookRunCellArguments)
@ VSCodeServer c:\Users\Shayan\.vscode\extensions\julialang.language-julia-1.6.17\scripts\packages\VSCodeServer\src\serve_notebook.jl:13
[11] dispatch_msg(x::VSCodeServer.JSONRPC.JSONRPCEndpoint{Base.PipeEndpoint, Base.PipeEndpoint}, dispatcher::VSCodeServer.JSONRPC.MsgDispatcher, msg::Dict{String, Any})
@ VSCodeServer.JSONRPC c:\Users\Shayan\.vscode\extensions\julialang.language-julia-1.6.17\scripts\packages\JSONRPC\src\typed.jl:67
[12] serve_notebook(pipename::String, outputchannel_logger::Base.CoreLogging.SimpleLogger; crashreporting_pipename::String)
@ VSCodeServer c:\Users\Shayan\.vscode\extensions\julialang.language-julia-1.6.17\scripts\packages\VSCodeServer\src\serve_notebook.jl:136
[13] top-level scope
@ c:\Users\Shayan\.vscode\extensions\julialang.language-julia-1.6.17\scripts\notebook\notebook.jl:32
[14] include(mod::Module, _path::String)
@ Base .\Base.jl:418
[15] exec_options(opts::Base.JLOptions)
@ Base .\client.jl:292
[16] _start()
@ Base .\client.jl:495
I also tried insert!(dataframe, vec), but got this:
MethodError: no method matching insert!(::DataFrames.DataFrame, ::Vector{Int64})
Closest candidates are:
insert!(!Matched::DataStructures.AVLTree{K}, ::K) where K at C:\Users\Shayan\.julia\packages\DataStructures\vSp4s\src\avl_tree.jl:128
insert!(!Matched::DataStructures.SortedSet, ::Any) at C:\Users\Shayan\.julia\packages\DataStructures\vSp4s\src\sorted_set.jl:114
insert!(!Matched::DataStructures.SortedDict{K, D, Ord}, ::Any, !Matched::Any) where {K, D, Ord<:Base.Order.Ordering} at C:\Users\Shayan\.julia\packages\DataStructures\vSp4s\src\sorted_dict.jl:268
How should I do this? Any help would be appreciated.
Additional note: vec is intentionally not defined before dataframe! I mean, I have to create the empty DataFrame first!
You can proceed as follows:
julia> r=DataFrame(:a=>rand(5),:b=>rand(5))
5×2 DataFrame
 Row │ a         b
     │ Float64   Float64
─────┼────────────────────
   1 │ 0.8613    0.207534
   2 │ 0.994096  0.561571
   3 │ 0.220975  0.429286
   4 │ 0.884805  0.835078
   5 │ 0.964035  0.653509
julia> r[:,:c]=rand(5)
5-element Vector{Float64}:
0.5722614445699863
0.1582911302051686
0.14114436033460553
0.20981872218154363
0.07636493031324465
julia> r
5×3 DataFrame
 Row │ a         b         c
     │ Float64   Float64   Float64
─────┼───────────────────────────────
   1 │ 0.8613    0.207534  0.572261
   2 │ 0.994096  0.561571  0.158291
   3 │ 0.220975  0.429286  0.141144
   4 │ 0.884805  0.835078  0.209819
   5 │ 0.964035  0.653509  0.0763649
NB: this also works when starting from an empty DataFrame:
julia> r=DataFrame()
0×0 DataFrame
julia> r[:,:c]=rand(5)
5-element Vector{Float64}:
0.6792303081607677
0.08094072339097869
0.5171831771259873
0.35343166177619845
0.44751700973394026
julia> r
5×1 DataFrame
 Row │ c
     │ Float64
─────┼───────────
   1 │ 0.67923
   2 │ 0.0809407
   3 │ 0.517183
   4 │ 0.353432
   5 │ 0.447517
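As background on the error in the question: push! on a DataFrame appends a row, not a column, which is why a 3-element vector does not match the column count (zero) of an empty DataFrame. A minimal sketch of the difference, assuming DataFrames.jl 1.x and `using DataFrames` (the column name x is only illustrative):

```julia
using DataFrames

df = DataFrame()
df.x = [1, 2, 3]       # add a column to the empty DataFrame first

push!(df, [4])         # push! appends ONE ROW; its length must equal ncol(df)
push!(df, (x = 5,))    # a NamedTuple row works as well

size(df)               # (5, 1): five rows, one column
```

So column-wise growth goes through assignment or insertcols!, while push! is reserved for adding rows once the columns exist.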
Update & summary (completed using Bogumił Kamiński's answer)
You can do:
d[:,:colname] = x_vector # copy of x
d[!,:colname] = x_vector # no copy of x (shared)
If x is a scalar, see Bogumił Kamiński's answer.
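The practical consequence of copy versus no copy shows up when the original vector is mutated afterwards. A small sketch, assuming DataFrames.jl 1.x (the column names copied and shared are made up for illustration):

```julia
using DataFrames

x = [1, 2, 3]
d = DataFrame()

d[:, :copied] = x   # stores a copy of x
d[!, :shared] = x   # stores x itself, no copy

x[1] = 100          # mutate the original vector

d.copied            # [1, 2, 3]: unaffected by the change
d.shared            # [100, 2, 3]: the column aliases x
```

Use the `:` form when the DataFrame should own its data, and the `!` form when avoiding the allocation matters and the aliasing is acceptable.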
Depending on your needs, you have the following options.
- Add a vector without copying
julia> x = [1, 2, 3]
3-element Vector{Int64}:
1
2
3
julia> df = DataFrame()
0×0 DataFrame
julia> df.x = x
3-element Vector{Int64}:
1
2
3
julia> df.x === x
true
or
julia> x = [1, 2, 3]
3-element Vector{Int64}:
1
2
3
julia> df = DataFrame()
0×0 DataFrame
julia> df[!, :x] = x
3-element Vector{Int64}:
1
2
3
julia> df.x === x
true
- Add a vector with copying
julia> x = [1, 2, 3]
3-element Vector{Int64}:
1
2
3
julia> df = DataFrame()
0×0 DataFrame
julia> df[:, :x] = x
3-element Vector{Int64}:
1
2
3
julia> df.x == x
true
julia> df.x === x
false
- If you have a scalar, you can do the following (this also works for vectors)
julia> df = DataFrame()
0×0 DataFrame
julia> insertcols!(df, :x => 1)
1×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
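For the vector case mentioned above, here is a hedged sketch of insertcols! on an empty DataFrame, assuming DataFrames.jl 1.x. The column names x, y and z are illustrative. To my understanding insertcols! copies vectors by default and takes a copycols keyword to turn that off:

```julia
using DataFrames

v = [1, 2, 3]
df = DataFrame()

insertcols!(df, :x => v)                    # copies v by default
insertcols!(df, :y => v; copycols=false)    # stores v itself
insertcols!(df, :z => 0)                    # scalar is broadcast to the 3 existing rows

df.x === v    # false
df.y === v    # true
df.z          # [0, 0, 0]
```

With `import DataFrames` as in the question, the same calls need the module prefix, for example DataFrames.insertcols!(dataframe, :x => vec).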