Get column names from DataFrame or JuliaDB table
How can I get the column names from a DataFrame object or a JuliaDB IndexedTable object? Is this possible?
Reproducible code:
using JuliaDB
import DataFrames
DF = DataFrames
# CREATES AN EXAMPLE TABLE WITH JULIADB
colnames = [:samples, :A, :B, :C, :D]
primary_key = [:samples]
coltypes = [Int[], Float64[],Float64[],Float64[],Float64[]]
sample_sizes = [100,200,300]
example_values = (1, 0.4, 0.3, 0.2, 0.1)
mytable = table(coltypes..., names=colnames, pkey=primary_key) # initialize empty table
# add some data to table
for i in sample_sizes
    example_values = (i, 0.4, 0.3, 0.2, 0.1)
    table_params = [(col=>val) for (col,val) in zip(colnames, example_values)]
    push!(rows(mytable), (; table_params...)) # add row
    mytable = table(mytable, pkey = primary_key, copy = false) # sort rows by primary key
end
mytable = table(unique(mytable), pkey=primary_key) # remove duplicate rows which don't exist
# MAKES A DATAFRAME FROM JULIADB TABLE
df = DF.DataFrame(mytable)
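For example, given the code above, how would you check whether mytable or df has a column :E (so that such a column can be added if it does not exist yet)?
Ultimately, I am looking for the Julia equivalent of the following Python code: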
if 'E' in df.columns:
    # ...
else:
    # ...
If df is a data frame, you can write:
if :E in names(df)
...
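(In JuliaDB.jl this would be JuliaDB.colnames. Note that in newer DataFrames.jl releases names(df) returns strings rather than symbols, so there the check becomes "E" in names(df) or :E in propertynames(df).)
A faster option (in terms of run time) that is also available for data frames is: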
if hasproperty(df, :E)
...
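To tie this back to the question, here is a minimal sketch of adding the column only when it is missing. The small data frame and the placeholder zeros are assumptions made for illustration, and it assumes Julia 1.2 or later (for hasproperty) with a reasonably recent DataFrames.jl:

```julia
using DataFrames

# Hypothetical stand-in for the df built in the question
df = DataFrame(samples = [100, 200, 300], A = [0.4, 0.4, 0.4])

if hasproperty(df, :E)
    println("column :E already exists")
else
    # Placeholder values; replace with whatever :E should hold
    df.E = zeros(nrow(df))
end
```

After this runs, df has a third column :E, and running the same check again takes the first branch.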
A bit slower, but useful in other cases (it also works with JuliaDB.jl, but you first have to load Tables.jl and write Tables.columnindex instead):
if columnindex(df, :E) != 0
...
The last example, columnindex, is probably the most complex; how it works is described in its documentation:
help?> columnindex
search: columnindex
Tables.columnindex(table, name::Symbol)
Return the column index (1-based) of a column by name in a table with a
known schema; returns 0 if name doesn't exist in table
────────────────────────────────────────────────────────────────────────────
given names and a Symbol name, compute the index (1-based) of the name in
names
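As a small illustration of the behaviour described in that docstring, the sketch below looks up one column that exists and one that does not. The data frame is a hypothetical stand-in for df, and it assumes columnindex is available unqualified after using DataFrames, as in the snippet above:

```julia
using DataFrames

# Hypothetical stand-in for the df built in the question
df = DataFrame(samples = [100, 200, 300], A = [0.4, 0.4, 0.4])

idx_A = columnindex(df, :A)   # 1-based position of an existing column
idx_E = columnindex(df, :E)   # 0 signals that the column is absent

if idx_E == 0
    println(":E is missing; :A is column number ", idx_A)
end
```

Compared with hasproperty, this is mainly worth using when the position of the column is needed as well as its existence.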