Pass text as specified columns to open as [type] using read_csv from readr in R
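I have some .csv files that I want to open with the default column type set to "i" (integer). However, some files also contain specific columns that I want readr::read_csv to open with a defined type (the logic for choosing those columns is not important here; assume I know which columns belong to which files).
Is there a way to pass these columns to the col_types argument of read_csv while still having all other columns opened as integer?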
df <- data.frame(
  a = c(1, 2, 3, 4),
  b = sample(1:100, 4),
  c_text = c("hi", "I", "am", "text"),
  d_decimals = runif(4),
  e_more_text = c("another", "text", "column", "lol")
)
readr::write_csv(df, "/path/to/csv/file.csv")
character_cols <- c("c_text", "e_more_text")
double_cols <- "d_decimals"
data <- readr::read_csv(
  "/path/to/csv/file.csv",
  # supply something here to determine column types
  # (this attempt does not do what I want: character_cols and double_cols
  # are taken as literal column names, not as the vectors defined above)
  col_types = readr::cols(.default = "i", character_cols = "c", double_cols = "d")
)
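Because of the logic that works out which columns should be character, double, and so on, I would prefer to supply them as vectors of names.
Cheers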
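You can create a helper function that combines your extra specification with the default column specification, and then builds the full specification with do.call.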
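# Per-file overrides: column name -> readr type code
extra_spec = list(
  "c_text" = "c",
  "d_decimals" = "d",  # double; "i" would fail to parse the decimal values
  "e_more_text" = "c"
)
read_csv_with_default_int = function(path, extra_spec) {
  # Append the integer default to the overrides and pass them all to cols()
  col_spec = do.call(readr::cols, c(extra_spec, list(.default = readr::col_integer())))
  readr::read_csv(path, col_types = col_spec)
}
read_csv_with_default_int("file.csv", extra_spec = extra_spec)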
You can also use a helper like purrr::partial to avoid a lot of nested logic:
# cols() with the default type pre-filled as integer
cols_default_int = purrr::partial(readr::cols, .default = readr::col_integer())
read_csv_with_default_int = function(path, col_types) {
  readr::read_csv(path, col_types = do.call(cols_default_int, col_types))
}
read_csv_with_default_int("file.csv", col_types = extra_spec)
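Since the question asks for the columns to be supplied as vectors of names, the named list used above could be built from those vectors rather than typed by hand. The following is only a minimal sketch: build_extra_spec is a hypothetical helper name (not part of readr), and it assumes the only override types needed are character ("c") and double ("d").

```r
# Hypothetical helper (not part of readr): turn vectors of column names into
# a named list of readr type codes ("c" = character, "d" = double).
build_extra_spec <- function(character_cols = character(), double_cols = character()) {
  c(
    stats::setNames(as.list(rep("c", length(character_cols))), character_cols),
    stats::setNames(as.list(rep("d", length(double_cols))), double_cols)
  )
}

character_cols <- c("c_text", "e_more_text")
double_cols <- "d_decimals"
extra_spec <- build_extra_spec(character_cols, double_cols)

# Combine the overrides with the integer default, as in the answer above.
# Guarded so the sketch still runs where readr is not installed.
if (requireNamespace("readr", quietly = TRUE)) {
  col_spec <- do.call(
    readr::cols,
    c(extra_spec, list(.default = readr::col_integer()))
  )
  # data <- readr::read_csv("/path/to/csv/file.csv", col_types = col_spec)
}
```

The resulting extra_spec has the same shape as the hand-written list, so it should be usable with either version of read_csv_with_default_int.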