How to use col_types with readr's read_delim_chunked?

I am trying to read a file in chunks while specifying col_types. See the MWE:

write.csv(cars, "cars.csv")


library(readr)
readr::read_delim_chunked("cars.csv", function(x, i) {
  x
}, delim= ",", col_types = cols(
  speed = col_character()
), chunk_size = 10)

But I get the wrong output:

NULL

The non-chunked version, however, works fine:

library(readr)
readr::read_delim("cars.csv", delim= ",", col_types = cols(
  speed = col_character()
))

The problem is that write.csv includes the row.names as a new column, so write the file without them:

write.csv(cars, "cars.csv", row.names = FALSE, quote = FALSE)
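
To see the extra column that write.csv adds by default, here is a minimal sketch using base R only. The temporary path and the variable names are just for illustration and are not part of the original example.

```r
# Illustrative temporary file, not the "cars.csv" used above
path <- tempfile(fileext = ".csv")

# Default: row names are written as an unnamed first column
write.csv(cars, path)
with_row_names <- names(read.csv(path))

# row.names = FALSE: only the real columns are written
write.csv(cars, path, row.names = FALSE)
without_row_names <- names(read.csv(path))

print(with_row_names)
print(without_row_names)
```

The first vector has one more entry than the second, which is the row-name column.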

Also, we need col_character() rather than col_character:

readr::read_delim_chunked("cars.csv",  DataFrameCallback$new(function(x, i) {
  x
}), col_types = cols(
  speed = col_character()
), delim= ",",  chunk_size = 10)

For some reason that I don't understand, you need to wrap the function in DataFrameCallback$new.

write.csv(cars, "cars.csv")

With the file written that way, this works:

readr::read_delim_chunked("cars.csv",  DataFrameCallback$new(function(x, i) {
  x
}), col_types = cols(
  speed = col_character()
), delim= ",",  chunk_size = 10)

whereas this does not (it returns NULL, as in the question):

readr::read_delim_chunked("cars.csv",  function(x, i) {
  x
}, col_types = cols(
  speed = col_character()
), delim= ",",  chunk_size = 10)
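
A likely explanation, as far as I can tell from the readr callback documentation: a bare function passed as the callback is treated as a side-effect callback, so it is called on every chunk but its return values are discarded, and the overall result is NULL. DataFrameCallback is the callback type that combines the per-chunk results into one data frame. The sketch below puts the two side by side; the temporary file and the names path, types, side_effect and combined are my own, chosen for illustration.

```r
library(readr)

# Illustrative temporary file written without the row-name column
path <- tempfile(fileext = ".csv")
write.csv(cars, path, row.names = FALSE)

types <- cols(speed = col_character())

# Bare function: run for its side effects only, nothing is collected
side_effect <- read_delim_chunked(
  path, function(x, pos) x,
  delim = ",", col_types = types, chunk_size = 10
)

# DataFrameCallback: the data frames returned per chunk are row-bound
combined <- read_delim_chunked(
  path, DataFrameCallback$new(function(x, pos) x),
  delim = ",", col_types = types, chunk_size = 10
)

print(is.null(side_effect))
print(dim(combined))
print(class(combined$speed))
```

If you only need to do something per chunk (write to a database, print progress), the bare function is enough. If you want the chunks back as a single data frame, wrap the function in DataFrameCallback$new.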