readr: forcing column types

Question

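I am trying to read in a CSV file and force its columns to be of a certain type. But the last line of the code gives me an error: "Error in is.list(col_types) : Unknown shortcut: g"

Any advice? Thanks!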

library(readr)

# Create data frame & write it out:
temp <- data.frame(a = 1:1001,
                   mystring_b = c(rep(NA, 1000), "1"),
                   mystring_c = c(rep(NA, 1000), "2"))
write.csv(temp, "temp.csv", row.names = FALSE)

# Grab its names:
temp_labels <- names(read_csv("temp.csv", n_max = 0))

# Specify data type - for each column:
labels_type <- ifelse(grepl("mystring", temp_labels), "numeric", "guess")

# Reading in while forcing column types:
temp <- read_csv("temp.csv", col_types = labels_type)

# Error in is.list(col_types) : Unknown shortcut: g

Answer 1

Here is an excerpt from the description of col_types on the help page ?read_csv:

col_types

... Alternatively, you can use a compact string representation where each character represents one column: c = character, i = integer, n = number, d = double, l = logical, D = date, T = date time, t = time, ? = guess, or _/- to skip the column.

So, as the error message says, "g" is not an accepted shortcut. You should use "?" instead.
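To make the compact form concrete, here is a minimal sketch that reads a three-column file with one shortcut character per column. The temporary file and the names `path`, `sample_data`, and `fixed` are made up for illustration and are not part of the question's code.

```r
library(readr)

# Hypothetical sample file, written to a temporary path for illustration
path <- tempfile(fileext = ".csv")
sample_data <- data.frame(a = 1:3,
                          mystring_b = c(NA, NA, "1"),
                          mystring_c = c(NA, NA, "2"))
write.csv(sample_data, path, row.names = FALSE)

# One character per column: "?" = guess, "n" = number
fixed <- read_csv(path, col_types = "?nn")

# Inspect the resulting column classes
sapply(fixed, class)
```

The string must contain exactly one character per column, in file order, so "?nn" here means "guess the first column, parse the other two as numbers".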

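Also, while read_csv seems, by luck, to be picking up the first character of your "numeric" specification, you should use "n" to be safe and to match the documentation. In fact, if you look at the examples, the intent is to pass a single string as the specification, not a character vector of length > 1. Again, if your approach happens to work anyway, you are lucky, but it is better to match the documentation, like this: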

labels_type <- paste(ifelse(grepl("mystring", temp_labels), "n", "?"), collapse = "")
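If you would rather not depend on column order, a named specification built with cols() is another option. The following is only a sketch of that alternative, not something the help excerpt above prescribes: the temporary file and the names `path`, `number_cols`, `spec_args`, `spec`, and `result` are assumptions for illustration, and the header is read with base R's read.csv() purely to keep the example simple.

```r
library(readr)

# Hypothetical sample file mirroring the data in the question
path <- tempfile(fileext = ".csv")
temp <- data.frame(a = 1:1001,
                   mystring_b = c(rep(NA, 1000), "1"),
                   mystring_c = c(rep(NA, 1000), "2"))
write.csv(temp, path, row.names = FALSE)

# Grab the column names (base R here; only the header matters)
temp_labels <- names(read.csv(path, nrows = 1))

# Build a named list of number collectors for the "mystring" columns
number_cols <- grep("mystring", temp_labels, value = TRUE)
spec_args <- setNames(rep(list(col_number()), length(number_cols)), number_cols)

# Every column not named explicitly falls back to guessing
spec <- do.call(cols, c(spec_args, list(.default = col_guess())))

result <- read_csv(path, col_types = spec)
sapply(result, class)
```

Because the columns are matched by name, this keeps working if the column order in the file changes, at the cost of being more verbose than the compact string.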
