Having trouble reading a .csv database

Question

I am trying to read a .csv file using readr::read_csv:

readr::read_csv("my_file.csv")

But I get the following error:

Parsed with column specification:
cols(
  col_character()
)
Error in read_tokens_(data, tokenizer, col_specs, col_names, locale_,  : 
  Evaluation error: Column 1 must be named.

What is going on?

The .csv file can be found here: https://drive.google.com/file/d/1W_ZetpOfWDuSVhiIVAa0sEcRE4ujCSXB/view?usp=sharing

Answer 1

The problem is the file's encoding. This post shows how to handle it with read.csv:

read.csv("BRA_females-45q15.csv", fileEncoding="UTF-16LE")

To get the same result with readr::read_csv, we can first find out the encoding with readr's guess_encoding:

guess_encoding(file = "BRA_females-45q15.csv")
# # A tibble: 3 x 2
#   encoding   confidence
#   <chr>           <dbl>
# 1 UTF-16LE         1   
# 2 ISO-8859-1       0.8 
# 3 ISO-8859-2       0.51
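As a cross-check on the guess, the first two bytes of the file can be inspected directly: a UTF-16LE file that carries a byte-order mark starts with the bytes FF FE. The sketch below uses only base R; `has_utf16le_bom` is a hypothetical helper name introduced here for illustration, not a readr function. A file without a BOM can still be UTF-16LE, and a UTF-32LE BOM begins with the same two bytes, so treat the result as a hint rather than proof.

```r
# Hypothetical helper (not part of readr or base R): report whether a file
# starts with the UTF-16LE byte-order mark, i.e. the bytes FF FE.
has_utf16le_bom <- function(path) {
  # Read the first two bytes of the file as raw values.
  first_bytes <- readBin(path, what = "raw", n = 2)
  identical(first_bytes, as.raw(c(0xff, 0xfe)))
}
```

For the file in the question this would be called as `has_utf16le_bom("BRA_females-45q15.csv")`.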

Then call read_csv with a locale that specifies this encoding:

read_csv("BRA_females-45q15.csv", locale = locale(encoding = "UTF-16LE"))

# Error in guess_header_(datasource, tokenizer, locale) : 
#   Incomplete multibyte sequence

But this gives us another error, which looks like a known issue.

Hadley: "Yeah, this is a big issue that will need some thought. In general, readr currently assumes that it can read byte-by-byte, and anything else will require quite of lot of work/thought."
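Until readr handles multibyte encodings such as UTF-16 itself, one possible workaround is to let base R do the decoding and give readr a UTF-8 copy of the file. The following is a minimal sketch rather than an official readr recipe: `read_csv_reencoded` is a hypothetical helper name, and it assumes the R session runs in a UTF-8 locale so that no characters are lost when the lines are read in.

```r
library(readr)

# Hypothetical helper (not part of readr): decode the file with a base R
# connection, write a UTF-8 copy to a temporary file, and parse that copy.
read_csv_reencoded <- function(path, encoding = "UTF-16LE", ...) {
  # The connection converts from `encoding` while reading.
  con <- file(path, open = "r", encoding = encoding)
  on.exit(close(con), add = TRUE)
  lines <- readLines(con, warn = FALSE)

  # Drop a leading byte-order mark if one survived the conversion, so it
  # does not end up inside the first column name.
  if (length(lines) > 0) {
    lines[1] <- sub("^\ufeff", "", lines[1])
  }

  # Write the decoded lines as UTF-8; the temporary file is left in place
  # and is removed with the session's temporary directory.
  tmp <- tempfile(fileext = ".csv")
  writeLines(enc2utf8(lines), tmp, useBytes = TRUE)

  # Extra arguments (col_types, skip, ...) are passed on to read_csv.
  read_csv(tmp, ...)
}
```

For the file in the question this would be called as `read_csv_reencoded("BRA_females-45q15.csv")`. The whole file is held in memory as a character vector, which is fine for small files but may not suit very large ones.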
