Read CSV file with delim in the quote using read_csv

The CSV file contains commas (,) inside quoted fields. The read_csv function converts such a field to numeric, whereas I want it kept as character.

library(readr)
read_csv('"Name","V1","V2"\n
"A","0,20","300,200"\n
"B","0,20","300,200"')

The result looks like this:

# A tibble: 2 x 3
  Name  V1        V2
  <chr> <chr>  <dbl>
1 A     0,20  300200
2 B     0,20  300200

I want column V2 to stay as character.

How can I fix this?

My session info:

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252 
[2] LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

other attached packages:
[1] readr_1.4.0

loaded via a namespace (and not attached):
 [1] fansi_0.5.0     utf8_1.2.2      crayon_1.4.1   
 [4] R6_2.5.0        lifecycle_1.0.0 magrittr_2.0.1 
 [7] pillar_1.6.1    rlang_0.4.11    cli_3.0.1      
[10] rstudioapi_0.13 vctrs_0.3.8     ellipsis_0.3.2 
[13] tools_4.1.0     hms_1.1.0       compiler_4.1.0 
[16] pkgconfig_2.0.3 tibble_3.1.3

Two options:

  1. Set grouping_mark in locale to a character that does not occur in the data.
library(readr)

read_csv('"Name","V1","V2"\n
"A","0,20","300,200"\n
"B","0,20","300,200"', locale = locale(grouping_mark = "@"))

#  Name  V1    V2     
#  <chr> <chr> <chr>  
#1 A     0,20  300,200
#2 B     0,20  300,200
  2. Pass the column classes explicitly.
read_csv('"Name","V1","V2"\n
"A","0,20","300,200"\n
"B","0,20","300,200"', col_types = 'ccc')