Define decimal separator with vroom

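I often come across csv files that were saved with a German locale and are therefore separated by semicolons instead of commas. This is of course easily solved by defining the delimiter. But unlike, for example, fread, vroom does not seem to offer a way to define the decimal separator. As a result, numeric values with , as the decimal separator are imported as character, or wrongly without any decimal separator, so they end up as very large numbers. Is there a way to define the decimal separator directly, similar to how it works in fread?
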
library(vroom)
library(data.table)
   
df <- data.table(row.num = 1:10
                 , V1 = rnorm(10,10,5)
                 , V2 = rnorm(10,100,30))

fwrite(df, file = "vroom_test.csv", sep = ";", dec = ",")

fread(input = "vroom_test.csv", sep = ";", dec = ",")

vroom(file = "vroom_test.csv", delim = ";")
# defining a custom locale allows setting the decimal mark
vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))

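As already mentioned in the comments, the solution is straightforward: the only thing you need to do is include the locale() option in the vroom call. The possible arguments of locale() can be found in its documentation.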

vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))
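
To check that the locale is actually picked up, a small self-contained sketch like the following can help. It does not depend on data.table; the temporary file and its three hand-written lines are made up for illustration, and show_col_types = FALSE assumes a reasonably recent vroom version (1.5.0 or later).

```r
library(vroom)

# Hypothetical test file: semicolon-delimited, comma as decimal mark
tmp <- tempfile(fileext = ".csv")
writeLines(c("row.num;V1;V2",
             "1;10,5;100,25",
             "2;7,25;98,5"), tmp)

# Read with a locale that treats "," as decimal mark and "." as grouping mark
german_read <- vroom(
  file = tmp,
  delim = ";",
  locale = locale(decimal_mark = ",", grouping_mark = "."),
  show_col_types = FALSE
)

# V1 and V2 should now be numeric columns rather than character
str(german_read)
```

If the columns still come back as character, it is worth checking that the delimiter and the grouping mark match what is really in the file.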