Define decimal separator with vroom
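I often encounter csv files that were saved with a German locale and are therefore separated by semicolons rather than commas. That is easily solved by defining the delimiter. But unlike, for example, fread, vroom does not seem to offer a way to define the decimal separator.
As a result, numeric values that use , as the decimal separator are imported either as character or, wrongly, without any decimal separator, which turns them into very large numbers.
Is there a way to define the decimal separator directly, similar to how it works in fread?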
library(vroom)
library(data.table)
df <- data.table(row.num = 1:10
, V1 = rnorm(10,10,5)
, V2 = rnorm(10,100,30))
fwrite(df, file = "vroom_test.csv", sep = ";", dec = ",")
fread(input = "vroom_test.csv", sep = ";", dec = ",")
vroom(file = "vroom_test.csv", delim = ";")
# defining a custom locale does allow this
vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))
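As already mentioned in the comments, the solution is straightforward: the only thing needed is to pass a locale() object to the locale argument of the vroom call.
The available settings for locale() can be found in its documentation.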
vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))
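If you want to check that the columns really come back as numbers, a small self-contained sketch along these lines should do it. The file contents and the `path`, `de` and `res` names are made up for illustration, and the explicit `col_types` is my own addition (not required by the solution above) so that the result does not depend on type guessing.

```r
library(vroom)

# Hypothetical sample data: semicolon-delimited, decimal commas
path <- tempfile(fileext = ".csv")
writeLines(c("row.num;V1;V2",
             "1;10,5;100,25",
             "2;7,25;98,5"), path)

# Locale with "," as decimal mark and "." as grouping mark
de <- locale(decimal_mark = ",", grouping_mark = ".")

# Explicit column types (an assumption for this sketch) avoid relying on guessing
res <- vroom(path, delim = ";", locale = de,
             col_types = cols(row.num = col_integer(),
                              V1 = col_double(),
                              V2 = col_double()))

str(res)
```

Reusing a single locale object like `de` keeps the calls short when several files with the same conventions have to be read.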