readr issue with read_delim not including decimals

Question

I am trying to read a CSV file that looks like this (let's call it test1.csv):

test_1;test_2;test_3;test_4
Test with Ö Ä;20;10,45;15,34

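As you can see, the values are separated by ; rather than , and the , is actually the decimal mark. I included "Ö" and "Ä" because my data contain German letters, which requires me to use ISO-8859-1 in locale() inside read_delim(). Note that this is not essential to the problem; it just explains why I am using read_delim().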

Now, reading all of this with read_delim():

read_delim("test1.csv", delim = ";", locale = locale(encoding = 'ISO-8859-1', 
           decimal_mark = ","))

gives me this:

# A tibble: 1 x 4
  test_1              test_2 test_3 test_4
  <chr>               <dbl>  <dbl>  <dbl>
1 "Test with Ö Ä"     20   10.4   15.3

And indeed, I can get the value 10.45 using pull(test_3): [1] 10.45

But now, if I simply add five zeros to 10.45 so that it becomes 1000000.45 (let's call this file test2.csv)

test_1;test_2;test_3;test_4
Test with Ö Ä;20;1000000,45;15,34

and then repeat everything, I completely lose the .45 after the 1000000.

read_delim("test2.csv", delim = ";",locale = locale(encoding = 'ISO-8859-1',decimal_mark = ",")) %>% pull(test_3)
Rows: 1 Columns: 4
── Column specification ────────────────────────────────────────────────
Delimiter: ";"
chr (1): test_1
dbl (3): test_2, test_3, test_4

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
[1] 1000000

Surely I must be able to keep this information? Or control this behaviour? Is this a bug?

Answer 1

This is a printing issue.
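One way to convince yourself that the fraction was parsed and only the display is truncated is to look at the stored value directly. The following is a minimal sketch, not the asker's exact setup: it writes an ASCII-only stand-in for test2.csv to a temporary file (the column names and the dropped encoding argument are simplifications of mine) and uses suppressMessages() only to hide the column specification message.

```r
library(readr)

# Hypothetical stand-in for test2.csv: ASCII only, two columns
path <- tempfile(fileext = ".csv")
writeLines(c("test_1;test_3", "Test;1000000,45"), path)

df <- suppressMessages(
  read_delim(path, delim = ";", locale = locale(decimal_mark = ","))
)
x <- df$test_3

print(x)        # printed with the default number of significant digits
print(x - 1e6)  # subtracting the integer part exposes the stored fraction

# TRUE if the stored value agrees with 1000000.45 up to the default tolerance
isTRUE(all.equal(x, 1000000.45))
```

If the decimals had really been dropped during parsing, the subtraction would print 0 and the all.equal() check would fail.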

If you add %>% print(digits = 22) to the end of your workflow, you get:

[1] 1000000.449999999953434

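This is not exactly 1000000.45 because what is shown is the closest approximation available in the standard floating-point system.
The default value of getOption("digits") is 7; you can set it as needed with options(digits = <your_choice>). In this case, any value from digits = 10 to digits = 17 will print "1000000.45"; digits = 18 starts to reveal the underlying approximation.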
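The effect of the digits setting can be seen without readr at all. This is a small base-R sketch using a literal value in place of the parsed column; the variable names are mine, and format() is used here because it returns the string that would be printed.

```r
x <- 1000000.45

# Seven significant digits are not enough to reach the decimals
seven <- format(x, digits = 7)    # "1000000"

# Ten significant digits cover both decimals
ten <- format(x, digits = 10)     # "1000000.45"

# Fixed notation with 15 decimals shows the underlying binary approximation
sprintf("%.15f", x)

# Change the session-wide default, print, then restore the previous setting
old <- options(digits = 10)
print(x)
options(old)
```

Setting options(digits = ...) affects every printed number in the session, so for a one-off check a local print(x, digits = 10) or format() call may be the less intrusive choice.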

