pivot_longer gives an error when using dtplyr
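I have a large dataset that I'm trying to tidy using dtplyr. It consists of a large number (>1000) of date-value pairs for different locations. The original code uses a pivot_longer, which works fine in dplyr but gives an error in dtplyr. Is there a way to fix this while keeping the performance benefits of dtplyr?
This works: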
library(tidyverse)
library(dtplyr)
library(data.table)
my_data_tb <- tribble(
  ~`date-A`, ~`value-A`, ~`date-B`, ~`value-B`,
  "date1", 1, "date2", 2,
  "date2", 1, "date3", 2
)
my_data_tb %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "grid_square"),
    names_sep = "-"
  )
But this gives an error:
my_data_dt <- as.data.table(my_data_tb)
my_data_dt <- lazy_dt(my_data_dt)
my_data_dt %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "grid_square"),
    names_sep = "-"
  )
The error message is:
Error: Can't subset elements that don't exist.
x The locations 1 and 2 don't exist.
i There are only 0 elements.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Expected 2 pieces. Missing pieces filled with `NA` in 7 rows [1, 2, 3, 4, 5, 6, 7].
rlang::last_error()
Error: Internal error: Trace data is not square.
Update: it now gives this error message instead:
Error in UseMethod("pivot_longer") :
no applicable method for 'pivot_longer' applied to an object of class "c('dtplyr_step_first', 'dtplyr_step')"
Incidentally, this also works, but I assume it loses the dtplyr performance gain:
my_data_dt %>%
  as_tibble() %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "grid_square"),
    names_sep = "-"
  )
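dtplyr version 1.2.0 is now available on CRAN, which means this issue is now resolved!
For anyone encountering this error, check and update your dtplyr version to make sure you are running >= 1.2.0: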
install.packages("dtplyr")
(Note: dtplyr is not updated as part of the tidyverse package, so be sure to update it separately.)
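To confirm the upgrade took effect, something like the following sketch may help. It checks the installed version with packageVersion() and then reruns the pivot from the question on a lazy_dt, so the work stays in data.table until the result is collected. The names `long` and `result` are illustrative only, and the exact column order of the result may vary between dtplyr versions.

```r
library(data.table)
library(dtplyr)
library(dplyr)
library(tidyr)

# Stop early if the installed dtplyr predates pivot_longer() support.
stopifnot(packageVersion("dtplyr") >= "1.2.0")

# Same toy data as in the question, built directly as a data.table.
my_data_dt <- lazy_dt(data.table(
  `date-A`  = c("date1", "date2"),
  `value-A` = c(1, 1),
  `date-B`  = c("date2", "date3"),
  `value-B` = c(2, 2)
))

# Build the lazy pipeline; nothing is computed yet.
long <- my_data_dt %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "grid_square"),
    names_sep = "-"
  )

# Inspect the data.table code that dtplyr generated.
show_query(long)

# Collect the result as a data.table to actually run the pipeline.
result <- as.data.table(long)
```

Collecting with as.data.table() rather than as_tibble() keeps the result in data.table form, which is probably what you want if later steps also go through dtplyr.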