Binding rows of multiple data frames containing columns of class Interval from the lubridate package
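I have a list in which each element is a data frame with the same column names, and one of the columns is of class Interval (from the lubridate package). I want to bind all of the individual data frames in the list into a single data frame. Unfortunately, both rbind and bind_rows coerce the Interval column to numeric, and I get the following warning.
Warning messages:
1: In bind_rows_(x, .id) :
Vectorizing 'Interval' elements may not preserve their attributes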
library(dplyr)
library(lubridate)
# Create a sample list of length 2 (the actual list has ~18,000 elements)
test <- list(BGC119AP01 = structure(list(participant_code = "BGC119AP01",
interval_1 = new("Interval", .Data = 34128000, start = structure(1479427200, class = c("POSIXct",
"POSIXt"), tzone = "UTC"), tzone = "UTC")), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -1L), groups = structure(list(
participant_code = "BGC119AP01", .rows = list(1L)), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame"), .drop = FALSE)),
BGC119AP02 = structure(list(participant_code = "BGC119AP02",
interval_1 = new("Interval", .Data = 34128000, start = structure(1479427200, class = c("POSIXct",
"POSIXt"), tzone = "UTC"), tzone = "UTC")), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -1L), groups = structure(list(
participant_code = "BGC119AP02", .rows = list(1L)), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame"), .drop = FALSE)))
# Attempt to bind rows; both calls produce the warning above.
do.call(rbind, test)
do.call(bind_rows, test)
Output
Note that interval_1 has been coerced to double and has lost its attributes.
# A tibble: 2 x 2
# Groups: participant_code [2]
participant_code interval_1
<chr> <dbl>
1 BGC119AP01 34128000
2 BGC119AP02 34128000
Warning messages:
1: In bind_rows_(x, .id) :
Vectorizing 'Interval' elements may not preserve their attributes
2: In bind_rows_(x, .id) :
Vectorizing 'Interval' elements may not preserve their attributes
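This is presumably because a column of class Interval is not an atomic vector. I know I could work around this by keeping the original start and end dates and creating the Interval column after binding the rows, but I would like a solution that binds all of the individual data frames in the list while keeping the Interval column intact, and that scales to 18,000 rows. Many thanks.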
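For reference, the workaround mentioned above (keep the start and end dates, bind, then build the interval) could look roughly like this. It is only a minimal sketch: the list name `dates_list` and the column names `start` and `end` are hypothetical stand-ins and are not part of the original data.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the real list: each element keeps plain
# POSIXct start/end columns instead of an Interval column.
dates_list <- list(
  BGC119AP01 = tibble(
    participant_code = "BGC119AP01",
    start = ymd("2016-11-18", tz = "UTC"),
    end   = ymd("2017-12-18", tz = "UTC")
  ),
  BGC119AP02 = tibble(
    participant_code = "BGC119AP02",
    start = ymd("2016-11-18", tz = "UTC"),
    end   = ymd("2017-12-18", tz = "UTC")
  )
)

# Bind the plain date columns first, then build the Interval column
# once on the combined data frame.
combined <- bind_rows(dates_list) %>%
  mutate(interval_1 = interval(start, end))
```

Because the Interval is created in a single vectorised call after binding, the cost of this step should not depend much on how many list elements there were.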
Here is a hint: when you run do.call(rbind, test) with dplyr loaded and get the warning:
Warning messages:
1: In bind_rows_(x, .id) :
Vectorizing 'Interval' elements may not preserve their attributes
it is actually dplyr::bind_rows() that is being called, not base::rbind(), and the Interval attributes are dropped. This seems to happen when the objects are tibbles (class tbl or tbl_df).
You can use rbind.data.frame() to avoid this:
do.call(rbind.data.frame, test)
# A tibble: 2 x 2
# Groups: participant_code [1]
participant_code interval_1
* <chr> <Interval>
1 BGC119AP01 2016-11-18 UTC--2017-12-18 UTC
2 BGC119AP02 2016-11-18 UTC--2017-12-18 UTC
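To confirm that the column really is still an Interval after binding, a quick check along these lines may help. This is a sketch on plain data frames rather than the grouped tibbles from the question; `make_row` and `frames` are hypothetical names introduced only for illustration.

```r
library(lubridate)

# Hypothetical helper: a one-row plain data frame with an Interval column.
make_row <- function(code) {
  df <- data.frame(participant_code = code, stringsAsFactors = FALSE)
  df$interval_1 <- interval(ymd("2016-11-18", tz = "UTC"),
                            ymd("2017-12-18", tz = "UTC"))
  df
}

frames <- list(BGC119AP01 = make_row("BGC119AP01"),
               BGC119AP02 = make_row("BGC119AP02"))

# Call the data.frame method directly, as in the answer above.
combined <- do.call(rbind.data.frame, frames)

# Inspect the result: the column class and the interval start times.
is.interval(combined$interval_1)
int_start(combined$interval_1)
```

If `is.interval()` returns FALSE on your own data, the column has been coerced somewhere along the way and the binding step is the first place to look.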