Error with durations created from a data.table using lubridate & dplyr
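I'm trying to aggregate some data stored in a data.table and then create durations (from lubridate) from the aggregated data. When I try to do that, however, I get an error. Here's a reproducible example: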
library(lubridate)
library(data.table)
library(dplyr)
data(lakers)
lakers.dt <- data.table(lakers, key = "player")
durations <- lakers.dt %>%
  mutate(better.date = ymd(date)) %>%
  group_by(player) %>%
  summarize(min.date = min(better.date), max.date = max(better.date)) %>%
  mutate(duration = interval(min.date, max.date))
# Source: local data table [371 x 4]
#
# player min.date max.date
# 1 2008-10-28 2009-04-14
# 2 Aaron Brooks 2008-11-09 2009-04-03
# 3 Aaron Gray 2008-11-18 2008-11-18
# 4 Acie Law 2009-02-17 2009-02-17
# 5 Adam Morrison 2009-02-17 2009-04-12
# 6 Al Harrington 2008-12-16 2009-02-02
# 7 Al Horford 2009-02-17 2009-03-29
# 8 Al Jefferson 2008-12-14 2009-01-30
# 9 Al Thornton 2008-10-29 2009-04-05
# 10 Alando Tucker 2009-02-26 2009-02-26
# .. ... ... ...
# Variables not shown: duration (dbl)
# Warning messages:
# 1: In unclass(e1) + unclass(e2) :
# longer object length is not a multiple of shorter object length
# 2: In format.data.frame(df, justify = "left") :
# corrupt data frame: columns will be truncated or padded with NAs
Any idea what this error means, or where it comes from?
Edit:
This still happens when you leave out dplyr and do everything in data.table. Here's the code I used:
lakers.dt[, better.date := ymd(date)]
durations <- lakers.dt[, list(min.date = min(better.date),
                              max.date = max(better.date)), by = player]
(durations[, duration := interval(min.date, max.date)])
# Error in `rownames<-`(`*tmp*`, value = paste(format(rn, right = TRUE), :
# length of 'dimnames' [1] not equal to array extent
# In addition: Warning messages:
# 1: In unclass(e1) + unclass(e2) :
# longer object length is not a multiple of shorter object length
# 2: In cbind(player = c("", "Aaron Brooks", "Aaron Gray", "Acie Law", :
# number of rows of result is not a multiple of vector length (arg 1)
You could try converting the interval output to character class (since the interval output is not a vector), or wrap it with as.duration (from @Jake Fisher):
durations <- lakers.dt %>%
  mutate(better.date = ymd(date)) %>%
  group_by(player) %>%
  summarize(min.date = min(better.date), max.date = max(better.date)) %>%
  mutate(duration = as.duration(interval(min.date, max.date)))
Or use as.vector to coerce it to numeric class.
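To see the difference outside the pipeline, here is a minimal sketch on two made-up date pairs (the dates are hypothetical, not taken from the lakers data). It checks that interval() returns an S4 object, then converts it with as.duration(). For the numeric route it uses lubridate's int_length(), which I've swapped in for as.vector because it states the intent (length in seconds) more directly.

```r
library(lubridate)

# Hypothetical start/end dates, for illustration only
start <- ymd(c("2009-01-01", "2009-02-01"))
end   <- ymd(c("2009-01-31", "2009-02-01"))

span <- interval(start, end)

# An Interval is an S4 object that carries its start times in a slot,
# which is why it does not behave like a plain column.
isS4(span)            # TRUE
is.interval(span)     # TRUE

# Option 1: convert to a Duration (exact elapsed time)
dur <- as.duration(span)
is.duration(dur)      # TRUE

# Option 2: keep only the length in seconds as a plain numeric vector
seconds <- int_length(span)
seconds               # 2592000 0  (30 days, then a zero-length span)
```

Either result can then be assigned as the duration column in place of the raw interval.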