Shorter substitute for tab_row_group when the table has many rows
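I like the gt R package, but I can't come up with clean code for row grouping that works for large tables where the row group labels are not known in advance.
Consider this toy example. Because it is small, the data.table looks fine.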
library(data.table)
library(magrittr)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:data.table':
#>
#> hour, isoweek, mday, minute, month, quarter, second, wday, week,
#> yday, year
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
library(gt)
# create a toy data.table
dt <- data.table(datetime = seq(ymd_hm(202205100800),by = "5 hours",length.out = 15))[order(datetime)]
dt[,date:=as_date(datetime)]
dt[,time:=format(datetime,"%H:%M")]
dt[,values:=seq(1000,by = 10,length.out=15)]
# Here's what my toy data.table looks like:
print(dt)
#> datetime date time values
#> 1: 2022-05-10 08:00:00 2022-05-10 08:00 1000
#> 2: 2022-05-10 13:00:00 2022-05-10 13:00 1010
#> 3: 2022-05-10 18:00:00 2022-05-10 18:00 1020
#> 4: 2022-05-10 23:00:00 2022-05-10 23:00 1030
#> 5: 2022-05-11 04:00:00 2022-05-11 04:00 1040
#> 6: 2022-05-11 09:00:00 2022-05-11 09:00 1050
#> 7: 2022-05-11 14:00:00 2022-05-11 14:00 1060
#> 8: 2022-05-11 19:00:00 2022-05-11 19:00 1070
#> 9: 2022-05-12 00:00:00 2022-05-12 00:00 1080
#> 10: 2022-05-12 05:00:00 2022-05-12 05:00 1090
#> 11: 2022-05-12 10:00:00 2022-05-12 10:00 1100
#> 12: 2022-05-12 15:00:00 2022-05-12 15:00 1110
#> 13: 2022-05-12 20:00:00 2022-05-12 20:00 1120
#> 14: 2022-05-13 01:00:00 2022-05-13 01:00 1130
#> 15: 2022-05-13 06:00:00 2022-05-13 06:00 1140
# Now let's create a table using the gt package and add row groups.
# We will group on date.
dt %>%
  gt() %>%
  tab_row_group(label = "May 10", id = "may10", rows = date == ymd(20220510)) %>%
  tab_row_group(label = "May 11", id = "may11", rows = date == ymd(20220511)) %>%
  tab_row_group(label = "May 12", id = "may12", rows = date == ymd(20220512)) %>%
  row_group_order(groups = c("may10", "may11", "may12")) %>%
  cols_hide(columns = c(datetime, date))
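But in real life there may be hundreds of dates, and they are not known in advance. If I stick with my current approach of calling tab_row_group() once per date, the code becomes unwieldy.
Is there a way to shorten the code and do the row grouping automatically?
You can use the groupname_col argument of gt(), which creates one row group for each distinct value in the named column: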
dt %>%
  gt(groupname_col = "date") %>%
  cols_hide(columns = c(datetime, date))
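If you also want the group headings to read like "May 10" rather than "2022-05-10", one option is to build a label column first and group on that instead. This is only a sketch: day_label is a column name I made up for illustration, the month abbreviation produced by "%b" depends on your locale, and I am assuming the groups are shown in order of first appearance, so the data is sorted by datetime beforehand.

```r
library(data.table)
library(lubridate)
library(gt)

# Rebuild the toy data from the question
dt <- data.table(datetime = seq(ymd_hm("2022-05-10 08:00"), by = "5 hours", length.out = 15))
dt[, time := format(datetime, "%H:%M")]
dt[, values := seq(1000, by = 10, length.out = 15)]

# day_label is a hypothetical helper column holding the text shown as the group heading.
# "%b" is locale-dependent (e.g. "May" in an English locale).
dt[, day_label := format(datetime, "%b %d")]

# Sort first so that, assuming groups follow first appearance, they come out chronologically
setorder(dt, datetime)

# One row group per distinct label, with no per-date tab_row_group() calls
tbl <- cols_hide(gt(dt, groupname_col = "day_label"), columns = datetime)
tbl
```

The grouping still comes entirely from the data, so the same code should work unchanged when there are hundreds of dates.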