Summary table with some columns summed over a vector of column names in R
I have a table that looks like this:
df <- data.frame(week = c("one","one","two","two"),
Day = c("day1", "day2","day1","day2"),
daily_freq = c(100,110,90,90),
city1 = c(20,30,20,30),
city2 = c(10,20,30,40),
city3 = c(30,40,10,10),
city4 = c(40,20,30,10))
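I am computing several summary tables, for example one with the total frequencies for each period:
library(data.table)  # provides setDT() and the [, j, by] syntax
resume_table <- setDT(df)[, .(total_freq = sum(daily_freq),
                              city1 = sum(city1),
                              city2 = sum(city2),
                              city3 = sum(city3),
                              city4 = sum(city4)),
                          by = .(week)]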
This gives a table of total frequencies like the following:
week total_freq city1 city2 city3 city4
one 210 50 30 70 60
two 180 50 70 20 40
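But since I have many cities (more than 40) and need to compute several summary tables, I would like to have, for example, a vector containing the city names: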
cities <- c("city1","city2","city3","city4")
and be able to reuse that vector every time I sum over those columns, while still summing other columns as well.
My code does not work:
resume_table2 <- setDT(df)[, .(total_freq = sum(daily_freq),
lapply(.SD, sum), .SDcols = cities)
,by = .(week)]
What is wrong?
We can specify the 'cities' vector in .SDcols and loop over .SD to get the sum:
setDT(df)[, lapply(.SD, sum), .SDcols = cities]
#    city1 city2 city3 city4
#1:   100   100    90   100
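The call above sums over the whole table, whereas the question asks for one row per week. A minimal sketch of the grouped version follows, assuming the same example data; `dt` and `by_week` are names introduced here for illustration only. Note that `.SDcols` is an argument of `[` itself, next to `by`, rather than something that goes inside `.()`, which is likely why the attempt in the question fails.

```r
library(data.table)

# Example data copied from the question
df <- data.frame(week = c("one", "one", "two", "two"),
                 Day = c("day1", "day2", "day1", "day2"),
                 daily_freq = c(100, 110, 90, 90),
                 city1 = c(20, 30, 20, 30),
                 city2 = c(10, 20, 30, 40),
                 city3 = c(30, 40, 10, 10),
                 city4 = c(40, 20, 30, 10))
cities <- c("city1", "city2", "city3", "city4")

# as.data.table() makes a copy, so df itself is left unchanged
dt <- as.data.table(df)

# .SD holds only the columns named in .SDcols; by = week gives one row per week
by_week <- dt[, lapply(.SD, sum), by = week, .SDcols = cities]
print(by_week)
# Expected sums: week "one" -> 50, 30, 70, 60; week "two" -> 50, 70, 20, 40
```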
If we also need 'daily_freq', concatenate it with 'cities':
setDT(df)[, lapply(.SD, sum), .SDcols = c('daily_freq', cities)]
#    daily_freq city1 city2 city3 city4
#1:         390   100   100    90   100
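To reproduce the table from the question, grouped by week and with the summed `daily_freq` renamed to `total_freq`, one option is to build the result list in `j` by concatenating a named element with the `lapply(.SD, sum)` list. This is a sketch under the same example data, not the only way to do it; `dt` and `summary_by_week` are illustrative names.

```r
library(data.table)

# Example data copied from the question
df <- data.frame(week = c("one", "one", "two", "two"),
                 Day = c("day1", "day2", "day1", "day2"),
                 daily_freq = c(100, 110, 90, 90),
                 city1 = c(20, 30, 20, 30),
                 city2 = c(10, 20, 30, 40),
                 city3 = c(30, 40, 10, 10),
                 city4 = c(40, 20, 30, 10))
cities <- c("city1", "city2", "city3", "city4")

dt <- as.data.table(df)

# j returns one list: a named total plus one sum per city column.
# .SDcols only restricts .SD; daily_freq is still visible by name in j.
summary_by_week <- dt[, c(list(total_freq = sum(daily_freq)),
                          lapply(.SD, sum)),
                      by = week, .SDcols = cities]
print(summary_by_week)
# Expected values: week "one" -> 210, 50, 30, 70, 60
#                  week "two" -> 180, 50, 70, 20, 40
```

Because `cities` is an ordinary character vector, the same call can be reused for other summaries by swapping `sum` for another function or by changing the columns in `by`.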