Sum columns by group (row names) in a matrix

Suppose I have a matrix named x.

x <- structure(c(1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1), 
.Dim = c(5L, 4L), .Dimnames = list(c("Cake", "Pie", "Cake", "Pie", "Pie"),
c("Mon", "Tue", "Wed", "Thurs"))) 

x
     Mon   Tue   Wed   Thurs
Cake   1     0     1      1
Pie    0     0     1      1
Cake   1     1     0      1
Pie    0     0     1      1
Pie    0     0     1      1

I want to sum each column, grouped by row name:

     Mon   Tue   Wed   Thurs
Cake   2     1     1      2
Pie    0     0     3      3

I tried addmargins(x), but that only gives the totals for each column and each row. Any suggestions? I searched other questions but could not figure it out.

You can try this:

df <- read.table(header=TRUE, text="
Name       Mon   Tue   Wed   Thurs
Cake   1     0     1      1
Pie    0     0     1      1
Cake   1     1     0      1
Pie    0     0     1      1
Pie    0     0     1      1")

aggregate(. ~ Name, data=df, FUN=sum)
##   Name Mon Tue Wed Thurs
## 1 Cake   2   1   1     2
## 2  Pie   0   0   3     3
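If you would rather not retype the table, the same aggregate() idea can probably be applied to the matrix x from the question directly. The following is a minimal sketch: the name vals is only illustrative, and the row names are cleared before the conversion on the assumption that duplicated names are unwanted in a data frame.

```r
# Matrix from the question
x <- structure(c(1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1),
               .Dim = c(5L, 4L),
               .Dimnames = list(c("Cake", "Pie", "Cake", "Pie", "Pie"),
                                c("Mon", "Tue", "Wed", "Thurs")))

# 'vals' is a hypothetical helper: a copy of x without its (duplicated) row names
vals <- x
rownames(vals) <- NULL

# Group the rows by the original row names and sum every column
res <- aggregate(as.data.frame(vals), by = list(Name = rownames(x)), FUN = sum)
res
##   Name Mon Tue Wed Thurs
## 1 Cake   2   1   1     2
## 2  Pie   0   0   3     3
```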

There is also dplyr:

library(dplyr)
group_by(df, Name) %>%
    summarise(Mon = sum(Mon), Tue = sum(Tue), Wed = sum(Wed), Thurs = sum(Thurs))

Or better:

# dplyr >= 1.0; older versions used summarise_each(funs(sum)), now deprecated
group_by(df, Name) %>%
    summarise(across(everything(), sum))

An approach using plyr:

library(plyr)
ldply(split(df, df$Name), function(u) colSums(u[-1]))
#   .id Mon Tue Wed Thurs
#1 Cake   2   1   1     2
#2  Pie   0   0   3     3
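For readers without plyr, the split-then-colSums idea above can be sketched with base functions only. This is an illustrative equivalent, not the answer's own code; df is rebuilt here with data.frame() so the snippet runs on its own, and the result is a matrix rather than a data frame.

```r
# Same data as df above, rebuilt so the snippet is self-contained
df <- data.frame(Name  = c("Cake", "Pie", "Cake", "Pie", "Pie"),
                 Mon   = c(1, 0, 1, 0, 0),
                 Tue   = c(0, 0, 1, 0, 0),
                 Wed   = c(1, 1, 0, 1, 1),
                 Thurs = c(1, 1, 1, 1, 1))

# Split the numeric columns by Name, sum each piece column-wise,
# then transpose so that groups become rows
res <- t(sapply(split(df[-1], df$Name), colSums))
res
#      Mon Tue Wed Thurs
# Cake   2   1   1     2
# Pie    0   0   3     3
```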

Data:

df = structure(list(Name = structure(c(1L, 2L, 1L, 2L, 2L), .Label = c("Cake", 
"Pie"), class = "factor"), Mon = c(1L, 0L, 1L, 0L, 0L), Tue = c(0L, 
0L, 1L, 0L, 0L), Wed = c(1L, 1L, 0L, 1L, 1L), Thurs = c(1L, 1L, 
1L, 1L, 1L)), .Names = c("Name", "Mon", "Tue", "Wed", "Thurs"
), row.names = c(NA, -5L), class = "data.frame")

Here is a vectorized base R solution:

rowsum(x, row.names(x))
#      Mon Tue Wed Thurs
# Cake   2   1   1     2
# Pie    0   0   3     3
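rowsum() also has a data frame method, so it should work on the df from the earlier answers as long as the non-numeric Name column is dropped and passed as the grouping vector instead. A minimal sketch, with df rebuilt here so the snippet is self-contained:

```r
# Same data as df in the earlier answers
df <- data.frame(Name  = c("Cake", "Pie", "Cake", "Pie", "Pie"),
                 Mon   = c(1L, 0L, 1L, 0L, 0L),
                 Tue   = c(0L, 0L, 1L, 0L, 0L),
                 Wed   = c(1L, 1L, 0L, 1L, 1L),
                 Thurs = c(1L, 1L, 1L, 1L, 1L))

# df[-1] keeps only the numeric columns; Name supplies the groups
res <- rowsum(df[-1], group = df$Name)
res
#      Mon Tue Wed Thurs
# Cake   2   1   1     2
# Pie    0   0   3     3
```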

Or a data.table version, using keep.rownames = TRUE so that the row names are converted into a column (named rn):

library(data.table)
as.data.table(x, keep.rownames = TRUE)[, lapply(.SD, sum), by = rn]
#      rn Mon Tue Wed Thurs
# 1: Cake   2   1   1     2
# 2:  Pie   0   0   3     3