Divide rows by conditional row sums

Consider the following matrix:

m <- cbind(c("r1","r2","r3","r4","r1","r2","r3","r4"),c(3,2,5,2,5,2,6,4),c(4,3,5,3,7,4,6,7))

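For each row, I want to divide the row sum by its conditional row sum. That is, for every row named "r1", I want to divide its row sum by the total of the row sums of all rows named "r1". So the first row is (3+4)/(3+4+5+7).

The same applies to "r2", "r3" and "r4". For the second row, for example, the calculation is (2+3)/(2+3+2+4).

How can I do this in R?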

m <- cbind(c("r1","r2","r3","r4","r1","r2","r3","r4"),c(3,2,5,2,5,2,6,4),c(4,3,5,3,7,4,6,7))

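library(dplyr)

# Name the columns explicitly so as_tibble() does not have to invent them
colnames(m) <- c("V1", "V2", "V3")

m %>% as_tibble() %>%
  # cbind() made every column character, so convert before adding
  mutate(V4 = as.numeric(V2) + as.numeric(V3)) %>%
  group_by(V1) %>%
  # total of the row sums within each name
  mutate(conditional_sum = sum(V4)) %>%
  ungroup() %>%
  mutate(calculation = V4 / conditional_sum)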

# A tibble: 8 x 6
# V1    V2    V3       V4 conditional_sum calculation
# <chr> <chr> <chr> <dbl>           <dbl>       <dbl>
# 1 r1    3     4         7              19       0.368
# 2 r2    2     3         5              11       0.455
# 3 r3    5     5        10              22       0.455
# 4 r4    2     3         5              16       0.312
# 5 r1    5     7        12              19       0.632
# 6 r2    2     4         6              11       0.545
# 7 r3    6     6        12              22       0.545
# 8 r4    4     7        11              16       0.688

Here is a base R solution, once we tidy up your data:

df <- data.frame(m, stringsAsFactors = FALSE)
df[-1] <- lapply(df[-1], as.numeric)
df$new <- df$X2 + df$X3

with(df, ave(new, X1, FUN = function(i)i / sum(i)))
#[1] 0.3684211 0.4545455 0.4545455 0.3125000 0.6315789 0.5454545 0.5454545 0.6875000
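
If you would rather stay with the matrix, the same calculation can be sketched in base R with rowsum(). This is only an illustration, not part of the answer above; the names row_totals, group_totals and ratio are made up for the example.

```r
m <- cbind(c("r1","r2","r3","r4","r1","r2","r3","r4"),
           c(3,2,5,2,5,2,6,4),
           c(4,3,5,3,7,4,6,7))

# The matrix is character, so convert the value columns before adding them
row_totals <- as.numeric(m[, 2]) + as.numeric(m[, 3])

# rowsum() adds up row_totals within each name and returns a one-column
# matrix whose row names are the group names
group_totals <- rowsum(row_totals, group = m[, 1])

# Look up each row's group total by name and divide
ratio <- unname(row_totals / group_totals[m[, 1], 1])
ratio
```

This should reproduce the same eight ratios as the ave() call, and the ratios within each name add up to 1.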

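First, create the data as a data.frame rather than a matrix, so that the numeric columns are not coerced to character. (If you have already created the matrix, you can convert it to a data.frame with the first two lines of sotos' answer.)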

df <- data.frame(row_id = c("r1","r2","r3","r4","r1","r2","r3","r4"),
                v1 = c(3,2,5,2,5,2,6,4),
                v2 = c(4,3,5,3,7,4,6,7))
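
As a quick check of the coercion point above, here is a small sketch in base R. The names m_chr and df_chk are made up for this illustration so that they do not overwrite df.

```r
# cbind() on a character vector plus numeric vectors gives a character matrix
m_chr <- cbind(c("r1", "r2"), c(3, 2), c(4, 3))
is.character(m_chr)

# A data.frame stores each column separately, so the numbers stay numeric
df_chk <- data.frame(row_id = c("r1", "r2"),
                     v1 = c(3, 2),
                     v2 = c(4, 3))
is.numeric(df_chk$v1)
is.numeric(df_chk$v2)
```

All three checks should return TRUE, which is why the matrix version needs as.numeric() and the data.frame version does not.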

Now, if you convert the data.frame to a data.table with setDT, you can use data.table grouping (by = row_id sets the grouping):

library(data.table)
setDT(df)

df[, ratio := (v1 + v2)/sum(v1 + v2), by = row_id]

df
#    row_id v1 v2     ratio
# 1:     r1  3  4 0.3684211
# 2:     r2  2  3 0.4545455
# 3:     r3  5  5 0.4545455
# 4:     r4  2  3 0.3125000
# 5:     r1  5  7 0.6315789
# 6:     r2  2  4 0.5454545
# 7:     r3  6  6 0.5454545
# 8:     r4  4  7 0.6875000