How to change the direction of application of mutate across from column-wise to row-wise?

Suppose we start with the data frame below, generated by the code that follows it:

> data
  To A B C
1  A 1 3 5
2  B 2 4 6
3  C 4 5 7

data <- 
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  )

Now we add column and row totals, which gives the modified data frame below, generated by the code shown beneath it:

> data
   To A  B  C Sum
1   A 1  3  5   9
2   B 2  4  6  12
3   C 4  5  7  16
4 Sum 7 12 18  37

data <- data %>% 
  replace(is.na(.), 0) %>%
  bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
data <- cbind(data, Sum = rowSums(data[,-1]))

Finally, we express each element of data as a proportion of its column total:

   To         A         B         C       Sum
1   A 0.1428571 0.2500000 0.2777778 0.2432432
2   B 0.2857143 0.3333333 0.3333333 0.3243243
3   C 0.5714286 0.4166667 0.3888889 0.4324324
4 Sum 1.0000000 1.0000000 1.0000000 1.0000000

library(tidyverse)
data %>% mutate(across(-c(To), ~ ./.[To == "Sum"]))

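Question: how can the code above be modified so that it computes proportions of the row totals instead of the column totals? We would end up with the following proportions (computed by hand, so please forgive any small errors and don't worry about rounding):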

  To         A         B         C       Sum
1   A 0.1111111 0.3333333 0.5555555 1.0000000
2   B 0.1666666 0.3333333 0.5000000 1.0000000
3   C 0.2500000 0.3125000 0.4375000 1.0000000
4 Sum 0.5277771 0.9791666 1.4930555 3.0000000

Try this. (Note that row 4 also sums to 1 here, because the Sum row is divided by its own row total.)

library(tidyverse)

data <- 
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  )

data <- data %>% 
  replace(is.na(.), 0) %>%
  bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
data <- cbind(data, Sum = rowSums(data[,-1]))

data %>% 
  rowwise() %>%
  mutate(across(A:Sum, ~ sum(.) / Sum))
#> # A tibble: 4 × 5
#> # Rowwise: 
#>   To        A     B     C   Sum
#>   <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 A     0.111 0.333 0.556     1
#> 2 B     0.167 0.333 0.5       1
#> 3 C     0.25  0.312 0.438     1
#> 4 Sum   0.189 0.324 0.486     1

Created on 2022-05-04 by the reprex package (v2.0.1)
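
Because division in R is vectorized, the same row proportions can probably be obtained without `rowwise()` or `sum()`. The sketch below is one way to do that, not the answer's own method. It rebuilds the table with totals in a slightly different way (the helper names `totals` and `row_props` are illustrative only) and relies on `Sum` being the last column in `A:Sum`, so that `A`, `B` and `C` are divided by the original totals before `Sum` itself becomes 1.

```r
library(dplyr)

# Rebuild the table with totals (same values as in the question).
data <- data.frame(
  To = c("A", "B", "C"),
  A = c(1, 2, 4),
  B = c(3, 4, 5),
  C = c(5, 6, 7)
)
totals <- data.frame(To = "Sum", t(colSums(data[-1])))  # column totals as one row
data <- bind_rows(data, totals)
data$Sum <- rowSums(data[c("A", "B", "C")])             # row totals as one column

# Divide every numeric column by the row total held in Sum.
# Sum comes last in A:Sum, so it is only overwritten (Sum / Sum = 1)
# after A, B and C have been divided by the original totals.
row_props <- data %>%
  mutate(across(A:Sum, ~ .x / Sum))

print(row_props)
```

If `Sum` were not the last column in the selection, the columns after it would be divided by the already rescaled value, so in that case it would be safer to keep the totals in a separate vector.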

A possible solution, starting from the original three-row data (before any totals are added):

library(dplyr)

data %>% 
  mutate(Sum = rowSums(across(A:C))) %>% 
  mutate(across(A:Sum, ~ .x / Sum)) %>% 
  bind_rows(data.frame(To = "Sum", t(colSums(.[-1]))))

#>    To         A         B         C Sum
#> 1   A 0.1111111 0.3333333 0.5555556   1
#> 2   B 0.1666667 0.3333333 0.5000000   1
#> 3   C 0.2500000 0.3125000 0.4375000   1
#> 4 Sum 0.5277778 0.9791667 1.4930556   3
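
For comparison, base R treats the direction as a single argument: `prop.table()` takes `margin = 1` for rows and `margin = 2` for columns. The following is a minimal sketch under the assumption that we start from the original three-row data without totals; the names `m`, `row_props` and `col_props` are illustrative and do not appear in the answers above.

```r
# Original data, without the Sum row or Sum column.
data <- data.frame(
  To = c("A", "B", "C"),
  A = c(1, 2, 4),
  B = c(3, 4, 5),
  C = c(5, 6, 7)
)

# prop.table() works on a numeric matrix, so drop the label column
# and keep the labels as row names instead.
m <- as.matrix(data[c("A", "B", "C")])
rownames(m) <- data$To

col_props <- prop.table(m, margin = 2)  # each column sums to 1
row_props <- prop.table(m, margin = 1)  # each row sums to 1

print(row_props)
```

Switching between the two layouts is then only a matter of changing `margin`; totals can be appended afterwards with `rowSums()` and `colSums()` if they are needed in the output.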