How to change the direction of application of mutate(across()) from column-wise to row-wise?
Suppose we start with the data frame below, generated by the code that follows it:
> data
  To A B C
1  A 1 3 5
2  B 2 4 6
3  C 4 5 7
data <-
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  )
Now we add column and row totals, giving the modified data frame below, generated by the code that follows it:
> data
   To A  B  C Sum
1   A 1  3  5   9
2   B 2  4  6  12
3   C 4  5  7  16
4 Sum 7 12 18  37
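library(dplyr)
# Replace NAs with 0, then append a "Sum" row holding the column totals
data <- data %>%
  replace(is.na(.), 0) %>%
  bind_rows(summarise_all(., ~ (if (is.numeric(.)) sum(.) else "Sum")))
# Append a "Sum" column holding the row totals
data <- cbind(data, Sum = rowSums(data[, -1]))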
Finally, we express each element of data as a proportion of its column total:
   To         A         B         C       Sum
1   A 0.1428571 0.2500000 0.2777778 0.2432432
2   B 0.2857143 0.3333333 0.3333333 0.3243243
3   C 0.5714286 0.4166667 0.3888889 0.4324324
4 Sum 1.0000000 1.0000000 1.0000000 1.0000000
library(tidyverse)
data %>% mutate(across(-c(To), ~ ./.[To == "Sum"]))
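To make the lambda above easier to read, the same division can be written out for a single column. This is only a sketch: it hard-codes the 4-row data frame (totals included) shown earlier, and the names col_total_A and share_A are illustrative, not from the original post.

```r
# The data frame after the "Sum" row and "Sum" column have been added
data <- data.frame(
  To  = c("A", "B", "C", "Sum"),
  A   = c(1, 2, 4, 7),
  B   = c(3, 4, 5, 12),
  C   = c(5, 6, 7, 18),
  Sum = c(9, 12, 16, 37)
)

# Inside across(), `.` is one whole column, so .[To == "Sum"] picks that
# column's entry in the "Sum" row, i.e. the column total
col_total_A <- data$A[data$To == "Sum"]

# Dividing the column by its own total gives the column-wise proportions
share_A <- data$A / col_total_A
```

Because across() hands the lambda one column at a time, the denominator is always taken from within that column, which is why the result is column-wise.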
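Question: how can the code above be modified so that it computes each element as a proportion of its row total instead of its column total? We would end up with the following proportions (calculated by hand, so please forgive any small errors and don't worry about rounding):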
   To         A         B         C       Sum
1   A 0.1111111 0.3333333 0.5555555 1.0000000
2   B 0.1666666 0.3333333 0.5000000 1.0000000
3   C 0.2500000 0.3125000 0.4375000 1.0000000
4 Sum 0.5277771 0.9791666 1.4930555 3.0000000
Try this. (Note that row 4 also sums to 1.)
library(tidyverse)
data <-
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  )
data <- data %>%
  replace(is.na(.), 0) %>%
  bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
data <- cbind(data, Sum = rowSums(data[,-1]))
data %>%
  rowwise() %>%
  mutate(across(A:Sum, ~ sum(.) / Sum))
#> # A tibble: 4 × 5
#> # Rowwise:
#>   To        A     B     C   Sum
#>   <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 A     0.111 0.333 0.556     1
#> 2 B     0.167 0.333 0.5       1
#> 3 C     0.25  0.312 0.438     1
#> 4 Sum   0.189 0.324 0.486     1
Created on 2022-05-04 by the reprex package (v2.0.1)
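The rowwise() step may not be strictly necessary here, since division in R is already vectorised down the rows of a column. A minimal sketch under that assumption, rebuilding the same totals with base R (the names totals and row_share are illustrative and do not appear in the answer above):

```r
library(dplyr)

data <- data.frame(
  To = c("A", "B", "C"),
  A = c(1, 2, 4),
  B = c(3, 4, 5),
  C = c(5, 6, 7)
)

# Append the column totals as a "Sum" row, then the row totals as a "Sum" column
totals <- rbind(data, data.frame(To = "Sum", t(colSums(data[-1]))))
totals$Sum <- rowSums(totals[-1])

# Divide A, B and C by the row total first; Sum is rescaled last so that
# it still holds the raw total while the other columns are being divided
row_share <- totals %>%
  mutate(across(A:C, ~ .x / Sum), Sum = Sum / Sum)
```

Row 4 of row_share is then 7/37, 12/37 and 18/37, which agrees with the output above rather than with the hand-computed sums in the question.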
A possible solution:
library(dplyr)
# `data` here is the original 3-row data frame, before any totals are added
data %>%
  mutate(Sum = rowSums(across(A:C))) %>%
  mutate(across(A:Sum, ~ .x / Sum)) %>%
  bind_rows(data.frame(To = "Sum", t(colSums(.[-1]))))
#>    To         A         B         C Sum
#> 1   A 0.1111111 0.3333333 0.5555556   1
#> 2   B 0.1666667 0.3333333 0.5000000   1
#> 3   C 0.2500000 0.3125000 0.4375000   1
#> 4 Sum 0.5277778 0.9791667 1.4930556   3
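For comparison, base R exposes the row-versus-column choice directly through the margin argument of prop.table(). The sketch below works on the original 3-row data and assumes a plain numeric matrix is an acceptable result; m, row_share and col_share are illustrative names.

```r
data <- data.frame(
  To = c("A", "B", "C"),
  A = c(1, 2, 4),
  B = c(3, 4, 5),
  C = c(5, 6, 7)
)

# Keep only the numeric columns and label the rows with the To values
m <- as.matrix(data[-1])
rownames(m) <- data$To

# margin = 1 divides each entry by its row total (row-wise proportions)
row_share <- prop.table(m, margin = 1)

# margin = 2 divides each entry by its column total (column-wise proportions)
col_share <- prop.table(m, margin = 2)
```

Switching between the two directions is then a one-argument change, which can serve as a quick cross-check on the dplyr results.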