Create a Pivot Table of Two Categorical and Numerical Variables
I have the following hypothetical data frame:
Region <- c("District A", "District B","District A","District A","District B")
Gender <- c("Male","Male","Female", "Male","Female")
Age <- c(20, 21, 23, 34, 22)
AmountSold <- c(50, 10, 20, 4, 12)
RegionSales <- data.frame(Region, Gender, Age, AmountSold)
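I would like to create a pivot table (or any table) showing the mean amount sold for each gender and region, together with the mean age for each gender and region. How can I do this in R?
This is how I would approach it with the dplyr package: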
library(dplyr)
RegionSales %>%
group_by(Region, Gender) %>%
summarize(mean_age = mean(Age), mean_amount = mean(AmountSold))
Output:
# A tibble: 4 x 4
# Groups: Region [2]
  Region     Gender mean_age mean_amount
  <chr>      <chr>     <dbl>       <dbl>
1 District A Female       23          20
2 District A Male         27          27
3 District B Female       22          12
4 District B Male         21          10
An option that ignores NA values:
RegionSales %>%
group_by(Region, Gender) %>%
summarize(mean_age = mean(Age, na.rm = TRUE),
          mean_amount = mean(AmountSold, na.rm = TRUE))
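To see why na.rm matters here, the following minimal sketch compares the two calls on a small vector. The vector `ages` is made up for illustration and is not part of the question's data.

```r
# Hypothetical ages with one missing value (not from the original data)
ages <- c(20, NA, 34)

# Without na.rm, a single NA makes the whole mean NA
mean_with_na <- mean(ages)

# With na.rm = TRUE, the NA is dropped before averaging: (20 + 34) / 2
mean_without_na <- mean(ages, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
```

The same applies per group inside summarize: any group containing an NA would otherwise report NA for that column.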
With dplyr, another option is to specify the variables in across():
library(dplyr)
RegionSales %>%
group_by(Region, Gender) %>%
summarise(across(c(Age, AmountSold),
~ mean(., na.rm = TRUE), .names = "mean_{.col}"))
A base R option using aggregate may also help:
> aggregate(. ~ Region + Gender, RegionSales, mean)
      Region Gender Age AmountSold
1 District A Female  23         20
2 District B Female  22         12
3 District A   Male  27         27
4 District B   Male  21         10
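If a spreadsheet-style layout is wanted, with regions as rows and genders as columns, one possible base R sketch uses tapply, producing one matrix per numeric variable. The names `sales_pivot` and `age_pivot` are chosen here for illustration and do not appear in the answers above.

```r
# Same hypothetical data frame as in the question
Region <- c("District A", "District B", "District A", "District A", "District B")
Gender <- c("Male", "Male", "Female", "Male", "Female")
Age <- c(20, 21, 23, 34, 22)
AmountSold <- c(50, 10, 20, 4, 12)
RegionSales <- data.frame(Region, Gender, Age, AmountSold)

# Rows are regions, columns are genders, cells hold the group mean
sales_pivot <- with(RegionSales,
                    tapply(AmountSold, list(Region = Region, Gender = Gender), mean))
age_pivot <- with(RegionSales,
                  tapply(Age, list(Region = Region, Gender = Gender), mean))

print(sales_pivot)
print(age_pivot)
```

Unlike the long-format results above, this gives a separate Region by Gender matrix for each variable, which may be closer to what a spreadsheet pivot table shows.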