dplyr group_by and summarise, but keep non-numeric variables
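I have a dataset in long format in which I sum values for different groups. Some of the variables are factor variables and should be kept in the result.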
library(dplyr)

mtcars$model <- as.factor(rownames(mtcars))
longmtcars <- rbind(mtcars, mtcars, mtcars)
longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")
result <- longmtcars %>%
  group_by(factor(model)) %>%
  summarise_if(is.numeric, sum)
result
# A tibble: 32 x 11
  `factor(model)`      mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
  <fct>              <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AMC Javelin         45.6    24   912   450  9.45  10.3  51.9     0     9     6
2 Cadillac Fleetwood  31.2    24  1416   615  8.79  15.8  53.9     0     9    12
3 Camaro Z28          39.9    24  1050   735  11.2  11.5  46.2     0     9    12
4 Chrysler Imperial   44.1    24  1320   690  9.69  16.0  52.3     0     9    12
5 Datsun 710          68.4    12   324   279  11.6  6.96  55.8     3    12     3
My current solution, which does not scale:
#ugly solution
vsvar <- longmtcars[1:32, "vs"]
result <- cbind(result, vsvar)
result
       factor(model)  mpg cyl   disp  hp  drat     wt  qsec am gear carb vsvar
1        AMC Javelin 45.6  24  912.0 450  9.45 10.305 51.90  0    9    6    No
2 Cadillac Fleetwood 31.2  24 1416.0 615  8.79 15.750 53.94  0    9   12    No
3         Camaro Z28 39.9  24 1050.0 735 11.19 11.520 46.23  0    9   12   Yes
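This produces the kind of result I want, but it is really ugly. I am going to use it in a Shiny app, where it will cause trouble, so the current approach is not an option. Is there an all-in-one solution? A data.table solution would be fine too, although I am not very familiar with it.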
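One caveat about the positional `cbind` above, offered as a sketch rather than as part of the original post: the summarised rows come back in factor-level (alphabetical) order, while `longmtcars[1:32, "vs"]` is in the original row order of `mtcars`, so the two need not line up. The check below uses only base R; `mtcars2`, `vs_by_model` and `mismatch` are names made up for illustration.

```r
# Rebuild the example data on a copy so the built-in mtcars stays untouched
mtcars2 <- mtcars
mtcars2$model <- as.factor(rownames(mtcars2))
longmtcars <- rbind(mtcars2, mtcars2, mtcars2)
longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

# Positional column, as in the question (original mtcars row order)
vsvar <- longmtcars[1:32, "vs"]

# Row order of the grouped result: the factor levels of model
grouped_models <- levels(longmtcars$model)

# The true vs value for each model, looked up by name
vs_by_model <- setNames(vsvar, as.character(longmtcars$model[1:32]))

# TRUE wherever the positional value differs from the value looked up by name
mismatch <- vs_by_model[grouped_models] != vsvar
any(mismatch)
```

If `any(mismatch)` is `TRUE`, at least one model would get another model's `vs` value from the `cbind`, which is another reason to carry `vs` through the grouping step instead.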
You can add that variable (or those variables) to the group_by clause:
result <- longmtcars %>%
  mutate_if(is.character, factor) %>%
  group_by(model, vs) %>%
  summarise_if(is.numeric, sum)
result
#> # A tibble: 32 x 12
#> # Groups:   model [32]
#>   model              vs      mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
#>   <fct>              <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 AMC Javelin        No     45.6    24   912   450  9.45  10.3  51.9     0     9     6
#> 2 Cadillac Fleetwood No     31.2    24  1416   615  8.79  15.8  53.9     0     9    12
#> 3 Camaro Z28         No     39.9    24  1050   735  11.2  11.5  46.2     0     9    12
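In recent dplyr releases the scoped verbs such as `summarise_if()` are superseded by `across()`. The following is a minimal sketch of the same idea, assuming dplyr 1.0.0 or later; `mtcars2`, `by_group` and `by_first` are names made up for illustration. The second variant additionally assumes that `vs` is constant within each `model`, so it keeps the first value per group instead of grouping by it.

```r
library(dplyr)

# Rebuild the example data on a copy so the built-in mtcars stays untouched
mtcars2 <- mtcars
mtcars2$model <- as.factor(rownames(mtcars2))
longmtcars <- rbind(mtcars2, mtcars2, mtcars2)
longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

# Variant 1: group by every variable that should survive, sum the numeric ones
by_group <- longmtcars %>%
  group_by(model, vs) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

# Variant 2: group by model only and keep the first value of each
# character column (only sensible if it is constant within a group)
by_first <- longmtcars %>%
  group_by(model) %>%
  summarise(across(where(is.numeric), sum),
            across(where(is.character), first),
            .groups = "drop")
```

Variant 1 scales to any number of descriptive columns by listing them in `group_by()`. Variant 2 avoids creating extra groups when a descriptive column happens to vary within a model.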
In base R you can use aggregate:
result <- aggregate(as.matrix(longmtcars[sapply(longmtcars, is.numeric)]) ~ model + vs,
                    data = longmtcars, FUN = sum)
head(result)
#                model vs  mpg cyl disp  hp  drat     wt  qsec am gear carb
# 1        AMC Javelin No 45.6  24  912 450  9.45 10.305 51.90  0    9    6
# 2 Cadillac Fleetwood No 31.2  24 1416 615  8.79 15.750 53.94  0    9   12
# 3         Camaro Z28 No 39.9  24 1050 735 11.19 11.520 46.23  0    9   12
# 4  Chrysler Imperial No 44.1  24 1320 690  9.69 16.035 52.26  0    9   12
# 5   Dodge Challenger No 46.5  24  954 450  8.28 10.560 50.61  0    9    6
# 6         Duster 360 No 42.9  24 1080 735  9.63 10.710 47.52  0    9   12
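When every column other than the grouping variables is numeric, as is the case here, the formula can probably be shortened with the `.` placeholder, which stands for all remaining columns. A minimal sketch under that assumption; `mtcars2` and `result_dot` are names made up for illustration.

```r
# Rebuild the example data on a copy so the built-in mtcars stays untouched
mtcars2 <- mtcars
mtcars2$model <- as.factor(rownames(mtcars2))
longmtcars <- rbind(mtcars2, mtcars2, mtcars2)
longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

# "." on the left-hand side means "all columns not named on the right";
# this only works if all of those columns can be passed to sum()
result_dot <- aggregate(. ~ model + vs, data = longmtcars, FUN = sum)
head(result_dot)
```

If the data also contained non-numeric columns that are not grouping variables, `sum` would fail on them, and the explicit numeric selection shown above would be the safer choice.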