Not getting subtotals when grouping in R
I need a subtotal of career strikeouts each time the player changes.
I tried the code below, but I am not getting subtotals.
player <- c('acostma01', 'acostma01', 'acostma01', 'adkinjo01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01')
year <- c(2010,2011,2012,2007,1985,1986,1987,1988,1989)
games <- c(41,44,45,1,21,28,18,11,36)
strikeouts <- c(42,46,46,0,74,104,77,16,80)
bb_data <- data.frame(player, year, games, strikeouts, stringsAsFactors = FALSE)
Here is the code that doesn't work:
mets <- select(bb_data, player, year, games, strikeouts) %>%
  group_by(player, year) %>%
  colSums(SO)
This is the output I am trying to get:
player       games  strikeouts
acostma01      130         134
adkinjo01        1           0
aguilri01        0         351
Grand Total                485
This is what I get (tail of the data):
player team year games strikouts
<chr> <chr> <int> <int> <int>
swarzan01 NYN 2018 29 31
syndeno01 NYN 2018 25 155
vargaja01 NYN 2018 20 84
wahlbo01 NYN 2018 7 7
wheelza01 NYN 2018 29 179
zamorda01 NYN 2018 16 16
You can do this:
library(tidyverse)
bb_data %>%
  group_by(player) %>%
  summarise_at(vars(games, strikeouts), sum) %>%
  add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))
This gives you:
# A tibble: 4 x 3
player games strikeouts
<chr> <dbl> <dbl>
1 acostma01 130 134
2 adkinjo01 1 0
3 aguilri01 114 351
4 Grand Total NA 485
This matches your expected output for every value except games for aguilri01. I assume that was a typo, but let me know if that is not the case.
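In dplyr 1.0.0 and later, summarise_at() is superseded by across(). As a minimal sketch of the same per-player subtotal in that style, assuming a recent dplyr is installed (the names totals and result are only illustrative, and bind_rows() is swapped in for add_row()):

```r
library(dplyr)

# Same sample data as in the question
bb_data <- data.frame(
  player = c('acostma01', 'acostma01', 'acostma01', 'adkinjo01',
             'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01'),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

# One row per player with career totals for games and strikeouts
totals <- bb_data %>%
  group_by(player) %>%
  summarise(across(c(games, strikeouts), sum))

# Append a grand-total row; games is left as NA, as in the answer above
result <- bind_rows(
  totals,
  data.frame(player = 'Grand Total', games = NA_real_,
             strikeouts = sum(totals$strikeouts))
)

print(result)
```

Keeping the per-player totals in their own object avoids relying on the `.` placeholder inside the pipe, which may be easier to read when the total row needs more than one computed value.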
For descending order, you can do:
bb_data %>%
  group_by(player) %>%
  summarise_at(vars(games, strikeouts), sum) %>%
  arrange(-strikeouts) %>%
  add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))
Output:
# A tibble: 4 x 3
player games strikeouts
<chr> <dbl> <dbl>
1 aguilri01 114 351
2 acostma01 130 134
3 adkinjo01 1 0
4 Grand Total NA 485
To also include the number of seasons played, you can try:
bb_data %>%
  group_by(player) %>%
  mutate(seasons_played = n_distinct(year)) %>%
  group_by(player, seasons_played) %>%
  summarise_at(vars(games, strikeouts), sum) %>%
  arrange(-strikeouts) %>%
  ungroup() %>%
  add_row(player = 'Grand Total', games = NA, seasons_played = NA, strikeouts = sum(.$strikeouts))
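The seasons count can also be computed inside a single summarise() call, which avoids the mutate() and second group_by(). This is only a sketch of that variation, assuming the same sample data as the question; career and with_total are illustrative names, not part of the original answer:

```r
library(dplyr)

# Same sample data as in the question
bb_data <- data.frame(
  player = c('acostma01', 'acostma01', 'acostma01', 'adkinjo01',
             'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01'),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

# Count distinct years and sum the stats in one pass per player,
# then sort by career strikeouts, highest first
career <- bb_data %>%
  group_by(player) %>%
  summarise(
    seasons_played = n_distinct(year),
    games = sum(games),
    strikeouts = sum(strikeouts)
  ) %>%
  arrange(desc(strikeouts))

# Append the grand-total row after sorting so it stays at the bottom
with_total <- bind_rows(
  career,
  data.frame(player = 'Grand Total', seasons_played = NA_integer_,
             games = NA_real_, strikeouts = sum(career$strikeouts))
)

print(with_total)
```

With the sample data this should report 5 seasons for aguilri01, 3 for acostma01 and 1 for adkinjo01.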
If you don't mind the year column being summed as well, you can do this:
library(data.table)
data = setDT(bb_data)[, c(lapply(.SD, sum), .N), by =player]
.N counts the rows for each player, which here is the number of years played.
Then you can sort it (the - makes the order descending):
data[order(-data$strikeouts)]
You get this result:
1: aguilri01 9935 114 351 5
2: acostma01 6033 130 134 3
3: adkinjo01 2007 1 0 1
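The second column in that result (9935, 6033, 2007) is the sum of the years, which carries no meaning. If that column is unwanted, one option is to restrict .SD with .SDcols. The following is a sketch under the assumption that only games and strikeouts should be summed; the column name seasons and the objects dt, career and career_sorted are illustrative. It uses as.data.table() rather than setDT() so that bb_data itself is left unchanged:

```r
library(data.table)

# Same sample data as in the question
bb_data <- data.frame(
  player = c('acostma01', 'acostma01', 'acostma01', 'adkinjo01',
             'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01'),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

# Work on a copy so the original data.frame is not converted in place
dt <- as.data.table(bb_data)

# Sum only the columns named in .SDcols and add a named row count per player
career <- dt[, c(lapply(.SD, sum), list(seasons = .N)),
             by = player, .SDcols = c('games', 'strikeouts')]

# Sort by career strikeouts, highest first
career_sorted <- career[order(-strikeouts)]

print(career_sorted)
```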