Not getting subtotals when grouping in R

I need a subtotal of career strikeouts for each player, produced every time the player changes.

I tried the code below, but I am not getting the subtotals.

player <- c('acostma01', 'acostma01', 'acostma01', 'adkinjo01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01')
year <- c(2010,2011,2012,2007,1985,1986,1987,1988,1989)
games <- c(41,44,45,1,21,28,18,11,36)
strikeouts <- c(42,46,46,0,74,104,77,16,80)
bb_data <- data.frame(player, year, games, strikeouts, stringsAsFactors = FALSE)

This is the code that does not work:

mets <- select(bb_data, player, year, games, strikeouts) %>% 
group_by(player, year) %>% 
colSums(SO)

This is the output I would like to get:

player      games strikeouts
acostma01   130   134
adkinjo01   1     0
aguilri01   0     351
Grand Total       485

This is what I get (tail of the data):

player    team    year  games strikouts
<chr>     <chr>   <int> <int> <int>
swarzan01 NYN      2018    29    31
syndeno01 NYN      2018    25   155
vargaja01 NYN      2018    20    84
wahlbo01  NYN      2018     7     7
wheelza01 NYN      2018    29   179
zamorda01 NYN      2018    16    16

You can do this:

library(tidyverse)

bb_data %>% 
  group_by(player) %>% 
  summarise_at(vars(games, strikeouts), sum) %>%
  add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))

This gives you:

# A tibble: 4 x 3
  player      games strikeouts
  <chr>       <dbl>      <dbl>
1 acostma01     130        134
2 adkinjo01       1          0
3 aguilri01     114        351
4 Grand Total    NA        485

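This matches your expected output for every value except the games total for aguilri01 (114 here versus 0 in your table). I assume that is a typo, but let me know if it is not.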
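For comparison, the same subtotal-plus-grand-total table can be sketched in base R without any packages. This is only an illustration on the sample data from the question; the names subtotals, grand_total and result are introduced here and are not part of the original answer.

```r
# Sample data from the question
bb_data <- data.frame(
  player = c("acostma01", "acostma01", "acostma01", "adkinjo01",
             "aguilri01", "aguilri01", "aguilri01", "aguilri01", "aguilri01"),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

# Per-player subtotals of games and strikeouts
subtotals <- aggregate(cbind(games, strikeouts) ~ player, data = bb_data, FUN = sum)
subtotals$player <- as.character(subtotals$player)  # guard against factor grouping columns

# Grand-total row: games left as NA, strikeouts summed over all players
grand_total <- data.frame(
  player = "Grand Total",
  games = NA,
  strikeouts = sum(subtotals$strikeouts),
  stringsAsFactors = FALSE
)

# Stack the subtotals and the grand-total row
result <- rbind(subtotals, grand_total)
print(result)
```

aggregate() sorts the groups by player, so the rows come out in the same order as in the dplyr output above.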

For descending order, you can do this:

bb_data %>% 
  group_by(player) %>% 
  summarise_at(vars(games, strikeouts), sum) %>%
  arrange(-strikeouts) %>%
  add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))

Output:

# A tibble: 4 x 3
  player      games strikeouts
  <chr>       <dbl>      <dbl>
1 aguilri01     114        351
2 acostma01     130        134
3 adkinjo01       1          0
4 Grand Total    NA        485

To also include the number of seasons played, you could try:

bb_data %>% 
  group_by(player) %>% 
  mutate(seasons_played = n_distinct(year)) %>%
  group_by(player, seasons_played) %>%
  summarise_at(vars(games, strikeouts), sum) %>% 
  arrange(-strikeouts) %>%
  ungroup() %>%
  add_row(player = 'Grand Total', games = NA, seasons_played = NA, strikeouts = sum(.$strikeouts))
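As a possible variation, the season count can be computed directly inside summarise(), which avoids the second group_by(). The sketch below assumes dplyr 1.0.0 or later, where across() supersedes summarise_at(); the names result and total_row are introduced here for illustration only.

```r
library(dplyr)

# Sample data from the question
bb_data <- data.frame(
  player = c("acostma01", "acostma01", "acostma01", "adkinjo01",
             "aguilri01", "aguilri01", "aguilri01", "aguilri01", "aguilri01"),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

result <- bb_data %>%
  group_by(player) %>%
  summarise(
    seasons_played = n_distinct(year),   # distinct years per player
    across(c(games, strikeouts), sum)    # subtotals per player
  ) %>%
  arrange(desc(strikeouts))

# Grand-total row; columns not supplied here are filled with NA by bind_rows()
total_row <- tibble(player = "Grand Total", strikeouts = sum(result$strikeouts))
result <- bind_rows(result, total_row)
print(result)
```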

If you don't mind the year column being summed as well, you can do this:

library(data.table)
data <- setDT(bb_data)[, c(lapply(.SD, sum), .N), by = player]

.N gives you the number of rows per player (that is, the number of years).

Then you can order it (use - to make it descending):

data[order(-data$strikeouts)]

You get this result:

      player year games strikeouts N
1: aguilri01 9935   114        351 5
2: acostma01 6033   130        134 3
3: adkinjo01 2007     1          0 1
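If the summed year column is unwanted, one option is to restrict .SD to the columns of interest with .SDcols. This is a minimal sketch on the sample data from the question; the column name seasons and the variable names dt and res are chosen here for illustration, and as.data.table() is used instead of setDT() so that bb_data is not modified by reference.

```r
library(data.table)

# Sample data from the question
bb_data <- data.frame(
  player = c("acostma01", "acostma01", "acostma01", "adkinjo01",
             "aguilri01", "aguilri01", "aguilri01", "aguilri01", "aguilri01"),
  year = c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989),
  games = c(41, 44, 45, 1, 21, 28, 18, 11, 36),
  strikeouts = c(42, 46, 46, 0, 74, 104, 77, 16, 80),
  stringsAsFactors = FALSE
)

dt <- as.data.table(bb_data)

# Sum only games and strikeouts per player; .N counts the rows (years) per player
res <- dt[, c(lapply(.SD, sum), list(seasons = .N)),
          by = player, .SDcols = c("games", "strikeouts")]

# Sort by strikeouts, descending
res <- res[order(-strikeouts)]
print(res)
```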