Collapse rows by value in a column
I have a data frame of NBA player statistics collected from basketball-reference.com that looks like this:
Player | Pos | Team | Games | Min | Points
Alex Abrines | SG | OKC | 68 | 15.5 | 6.0
Quincy Acy | PF | TOT | 38 | 14.7 | 5.8
Quincy Acy | PF | DAL | 6 | 8.0 | 2.2
Quincy Acy | PF | BRK | 32 | 15.9 | 6.5
Steven Adams | C | OKC | 80 | 29.9 | 11.3
Arron Afflalo | SG | SAC | 61 | 25.9 | 8.4
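Players who played for the same team the whole season (like Abrines, Adams, and Afflalo) appear only once. However, if a player played for more than one team (like Quincy Acy), the data frame contains one row for each team he played for, plus another row with "TOT" (total) in the Team column.
I would like to get back a data frame with exactly one row per player: the "TOT" row for players who have one, with their other rows removed. I'm a bit stuck on this.
The most sensible approach seems to be to work with the rows that have "TOT" in the Team column. One other thing that is always true of the total row, for players who have one, is that its Games value is higher than the Games values in that player's other rows.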
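We can do a grouped filter: within each player, keep the row where Team is "TOT", or the only row if the player has just one.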
library(dplyr)
df1 %>%
  group_by(Player, Pos) %>%
  filter(Team == "TOT" | n() == 1)
# A tibble: 4 x 6
# Groups: Player, Pos [4]
# Player Pos Team Games Min Points
# <chr> <chr> <chr> <int> <dbl> <dbl>
#1 Alex Abrines SG OKC 68 15.5 6.0
#2 Quincy Acy PF TOT 38 14.7 5.8
#3 Steven Adams C OKC 80 29.9 11.3
#4 Arron Afflalo SG SAC 61 25.9 8.4
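The question also notes that a player's "TOT" row always has the highest Games value among that player's rows. A minimal sketch of that alternative keeps the row with the maximum Games in each group. It assumes the observation holds for every traded player (each per-team Games value is positive, so the total is strictly larger). The small `stats` data frame and the name `by_games` are hypothetical stand-ins, not part of the original data.

```r
library(dplyr)

# Hypothetical stand-in for df1, using a subset of the example columns
stats <- data.frame(
  Player = c("Alex Abrines", "Quincy Acy", "Quincy Acy", "Quincy Acy", "Steven Adams"),
  Pos    = c("SG", "PF", "PF", "PF", "C"),
  Team   = c("OKC", "TOT", "DAL", "BRK", "OKC"),
  Games  = c(68L, 38L, 6L, 32L, 80L),
  stringsAsFactors = FALSE
)

# Keep, per player, the single row with the largest Games value.
# which.max() returns the index of the first maximum, so exactly one
# row survives in each group.
by_games <- stats %>%
  group_by(Player, Pos) %>%
  slice(which.max(Games)) %>%
  ungroup()
```

Unlike the `Team == "TOT"` filter, this does not depend on the label itself, but it would silently pick the wrong row if the Games assumption ever failed, so the label-based filter is probably the safer default.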
A similar approach with data.table is
library(data.table)
setDT(df1)[, .SD[Team=="TOT"|.N==1], .(Player, Pos)]
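If neither package is available, the same condition can probably be written in base R. The sketch below counts each player's rows with `ave()` and then subsets on "Team is TOT, or the player has only one row". The `stats` data frame and the names `key`, `rows_per_player`, and `collapsed` are hypothetical and chosen only for illustration.

```r
# Hypothetical stand-in for df1, using a subset of the example columns
stats <- data.frame(
  Player = c("Alex Abrines", "Quincy Acy", "Quincy Acy", "Quincy Acy", "Steven Adams"),
  Pos    = c("SG", "PF", "PF", "PF", "C"),
  Team   = c("OKC", "TOT", "DAL", "BRK", "OKC"),
  Games  = c(68L, 38L, 6L, 32L, 80L),
  stringsAsFactors = FALSE
)

# One grouping key per Player/Pos combination
key <- paste(stats$Player, stats$Pos, sep = "|")

# Number of rows each player has, repeated on every one of that player's rows
rows_per_player <- ave(seq_along(key), key, FUN = length)

# Keep the "TOT" row for traded players and the only row for everyone else
collapsed <- stats[stats$Team == "TOT" | rows_per_player == 1, ]
```

Here `collapsed` keeps the original row order and row names, which may differ from the grouped output of the dplyr and data.table versions.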
data
df1 <- structure(list(Player = c("Alex Abrines", "Quincy Acy", "Quincy Acy",
"Quincy Acy", "Steven Adams", "Arron Afflalo"), Pos = c("SG",
"PF", "PF", "PF", "C", "SG"), Team = c("OKC", "TOT", "DAL", "BRK",
"OKC", "SAC"), Games = c(68L, 38L, 6L, 32L, 80L, 61L), Min = c(15.5,
14.7, 8, 15.9, 29.9, 25.9), Points = c(6, 5.8, 2.2, 6.5, 11.3,
8.4)), .Names = c("Player", "Pos", "Team", "Games", "Min", "Points"
), class = "data.frame", row.names = c(NA, -6L))