data.table: get the values in a column conditional on the results of another column
library(data.table)
df = data.table(
  ID = c("A","B","C","D","E","F","G"),
  price = c(100,101,102,103,104,102,101),
  ID2 = c("a","b","b","b","c","c","c"))
df
#    ID price ID2
# 1:  A   100   a
# 2:  B   101   b
# 3:  C   102   b
# 4:  D   103   b
# 5:  E   104   c
# 6:  F   102   c
# 7:  G   101   c
Given the example above, I want to get, for each ID2 group, the ID of the row with the highest price.
My output should look like this:
# ID2 V1 ID
#1: a 100 A
#2: b 103 D
#3: c 104 E
You can do it like this:
library(data.table)
df[, .SD[which.max(price)], by=ID2]
# ID2 ID price
#1: a A 100
#2: b D 103
#3: c E 104
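The result above has the columns ID2, ID, price, while the question asked for ID2, V1, ID. One way to get that exact layout is to build the columns explicitly instead of subsetting .SD. This is only a sketch; the name `res` is introduced here for illustration. Note that which.max() returns the first maximum, so a group with tied prices still yields a single row.

```r
library(data.table)

df <- data.table(
  ID    = c("A","B","C","D","E","F","G"),
  price = c(100,101,102,103,104,102,101),
  ID2   = c("a","b","b","b","c","c","c"))

# For each ID2 group, keep the maximum price (named V1 to match the
# requested output) and the ID of the row where that maximum occurs.
res <- df[, .(V1 = max(price), ID = ID[which.max(price)]), by = ID2]
res
#    ID2  V1 ID
# 1:   a 100  A
# 2:   b 103  D
# 3:   c 104  E
```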
With dplyr, you would do:
library(dplyr)
df %>%
group_by(ID2) %>%
slice_max(price, n = 1) %>%
select(ID2, V1 = price, ID)
# ID2 V1 ID
# <chr> <dbl> <chr>
#1 a 100 A
#2 b 103 D
#3 c 104 E
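One difference between the two answers: slice_max() keeps all tied rows by default (with_ties = TRUE), whereas which.max() returns only the first maximum. The example data has no ties within a group, so both approaches agree here. If at most one row per group is required even when prices tie, with_ties = FALSE can be set. The sketch below uses hypothetical data (`df_ties`, with an extra row "H" that is not in the question) to show the option.

```r
library(dplyr)

# Hypothetical data: same as the question, plus a row "H" that ties
# with "E" for the highest price in group "c".
df_ties <- data.frame(
  ID    = c("A","B","C","D","E","F","G","H"),
  price = c(100,101,102,103,104,102,101,104),
  ID2   = c("a","b","b","b","c","c","c","c"))

res_one <- df_ties %>%
  group_by(ID2) %>%
  slice_max(price, n = 1, with_ties = FALSE) %>%  # at most one row per group
  ungroup() %>%
  select(ID2, V1 = price, ID)

nrow(res_one)  # 3: one row per ID2 group despite the tie in group "c"
```

With the default with_ties = TRUE, the same pipeline would return both tied rows for group "c".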