data.table: get the values in a column conditional on the results of another column

Question

library(data.table)

df = data.table(
  ID = c("A","B","C","D","E","F","G"),
  price = c(100,101,102,103,104,102,101),
  ID2 = c("a","b","b","b","c","c","c"))

df
#   ID price ID2
#1:  A   100   a
#2:  B   101   b
#3:  C   102   b
#4:  D   103   b
#5:  E   104   c
#6:  F   102   c
#7:  G   101   c

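Given the example above, I would like to get the ID that corresponds to the highest price within each ID2 group. My output should look like this: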

#   ID2   V1  ID
#1:   a  100   A
#2:   b  103   D
#3:   c  104   E

Answer 1

You can do this:

library(data.table)
df[, .SD[which.max(price)], by=ID2]

#   ID2 ID price
#1:   a  A   100
#2:   b  D   103
#3:   c  E   104
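
The result above has the right rows, but its columns are ID2, ID, price rather than the ID2, V1, ID layout shown in the question. A minimal sketch of one way to get that layout directly, assuming the same df as in the question (the name res is only illustrative):

```r
library(data.table)

df <- data.table(
  ID    = c("A", "B", "C", "D", "E", "F", "G"),
  price = c(100, 101, 102, 103, 104, 102, 101),
  ID2   = c("a", "b", "b", "b", "c", "c", "c")
)

# For each ID2 group, locate the highest price once, then return
# that price (named V1) together with the ID at the same position.
res <- df[, {
  i <- which.max(price)
  .(V1 = price[i], ID = ID[i])
}, by = ID2]

res
```

Because which.max() returns only the first position of the maximum, this yields exactly one row per group even when prices are tied.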

In dplyr you would do:

library(dplyr)
df %>% 
  group_by(ID2) %>% 
  slice_max(price, n = 1) %>% 
  select(ID2, V1 = price, ID)

#  ID2      V1 ID   
#  <chr> <dbl> <chr>
#1 a       100 A    
#2 b       103 D    
#3 c       104 E
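
One difference between the two approaches may matter if prices can be tied within a group: slice_max() keeps all tied rows by default, whereas which.max() keeps only the first. The sketch below uses hypothetical data (the tied table is not from the question) to show the with_ties argument:

```r
library(dplyr)

# Hypothetical data with a tie for the highest price in group "b".
tied <- tibble(
  ID    = c("A", "B", "C"),
  price = c(100, 103, 103),
  ID2   = c("a", "b", "b")
)

# Default behaviour: both tied rows of group "b" are kept.
keep_ties <- tied %>%
  group_by(ID2) %>%
  slice_max(price, n = 1) %>%
  ungroup()

# with_ties = FALSE limits the result to one row per group.
drop_ties <- tied %>%
  group_by(ID2) %>%
  slice_max(price, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(keep_ties)
nrow(drop_ties)
```

With the question's data there are no ties within a group, so both variants return the same three rows.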
