Only keep the minimum value of each group

I have the following data.table:

> library(data.table)
> dataz <- data.table(group = c("ZAS", "Car", rep("EEE", times = 3), rep("EEff", times = 2), rep("2133", times = 6), "EETTE"),
                    value = runif(14))
> dataz

    group      value
 1:   ZAS 0.27218511
 2:   Car 0.39520602
 3:   EEE 0.46775956
 4:   EEE 0.55071786
 5:   EEE 0.37529203
 6:  EEff 0.01471177
 7:  EEff 0.86282569
 8:  2133 0.20789336
 9:  2133 0.91272858
10:  2133 0.06315207
11:  2133 0.18178237
12:  2133 0.42354538
13:  2133 0.10176267
14: EETTE 0.88492458

I only want to keep the rows that hold the minimum value of each group.

The final data.table would look like this:

    group      value
 1:   ZAS 0.27218511
 2:   Car 0.39520602
 3:   EEE 0.37529203
 4:  EEff 0.01471177
 5:  2133 0.06315207
 6: EETTE 0.88492458

Thanks in advance.

Using .SD:

dataz[,.SD[value==min(value)],by=.(group)]
    group      value
   <char>      <num>
1:    ZAS 0.39590814
2:    Car 0.42591138
3:    EEE 0.07049145
4:   EEff 0.34670793
5:   2133 0.05702904
6:  EETTE 0.31071582
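
Note that `value == min(value)` keeps every row tied for the group minimum. If exactly one row per group is wanted, a common alternative is to collect row numbers with `.I` and `which.min`, which returns only the first minimum. Below is a minimal sketch on a small made-up table (`dt` and its values are illustrative assumptions, not data from the question):

```r
library(data.table)

# Hypothetical toy data with a tie for the minimum in group "a"
dt <- data.table(group = c("a", "a", "a", "b", "b"),
                 value = c(2, 1, 1, 5, 3))

# Keeps all rows tied for the group minimum (two rows for "a")
with_ties <- dt[, .SD[value == min(value)], by = group]

# Keeps only the first minimum per group; .I holds row numbers within dt
idx <- dt[, .I[which.min(value)], by = group]$V1
first_only <- dt[idx]
```

With continuous random values such as those from `runif`, ties are unlikely, so both forms should give the same rows in practice.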

Another option is dplyr's slice():

Sample code:

library(data.table)
library(dplyr)

dataz %>%
  group_by(group) %>%
  slice(which.min(value))

Result:

  group   value
  <chr>   <dbl>
1 2133  0.00592
2 Car   0.418  
3 EEE   0.208  
4 EEff  0.719  
5 EETTE 0.963  
6 ZAS   0.769

Sample data:

dataz<-structure(list(group = c("ZAS", "Car", "EEE", "EEE", "EEE", "EEff", 
"EEff", "2133", "2133", "2133", "2133", "2133", "2133", "EETTE"
), value = c(0.711316933622584, 0.456328510772437, 0.838366007432342, 
0.556059248745441, 0.621371693909168, 0.0612441042903811, 0.391384622780606, 
0.986219455022365, 0.771872294368222, 0.54334409092553, 0.122617350192741, 
0.195616364479065, 0.705191325163469, 0.940613608341664)), row.names = c(NA, 
-14L), class = c("data.table", "data.frame"))



    group     value
 1:   ZAS 0.7113169
 2:   Car 0.4563285
 3:   EEE 0.8383660
 4:   EEE 0.5560592
 5:   EEE 0.6213717
 6:  EEff 0.0612441
 7:  EEff 0.3913846
 8:  2133 0.9862195
 9:  2133 0.7718723
10:  2133 0.5433441
11:  2133 0.1226174
12:  2133 0.1956164
13:  2133 0.7051913
14: EETTE 0.9406136
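
Because this table has only the `group` and `value` columns, a plain grouped aggregation should also give the per-group minimum. The sketch below rebuilds the sample data with `data.table()` (rather than `structure()`, purely for brevity) and is one possible approach, not necessarily the one used to produce the outputs shown earlier:

```r
library(data.table)

# Sample data from above, rebuilt with the data.table() constructor
dataz <- data.table(
  group = c("ZAS", "Car", rep("EEE", 3), rep("EEff", 2), rep("2133", 6), "EETTE"),
  value = c(0.711316933622584, 0.456328510772437, 0.838366007432342,
            0.556059248745441, 0.621371693909168, 0.0612441042903811,
            0.391384622780606, 0.986219455022365, 0.771872294368222,
            0.54334409092553, 0.122617350192741, 0.195616364479065,
            0.705191325163469, 0.940613608341664)
)

# One row per group, in order of first appearance, holding the minimum value
res <- dataz[, .(value = min(value)), by = group]
```

If the table had further columns that must be carried along with the minimum row, the row-filtering approaches above would be the better fit, since this aggregation returns only `group` and `value`.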