Only keep the minimum value of each group
I have the following data.table:
> dataz <- data.table(group = c("ZAS", "Car", rep("EEE", times = 3), rep("EEff", times = 2), rep("2133", times = 6), "EETTE"),
value = runif(14))
> dataz
group value
1: ZAS 0.27218511
2: Car 0.39520602
3: EEE 0.46775956
4: EEE 0.55071786
5: EEE 0.37529203
6: EEff 0.01471177
7: EEff 0.86282569
8: 2133 0.20789336
9: 2133 0.91272858
10: 2133 0.06315207
11: 2133 0.18178237
12: 2133 0.42354538
13: 2133 0.10176267
14: EETTE 0.88492458
I only want to keep the rows that hold the minimum value of each group.
The final data.table would look like this:
group value
1: ZAS 0.27218511
2: Car 0.39520602
3: EEE 0.37529203
4: EEff 0.01471177
5: 2133 0.06315207
6: EETTE 0.88492458
Thanks in advance.
With .SD:
dataz[,.SD[value==min(value)],by=.(group)]
group value
<char> <num>
1: ZAS 0.39590814
2: Car 0.42591138
3: EEE 0.07049145
4: EEff 0.34670793
5: 2133 0.05702904
6: EETTE 0.31071582
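One thing worth checking with this approach is how ties are handled: `value == min(value)` keeps every row that equals the group minimum, so a group can return more than one row. The sketch below uses a small made-up table (the names `dt`, `all_ties` and `first_only` and their values are assumptions for illustration, not the data from the question) and compares it with the common `.I` / `which.min` idiom, which keeps only the first minimum per group.

```r
library(data.table)

# Hypothetical toy data: group "a" has a tie at its minimum value
dt <- data.table(
  group = c("a", "a", "a", "b", "b"),
  value = c(2, 1, 1, 5, 3)
)

# Keeps every row equal to the group minimum, so both tied rows survive
all_ties <- dt[, .SD[value == min(value)], by = group]

# Collects one row index per group (first occurrence of the minimum),
# then subsets the original table by those indices
first_only <- dt[dt[, .I[which.min(value)], by = group]$V1]

print(all_ties)
print(first_only)
```

With `runif()` values ties are very unlikely, so both forms should give the same rows on the data in the question. The `.I` form also avoids building `.SD` for each group, which may matter on large tables.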
Another option is slice.
Sample code:
library(data.table)
library(dplyr)
dataz %>%
group_by(group) %>%
slice(which.min(value))
Result:
group value
<chr> <dbl>
1 2133 0.00592
2 Car 0.418
3 EEE 0.208
4 EEff 0.719
5 EETTE 0.963
6 ZAS 0.769
Sample data:
dataz<-structure(list(group = c("ZAS", "Car", "EEE", "EEE", "EEE", "EEff",
"EEff", "2133", "2133", "2133", "2133", "2133", "2133", "EETTE"
), value = c(0.711316933622584, 0.456328510772437, 0.838366007432342,
0.556059248745441, 0.621371693909168, 0.0612441042903811, 0.391384622780606,
0.986219455022365, 0.771872294368222, 0.54334409092553, 0.122617350192741,
0.195616364479065, 0.705191325163469, 0.940613608341664)), row.names = c(NA,
-14L), class = c("data.table", "data.frame"))
group value
1: ZAS 0.7113169
2: Car 0.4563285
3: EEE 0.8383660
4: EEE 0.5560592
5: EEE 0.6213717
6: EEff 0.0612441
7: EEff 0.3913846
8: 2133 0.9862195
9: 2133 0.7718723
10: 2133 0.5433441
11: 2133 0.1226174
12: 2133 0.1956164
13: 2133 0.7051913
14: EETTE 0.9406136
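As a possible variant of the slice approach, assuming dplyr 1.0.0 or later, `slice_min()` expresses the same idea and makes the tie behaviour explicit through `with_ties`. The data below is a made-up toy example (the names `df` and `one_per_group` are assumptions for illustration), not the sample data above.

```r
library(dplyr)

# Hypothetical toy data: group "a" has a tie at its minimum value
df <- tibble(
  group = c("a", "a", "a", "b", "b"),
  value = c(2, 1, 1, 5, 3)
)

# One row per group: the smallest value, dropping tied duplicates
one_per_group <- df %>%
  group_by(group) %>%
  slice_min(value, n = 1, with_ties = FALSE) %>%
  ungroup()

print(one_per_group)
```

Leaving `with_ties` at its default of `TRUE` would return both tied rows for group "a", which mirrors the `value == min(value)` filter in the .SD answer.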