R - choose a sample from each group of values in a data set
I have a data set that looks like this:
string_1,score,group
"sdfsd",0.546,0.5
"sdfsd",0.53,0.5
"sdfsd",0.52,0.5
"dgfbx",0.43,0.4
"dsgfgsd",0.48,0.4
"dsgfgsd",0.42,0.4
"dsgfgsd",0.84,0.8
"dsgfgsd",0.83,0.8
"dsgfgsd",0.82,0.8
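I want to sample from each group. That is, I want to take 2 random rows for each value of the group field: 0.4, 0.5, 0.8.
What is the simplest way to do this?
Thanks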
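You could try something like this, assuming your data frame is called dat. It splits your data by group and returns the sampled rows from each group.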
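set.seed(1)  # fix the seed so the sample is reproducible
# split dat by group, draw 2 random rows from each piece, then stack the pieces
res <- do.call(rbind, lapply(split(dat, dat$group), function(x) x[sample(nrow(x), 2), ]))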
> res
      string_1 score group
0.4.4    dgfbx  0.43   0.4
0.4.6  dsgfgsd  0.42   0.4
0.5.2    sdfsd  0.53   0.5
0.5.3    sdfsd  0.52   0.5
0.8.7  dsgfgsd  0.84   0.8
0.8.8  dsgfgsd  0.83   0.8
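If you want to run this from scratch, here is a minimal self-contained sketch of the same split/sample/rbind idea. It rebuilds the data from the question with read.csv, and wraps the sampling in a helper I have called sample_per_group (a hypothetical name, not something from base R). As an extra assumption beyond the original answer, it takes at most n rows, so a group with fewer than n rows does not raise an error. The exact rows drawn depend on the seed and on your R version, so they may not match the output above.

```r
# Rebuild the example data from the question.
dat <- read.csv(text = 'string_1,score,group
"sdfsd",0.546,0.5
"sdfsd",0.53,0.5
"sdfsd",0.52,0.5
"dgfbx",0.43,0.4
"dsgfgsd",0.48,0.4
"dsgfgsd",0.42,0.4
"dsgfgsd",0.84,0.8
"dsgfgsd",0.83,0.8
"dsgfgsd",0.82,0.8', stringsAsFactors = FALSE)

# Hypothetical helper: draw up to n random rows from each group of df.
sample_per_group <- function(df, group_col, n) {
  pieces <- split(df, df[[group_col]])      # one data frame per group value
  sampled <- lapply(pieces, function(x) {
    k <- min(n, nrow(x))                    # never ask for more rows than exist
    x[sample.int(nrow(x), k), , drop = FALSE]
  })
  do.call(rbind, sampled)                   # stack the pieces back together
}

set.seed(1)
res <- sample_per_group(dat, "group", 2)
print(res)
```

The row names of the result have the form group.original_row, which is how rbind labels rows coming from a named list; use rownames(res) <- NULL if you would rather drop them.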