Add a column that counts the number of rows until the first 1, by group, in R
I have the following dataset:
test_df <- data.frame(Group = c(1, 1, 1, 1, 2, 2), var1 = c(1, 0, 0, 1, 1, 1), var2 = c(0, 0, 1, 1, 0, 0), var3 = c(0, 1, 0, 0, 0, 1))
Group | var1 | var2 | var3 |
---|---|---|---|
1 | 1 | 0 | 0 |
1 | 0 | 0 | 1 |
1 | 0 | 1 | 0 |
1 | 1 | 1 | 0 |
2 | 1 | 0 | 0 |
2 | 1 | 0 | 1 |
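I want to add 3 columns (out1-3), one for each of var1-3, that give the number of rows within each group counted up to and including the first 1 (0 if the group contains no 1), as shown below: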
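Group | var1 | var2 | var3 | out1 | out2 | out3 |
---|---|---|---|---|---|---|
1 | 1 | 0 | 0 | 1 | 3 | 2 |
1 | 0 | 0 | 1 | 1 | 3 | 2 |
1 | 0 | 1 | 0 | 1 | 3 | 2 |
1 | 1 | 1 | 0 | 1 | 3 | 2 |
2 | 1 | 0 | 0 | 1 | 0 | 2 |
2 | 1 | 0 | 1 | 1 | 0 | 2 |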
I used the R code below and repeated it for each of my 3 variables, although my real dataset has many more than 3 columns. It does not work:
test_var1 <- select(test_df, Group, var1) %>%
  group_by(Group) %>%
  mutate(out1 = row_number()) %>%
  filter(var1 != 0) %>%
  slice(1)
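One likely reason the attempt falls short: `filter()` followed by `slice(1)` collapses each group to a single row and drops any group that has no 1, so nothing is added back to the original rows. The sketch below reproduces that for var2 and then joins the per-group result back. The names `attempt` and `fixed` are illustrative, and filling missing groups with 0 is an assumption taken from the expected table above.

```r
library(dplyr)

test_df <- data.frame(Group = c(1, 1, 1, 1, 2, 2),
                      var1 = c(1, 0, 0, 1, 1, 1),
                      var2 = c(0, 0, 1, 1, 0, 0),
                      var3 = c(0, 1, 0, 0, 0, 1))

# The attempt, applied to var2: at most one row per group survives
attempt <- test_df %>%
  select(Group, var2) %>%
  group_by(Group) %>%
  mutate(out2 = row_number()) %>%
  filter(var2 != 0) %>%
  slice(1) %>%
  ungroup()
# attempt holds a single row (Group 1, out2 = 3);
# Group 2 has no 1 in var2, so it is dropped entirely

# Join the per-group value back onto every row; groups without a 1 get 0
fixed <- test_df %>%
  left_join(select(attempt, Group, out2), by = "Group") %>%
  mutate(out2 = coalesce(out2, 0L))
```

This still has to be repeated per variable, which is why the answers below loop over or reshape the columns instead.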
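If you only have 3 "output" variables, you can write three lines like this:
# 1- Your dataset (a single group here)
df <- data.frame(Group = rep(1, 4), var1 = c(1, 0, 0, 1), var2 = c(0, 0, 1, 1), var3 = c(0, 1, 0, 0))
# 2- Row number of the first row whose value is 1
#    which() returns integer positions; min() on rownames() would compare strings ("10" < "2")
#    Note: a column with no 1 yields Inf with a warning
df$out1 <- min(which(df$var1 == 1))
df$out2 <- min(which(df$var2 == 1))
df$out3 <- min(which(df$var3 == 1))
If you have more than 3 columns, it is better to use a loop, for example:
for (i in 1:3) {
  df[[paste0("out", i)]] <- min(which(df[[paste0("var", i)]] == 1))
}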
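The loop above works on a single group. A minimal base R sketch of the same idea applied per group, using the question's `test_df`, could look like this. The helper `first_one` is a hypothetical name, and returning 0 when a group has no 1 is an assumption taken from the expected output in the question.

```r
test_df <- data.frame(Group = c(1, 1, 1, 1, 2, 2),
                      var1 = c(1, 0, 0, 1, 1, 1),
                      var2 = c(0, 0, 1, 1, 0, 0),
                      var3 = c(0, 1, 0, 0, 0, 1))

# Hypothetical helper: position of the first 1 in x, or 0 if x contains no 1
first_one <- function(x) match(1, x, nomatch = 0)

for (v in grep("^var", names(test_df), value = TRUE)) {
  out_name <- sub("^var", "out", v)  # var1 -> out1, var2 -> out2, ...
  # ave() applies first_one() within each Group and repeats the
  # single result on every row of that group
  test_df[[out_name]] <- ave(test_df[[v]], test_df$Group, FUN = first_one)
}

test_df
```

`match()` stops at the first hit, so it avoids the `Inf` that `min(which(...))` produces for a group without any 1.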
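library(dplyr)
library(tidyr)
df <- data.frame(Group = c(1, 1, 1, 1, 2, 2),
                 var1 = c(1, 0, 0, 1, 1, 1),
                 var2 = c(0, 0, 1, 1, 0, 0),
                 var3 = c(0, 1, 0, 0, 0, 1))
This works for any number of variables, as long as the structure matches the example (i.e. a Group column plus any number of 0/1 variables):
df %>%
  mutate(rownr = row_number()) %>%             # keep a row id so the reshape can be undone
  pivot_longer(-c(Group, rownr)) %>%           # one row per (original row, variable)
  group_by(Group, name) %>%
  mutate(out = cumsum(value != 1 & (cumsum(value) < 1)) + 1,  # count rows before the first 1, plus 1
         out = ifelse(max(out) > n(), 0, max(out))) %>%       # no 1 in the group: 0
  pivot_wider(names_from = name, values_from = c(value, out)) %>%
  select(-rownr)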
Returns:
Group value_var1 value_var2 value_var3 out_var1 out_var2 out_var3
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 0 0 1 3 2
2 1 0 0 1 1 3 2
3 1 0 1 0 1 3 2
4 1 1 1 0 1 3 2
5 2 1 0 0 1 0 2
6 2 1 0 1 1 0 2
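To see why the two `mutate()` steps give the row count, it may help to run the same logic on a single vector. This is only a sketch: `rows_until_first_one` is a hypothetical helper that mirrors the pipeline for one group and one variable, not part of the answer.

```r
# Hypothetical helper mirroring the two mutate() steps for one group/variable
rows_until_first_one <- function(value) {
  # (value != 1 & cumsum(value) < 1) is TRUE for every row before the first 1;
  # the running count of those rows plus 1 is the position of the first 1
  out <- cumsum(value != 1 & cumsum(value) < 1) + 1
  # If no 1 occurs, max(out) overshoots the group size and is mapped to 0
  if (max(out) > length(value)) 0 else max(out)
}

rows_until_first_one(c(0, 0, 1, 1))  # Group 1, var2: 3
rows_until_first_one(c(0, 0))        # Group 2, var2: 0
rows_until_first_one(c(1, 0, 0, 1))  # Group 1, var1: 1
```

For `c(0, 0, 1, 1)` the logical vector is TRUE, TRUE, FALSE, FALSE, so the running count ends at 2 and the result is 3. For `c(0, 0)` the count reaches 2, the result would be 3, and since that exceeds the group size of 2 it becomes 0.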