Conditional rbind
I have two data frames:
df1 <- data.frame(ID = c(1, 1, 2, 3, 3, 4, 5, 6),
                  week = c(20, 23, 10, 15, 20, 40, 10, 12),
                  var1 = rep(1, 8))
df2 <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 5),
                  week = c(18, 19, 22, 8, 9, 14, 9),
                  var1 = rep(0, 7))
I want to combine them under the following conditions:
1. Keep all of df1
2. Only add the rows from df2 where df2$week equals df1$week - 1
The output would look like this:
   ID week var1
1   1   19    0
2   1   20    1
3   1   22    0
4   1   23    1
5   2    9    0
6   2   10    1
7   3   14    0
8   3   15    1
9   5    9    0
10  5   10    1
11  6   12    1
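This is a variant of a previous question, which asked how to keep a row under one condition and the row above it under another. I have already split the data into two data frames, on the assumption that conditionally rbind-ing them might be easier. I tried: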
df3<-rbind.data.frame(ifelse(df2$ID==df1$ID & df2$week==df2$week-1, df1, df2))
But I get an error message:
longer object length is not a multiple of shorter object length.
I feel like this is very close to the output I want, but I'm not very experienced with rbind. Thanks!
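A note on the message above: as far as I can tell it comes from df2$ID == df1$ID, which compares a length-7 vector with a length-8 vector position by position and recycles the shorter one. ifelse() also works element by element, so it cannot pick whole rows out of a data frame. A minimal sketch of the recycling, using the two ID columns as plain vectors (the names a, b, msg and cmp are only for illustration):

```r
a <- c(1, 1, 2, 3, 3, 4, 5, 6)  # same values as df1$ID (length 8)
b <- c(1, 1, 1, 2, 2, 3, 5)     # same values as df2$ID (length 7)

# Capture the warning text instead of letting R print it
msg <- tryCatch(b == a, warning = function(w) conditionMessage(w))

# The comparison itself still runs: b is recycled to length 8,
# so the result lines up with positions, not with matching IDs
cmp <- suppressWarnings(b == a)
length(cmp)  # 8
```

So the comparison pairs up rows by position rather than by matching ID and week, which is why a join or an %in% lookup is a better fit here.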
Complex joins can be specified in a straightforward way using SQL. We assume that you also want the IDs to match in condition 2.
library(sqldf)
sqldf("select b.* from df1 a join df2 b on a.ID = b.ID and b.week = a.week-1
union
select * from df1
order by ID, week")
giving:
   ID week var1
1   1   19    0
2   1   20    1
3   1   22    0
4   1   23    1
5   2    9    0
6   2   10    1
7   3   14    0
8   3   15    1
9   3   20    1
10  4   40    1
11  5    9    0
12  5   10    1
13  6   12    1
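If you would rather not depend on sqldf, the same ID-aware join can be sketched in base R with merge(). This is only a rough equivalent under the same assumption (IDs must match too). The helper names keys, matched and res are mine, and unlike SQL's union, rbind() does not drop duplicate rows (there are none in this data).

```r
df1 <- data.frame(ID = c(1, 1, 2, 3, 3, 4, 5, 6),
                  week = c(20, 23, 10, 15, 20, 40, 10, 12),
                  var1 = rep(1, 8))
df2 <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 5),
                  week = c(18, 19, 22, 8, 9, 14, 9),
                  var1 = rep(0, 7))

# (ID, week) pairs a df2 row must match: same ID, one week before a df1 week
keys <- unique(data.frame(ID = df1$ID, week = df1$week - 1))

# Inner join: keeps only the df2 rows whose (ID, week) pair appears in keys
matched <- merge(df2, keys, by = c("ID", "week"))

# Stack on top of df1, sort, and reset the row names
res <- rbind(df1, matched)
res <- res[order(res$ID, res$week), ]
rownames(res) <- NULL
res
```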
Another approach:
library(tidyverse)
# List the weeks preceding those in df1
df1_wks <- unique(df1$week) - 1
# Combine df1 with the appropriate rows of df2
bind_rows(df1,
          df2 %>% filter(week %in% df1_wks)) %>%
  arrange(ID, week)
Output:
   ID week var1
1   1   19    0
2   1   20    1
3   1   22    0
4   1   23    1
5   2    9    0
6   2   10    1
7   3   14    0
8   3   15    1
9   3   20    1
10  4   40    1
11  5    9    0
12  5   10    1
13  6   12    1
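Note that the filter above matches on week alone, so a df2 row would be kept even if its week preceded a df1 week belonging to a different ID. With the sample data that makes no difference. If the match should also respect ID, one possible variant is a semi_join() on shifted keys. This is a sketch, and keys and res are names I made up for it:

```r
library(dplyr)

df1 <- data.frame(ID = c(1, 1, 2, 3, 3, 4, 5, 6),
                  week = c(20, 23, 10, 15, 20, 40, 10, 12),
                  var1 = rep(1, 8))
df2 <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 5),
                  week = c(18, 19, 22, 8, 9, 14, 9),
                  var1 = rep(0, 7))

# Shift every df1 week back by one, keeping the ID alongside it
keys <- df1 %>%
  mutate(week = week - 1) %>%
  select(ID, week)

# semi_join() keeps the df2 rows that have a matching (ID, week) in keys,
# without adding columns or duplicating rows
res <- bind_rows(df1, semi_join(df2, keys, by = c("ID", "week"))) %>%
  arrange(ID, week)
res
```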
Using base R:
df <- rbind(df1, df2[df2$week %in% (df1$week - 1), ])
df
   ID week var1
1   1   20    1
2   1   23    1
3   2   10    1
4   3   15    1
5   3   20    1
6   4   40    1
7   5   10    1
8   6   12    1
21  1   19    0
31  1   22    0
51  2    9    0
61  3   14    0
71  5    9    0
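The result above keeps df1 first and the df2 rows after it, with de-duplicated row names such as 21 and 31. If you want it interleaved by ID and week like the other answers, a small follow-up with order() should do it. This is a sketch on the same sample data, not a change to the approach:

```r
df1 <- data.frame(ID = c(1, 1, 2, 3, 3, 4, 5, 6),
                  week = c(20, 23, 10, 15, 20, 40, 10, 12),
                  var1 = rep(1, 8))
df2 <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 5),
                  week = c(18, 19, 22, 8, 9, 14, 9),
                  var1 = rep(0, 7))

# Same week-only match as above
df <- rbind(df1, df2[df2$week %in% (df1$week - 1), ])

# Sort by ID, then week, and replace the row names with 1..n
df <- df[order(df$ID, df$week), ]
rownames(df) <- NULL
df
```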