Conditional rbind

I have two data frames:

df1<-data.frame(ID = c(1,1,2,3,3,4,5,6), 
           week = c(20,23,10,15,20,40,10,12), 
           var1 = rep(1, 8))
df2<-data.frame(ID=c(1,1,1,2,2,3,5),
            week = c(18,19,22,8,9,14,9),
            var1= rep(0,7))

I want to combine them under the following conditions:

1. Keep all of df1
2. Only add the rows from df2 where df2$week == df1$week - 1

The output would look like this:

    ID week var1
1   1   19    0
2   1   20    1
3   1   22    0
4   1   23    1
5   2    9    0
6   2   10    1
7   3   14    0
8   3   15    1
9   5    9    0
10  5   10    1
11  6   12    1

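This is a variation of a previous question, which asked how to keep a row under one condition and the row above it under another. I have already subset the data into two data frames, on the assumption that conditionally rbinding them might be easier. I tried: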

df3<-rbind.data.frame(ifelse(df2$ID==df1$ID & df2$week==df2$week-1, df1, df2))

But I get an error message:

longer object length is not a multiple of shorter object length. 

I feel this is very close to my desired output, but I'm not very experienced with rbind. Thanks!
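On the message itself: it most likely comes from comparing columns of different lengths. df1 has 8 rows and df2 has 7, so `==` recycles the shorter vector and compares row i of df2 with row i of df1 instead of searching for a matching row. In addition, `df2$week == df2$week - 1` can never be TRUE, and ifelse() returns a vector or list, not data frame rows. A minimal sketch illustrating the length issue with the data from the question (the name `cmp` is only for illustration):

```r
df1 <- data.frame(ID = c(1,1,2,3,3,4,5,6),
                  week = c(20,23,10,15,20,40,10,12),
                  var1 = rep(1, 8))
df2 <- data.frame(ID = c(1,1,1,2,2,3,5),
                  week = c(18,19,22,8,9,14,9),
                  var1 = rep(0, 7))

# The two ID columns have different lengths, so `==` has to recycle
n1 <- length(df1$ID)
n2 <- length(df2$ID)

# Element-wise comparison: position i of df2 against position i of df1.
# The warning is suppressed here only so the sketch runs quietly.
cmp <- suppressWarnings(df2$ID == df1$ID)

# The result has the length of the longer vector, not one value per df2 row
length(cmp)

# A value is never equal to itself minus one, so this test selects nothing
any(df2$week == df2$week - 1)
```

So the condition needs a lookup (a join or `%in%`) instead of a position-by-position comparison.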

Complex joins can be specified in a straightforward way using SQL. We assume that in condition 2 you also want the IDs to match.

library(sqldf)

sqldf("select b.* from df1 a join df2 b on a.ID = b.ID and b.week = a.week-1 
       union 
       select * from df1 
       order by ID, week")

giving:

   ID week var1
1   1   19    0
2   1   20    1
3   1   22    0
4   1   23    1
5   2    9    0
6   2   10    1
7   3   14    0
8   3   15    1
9   3   20    1
10  4   40    1
11  5    9    0
12  5   10    1
13  6   12    1

Another approach:

library(tidyverse)

# List the weeks preceding those in df1
df1_wks <- unique(df1$week) - 1

# Combine df1 with the appropriate rows of df2
bind_rows(df1, 
          df2 %>% filter(week %in% df1_wks)) %>%
  arrange(ID, week)

Output:

   ID week var1
1   1   19    0
2   1   20    1
3   1   22    0
4   1   23    1
5   2    9    0
6   2   10    1
7   3   14    0
8   3   15    1
9   3   20    1
10  4   40    1
11  5    9    0
12  5   10    1
13  6   12    1
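Note that the filter above matches on week only, so a df2 row could be kept because a different ID in df1 has the following week. With this data the result happens to be the same. If the IDs should match as well, one possible variant is a semi join on both columns. This is a sketch, assuming dplyr is installed; the names `keys` and `res` are only for illustration:

```r
library(dplyr)

df1 <- data.frame(ID = c(1,1,2,3,3,4,5,6),
                  week = c(20,23,10,15,20,40,10,12),
                  var1 = rep(1, 8))
df2 <- data.frame(ID = c(1,1,1,2,2,3,5),
                  week = c(18,19,22,8,9,14,9),
                  var1 = rep(0, 7))

# For each df1 row, the (ID, week) pair that a df2 row would have to match
keys <- df1 %>% transmute(ID, week = week - 1)

# semi_join keeps the df2 rows that have a match in keys,
# without adding columns or duplicating rows
res <- bind_rows(df1, semi_join(df2, keys, by = c("ID", "week"))) %>%
  arrange(ID, week)

res
```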

Using base R:

df <- rbind(df1, df2[df2$week %in% (df1$week - 1), ])
df
   ID week var1
1   1   20    1
2   1   23    1
3   2   10    1
4   3   15    1
5   3   20    1
6   4   40    1
7   5   10    1
8   6   12    1
21  1   19    0
31  1   22    0
51  2    9    0
61  3   14    0
71  5    9    0
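The result above is unsorted, keeps the row names produced by rbind, and matches on week regardless of ID. A possible base R variant that also requires the IDs to match, then sorts and resets the row names, is sketched below. The combined "ID week" keys (`key1`, `key2`) are an assumption made for illustration, not part of the original answer:

```r
df1 <- data.frame(ID = c(1,1,2,3,3,4,5,6),
                  week = c(20,23,10,15,20,40,10,12),
                  var1 = rep(1, 8))
df2 <- data.frame(ID = c(1,1,1,2,2,3,5),
                  week = c(18,19,22,8,9,14,9),
                  var1 = rep(0, 7))

# One "ID week" key per df2 row, and one per df1 row for the week before it
key2 <- paste(df2$ID, df2$week)
key1 <- paste(df1$ID, df1$week - 1)

# Keep all of df1 plus the df2 rows whose key matches a df1 key
df <- rbind(df1, df2[key2 %in% key1, ])

# Sort by ID, then week, and reset the row names to 1..n
df <- df[order(df$ID, df$week), ]
rownames(df) <- NULL

df
```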