Using seq_along() to handle Date in for loop

Here are two example data frames:

df1

ID   First.seen  Last.seen
A10  2015-09-07  2015-09-16
A11  2015-09-07  2015-09-19

df2

ID   First_seen  Last_seen
A1   2015-09-07  0
A10  2015-09-07  0

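I want to fill in df2$Last_seen whenever an ID appears in both data frames. Note that in the real data I have multiple IDs in both data frames. I tried a for loop, but I only get numeric values: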

for (i in 1:nrow(df2)){
  if (df2$ID[i] %in% df1$ID) {
    df2$Last_seen[i] <- df1$Last.seen[df1$ID == df2$ID[i]]
  }else{
    df2$Last_seen[i] <- 0
  }
}

I found an answer to the same problem that uses seq_along, but when I apply that code the result is df1$Last_seen[i] == 1:

for (i in seq_along(1:nrow(df2))){
  if (df2$ID[i] %in% df1$ID) {
    df2$Last_seen[i] <- df1$Last.seen[df1$ID == df2$ID[i]]
  }else{
    df2$Last_seen[i] <- 0
  }
}

Any suggestions on how to use it correctly?

You don't need a loop for this. You need to join the tables on ID, which can be done with dplyr:

df1 <- read.table(text="ID      First.seen  Last.seen
A10   2015-09-07  2015-09-16
A11   2015-09-07  2015-09-19",header=TRUE, stringsAsFactors=FALSE)

df2<- read.table(text="ID      First_seen  Last_seen
 A1      2015-09-07  0
A10      2015-09-07  0",header=TRUE, stringsAsFactors=FALSE)

library(dplyr)
left_join(df2,df1)
   ID First_seen Last_seen First.seen  Last.seen
1  A1 2015-09-07         0       <NA>       <NA>
2 A10 2015-09-07         0 2015-09-07 2015-09-16

If you want a three-column table:

left_join(df2,df1, by=c("ID" = "ID","First_seen"="First.seen")) %>%
mutate(Last_seen=ifelse(is.na(Last.seen),Last_seen,Last.seen)) %>%
select(-Last.seen)

   ID First_seen  Last_seen
1  A1 2015-09-07          0
2 A10 2015-09-07 2015-09-16
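
If you would rather not load dplyr, the same fill can be sketched in base R with match(). This is only a minimal sketch under some assumptions: it matches on ID alone (not on First_seen as the join above does), it takes the first matching row if an ID occurs more than once in df1, and the helper names idx, found and filled are made up for illustration.

```r
# Same example data as above, rebuilt so the block runs on its own.
df1 <- read.table(text = "ID First.seen Last.seen
A10 2015-09-07 2015-09-16
A11 2015-09-07 2015-09-19", header = TRUE, stringsAsFactors = FALSE)

df2 <- read.table(text = "ID First_seen Last_seen
A1 2015-09-07 0
A10 2015-09-07 0", header = TRUE, stringsAsFactors = FALSE)

# For each row of df2, the position of the same ID in df1 (NA if absent).
idx <- match(df2$ID, df1$ID)
found <- !is.na(idx)

# Work on character values so the dates are not coerced to numbers.
filled <- as.character(df2$Last_seen)
filled[found] <- df1$Last.seen[idx[found]]
df2$Last_seen <- filled
df2
```

Rows of df2 whose ID is not in df1 keep their original "0".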

Edit: to change the entries where Last_seen is still 0 (here to First_seen plus 16 days), you can add another ifelse:

left_join(df2,df1, by=c("ID" = "ID","First_seen"="First.seen")) %>%
mutate(Last_seen=ifelse(is.na(Last.seen),Last_seen,Last.seen),
       Last_seen=ifelse(Last_seen==0,format(as.Date(First_seen)+16,"%Y-%m-%d"),Last.seen))%>%
select(-Last.seen)

   ID First_seen  Last_seen
1  A1 2015-09-07 2015-09-23
2 A10 2015-09-07 2015-09-16
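
The format() call is there because ifelse() drops the Date class and would otherwise return plain numbers. If the columns are kept as real Date values, the same 16-day fallback can be sketched with logical indexing instead. This is just an illustration on two stand-alone vectors, assuming a missing Last_seen is stored as NA rather than 0; first_seen, last_seen and missing are made-up names.

```r
# Stand-alone vectors mirroring the two rows of the example.
first_seen <- as.Date(c("2015-09-07", "2015-09-07"))
last_seen  <- as.Date(c(NA, "2015-09-16"))  # NA stands in for the 0 placeholder

# Replace only the missing entries; Date arithmetic keeps the Date class.
missing <- is.na(last_seen)
last_seen[missing] <- first_seen[missing] + 16

format(last_seen)
```

Because the assignment goes into a vector that is already a Date, the class survives and no origin or format juggling is needed.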

EDIT2

left_join(df2,df1, by=c("ID" = "ID","First_seen"="First.seen")) %>%
mutate(Last_seen=ifelse(is.na(Last.seen),Last_seen,Last.seen),
       Last_seen=ifelse(Last_seen==0,format(as.Date(First_seen)+16,"%Y-%m-%d",origin = "1900-01-01"),Last.seen))%>%
select(-Last.seen)

   ID First_seen  Last_seen
1  A1 2015-09-07 2015-09-23
2 A10 2015-09-07 2015-09-16
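
As for why the original loop only returned numbers: this is my guess, since the column types of the real data are not shown. df2$Last_seen starts out as a numeric column of zeros, and assigning a single Date or factor element into a numeric vector keeps the vector numeric, so only the underlying number is stored. A small sketch of both cases, using made-up vectors:

```r
# Case 1: a Date assigned into a numeric vector loses its class.
last_seen <- c(0, 0)
last_seen[2] <- as.Date("2015-09-16")
class(last_seen)   # still "numeric": the value is now a day count

# The day count can be turned back into a Date with an explicit origin.
as.Date(last_seen[2], origin = "1970-01-01")

# Case 2: a factor assigned into a numeric vector stores its integer level code.
f <- factor(c("2015-09-16", "2015-09-19"))
codes <- c(0, 0)
codes[1] <- f[1]
codes              # the first element is the level code, not the date text
```

The second case would explain a result of 1 if Last.seen was read in as a factor, which was the default for read.table before R 4.0. Using seq_along() instead of 1:nrow() does not change any of this, because both only produce the loop indices.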