Replacing the first date instance if a condition is satisfied

I have snapshot data like this in a data frame:

zz <- "id  created  snap    stage
ALPHA   2012-09-07  2014-01-02  A
ALPHA   2012-09-07  2014-10-01  End
BETA    2012-08-26  2014-01-04  B
BETA    2012-08-26  2014-06-19  C
BETA    2012-08-26  2014-11-21  End
GAMMA   2014-01-04  2014-01-04  A
GAMMA   2014-01-04  2014-03-07  B
GAMMA   2014-01-04  2014-03-28  C
GAMMA   2014-01-04  2014-03-29  End
DELTA   2014-07-14  2014-07-15  A
DELTA   2014-07-14  2014-09-26  C
DELTA   2014-07-14  2015-02-06  End"
df <- read.table(text = zz, header = TRUE)

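Whenever the created date is earlier than 2014-01-01, I need to replace the snap date with the created date. However, I only want to replace the snap date of the first observation for each id. Although each id moves through the stages in the order A-B-C-End, an id does not have to start at A.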

For example, this is the output I am looking for:

id  created snap    stage
ALPHA   2012-09-07  2012-09-07  A
ALPHA   2012-09-07  2014-10-01  End
BETA    2012-08-26  2012-08-26  B
BETA    2012-08-26  2014-06-19  C
BETA    2012-08-26  2014-11-21  End
GAMMA   2014-01-04  2014-01-04  A
GAMMA   2014-01-04  2014-03-07  B
GAMMA   2014-01-04  2014-03-28  C
GAMMA   2014-01-04  2014-03-29  End
DELTA   2014-07-14  2014-07-15  A
DELTA   2014-07-14  2014-09-26  C
DELTA   2014-07-14  2015-02-06  End

Note that GAMMA and DELTA are unchanged, but ALPHA had its snap date replaced at stage A, and so did BETA at stage B.

Try this:

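library(data.table)
# Per id: if created predates the cutoff, swap the first snap for created;
# otherwise keep snap as it is
setDT(df)[, snap := if (created[1L] < as.Date('2014-01-01'))
                      c(created[1L], snap[-1L]) else snap, by = id]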

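I am assuming snap and created are Date columns. If they are not, you can convert them first, once df is a data.table (for example after setDT(df)):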

cols = c("snap", "created")
df[, (cols) := lapply(.SD, as.Date), .SDcols=cols]
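
For comparison, the same rule can be sketched in base R without any packages. This is only a minimal illustration, not part of the answer above: the names cutoff, first_row and fix are made up for the example, and it assumes the rows of each id are already sorted so that the first row is the one to change (as in the sample data).

```r
# Self-contained copy of the question's data, read as character columns
zz <- "id  created  snap    stage
ALPHA   2012-09-07  2014-01-02  A
ALPHA   2012-09-07  2014-10-01  End
BETA    2012-08-26  2014-01-04  B
BETA    2012-08-26  2014-06-19  C
BETA    2012-08-26  2014-11-21  End
GAMMA   2014-01-04  2014-01-04  A
GAMMA   2014-01-04  2014-03-07  B
GAMMA   2014-01-04  2014-03-28  C
GAMMA   2014-01-04  2014-03-29  End
DELTA   2014-07-14  2014-07-15  A
DELTA   2014-07-14  2014-09-26  C
DELTA   2014-07-14  2015-02-06  End"
df <- read.table(text = zz, header = TRUE, stringsAsFactors = FALSE)

# Convert both date columns so comparisons work on real dates
df$created <- as.Date(df$created)
df$snap    <- as.Date(df$snap)

cutoff    <- as.Date("2014-01-01")   # hypothetical name for the threshold
first_row <- !duplicated(df$id)      # TRUE on the first row of each id
fix       <- first_row & df$created < cutoff

# Overwrite snap only on the flagged rows
df$snap[fix] <- df$created[fix]
```

Because duplicated() marks every repeat of an id after its first appearance, negating it picks out exactly one row per id, so no explicit grouping is needed.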

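Here is a dplyr approach. I start with "mutate_each" to make sure both "created" and "snap" are properly formatted as dates. We then group the data by "id" and finally use "mutate" with "replace" to make the necessary change to the "snap" column (we check that created is before the cutoff date and that row_number is 1, i.e. the first row in that id group):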

library(dplyr)
df %>% 
  mutate_each(funs(as.Date(.)), created, snap) %>%
  group_by(id) %>%
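  # change only the first row of each id, and only when created predates the cutoff;
  # created[1] keeps the replacement value the same length as the index
  mutate(snap = replace(snap, which(created < as.Date("2014-01-01") & row_number() == 1), created[1]))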

#Source: local data frame [12 x 4]
#Groups: id
#
#      id    created       snap stage
#1  ALPHA 2012-09-07 2012-09-07     A
#2  ALPHA 2012-09-07 2014-10-01   End
#3   BETA 2012-08-26 2012-08-26     B
#4   BETA 2012-08-26 2014-06-19     C
#5   BETA 2012-08-26 2014-11-21   End
#6  GAMMA 2014-01-04 2014-01-04     A
#7  GAMMA 2014-01-04 2014-03-07     B
#8  GAMMA 2014-01-04 2014-03-28     C
#9  GAMMA 2014-01-04 2014-03-29   End
#10 DELTA 2014-07-14 2014-07-15     A
#11 DELTA 2014-07-14 2014-09-26     C
#12 DELTA 2014-07-14 2015-02-06   End
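
In more recent dplyr releases, mutate_each() and funs() are deprecated, so the pipeline above may warn or fail there. A rough equivalent, assuming dplyr 1.0.0 or later, could look like the sketch below. It uses across() for the date conversion and if_else() in place of replace(); the small data frame is just a subset of the question's data so the example runs on its own, and the name res is illustrative.

```r
library(dplyr)

# Subset of the question's data, with dates still stored as text
df <- data.frame(
  id      = c("ALPHA", "ALPHA", "GAMMA", "GAMMA"),
  created = c("2012-09-07", "2012-09-07", "2014-01-04", "2014-01-04"),
  snap    = c("2014-01-02", "2014-10-01", "2014-01-04", "2014-03-07"),
  stage   = c("A", "End", "A", "B"),
  stringsAsFactors = FALSE
)

res <- df %>%
  # convert both columns to Date in one step
  mutate(across(c(created, snap), as.Date)) %>%
  group_by(id) %>%
  # first row of each id takes created when created predates the cutoff
  mutate(snap = if_else(row_number() == 1 & created < as.Date("2014-01-01"),
                        created, snap)) %>%
  ungroup()
```

if_else() requires both branches to have the same type, which holds here because created and snap are both Date columns after the conversion.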