Sequential increase in column value based on a condition in R

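I have an R data frame with an ID column that holds multiple records per ID. Once the Flag for an ID is set to 1, I want to create a new timepoint column that starts at 1 and then increases sequentially in increments of 6 (1, 6, 12, ...). How can I achieve this in R using dplyr?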

Below is a sample data frame

ID Timepoint Flag
A 0 0
A 6 0
A 12 0
A 18 1
A 24 0
A 30 0
A 36 0

Expected data frame

ID Timepoint Flag New_Timepoint
A 0 0
A 6 0
A 12 0
A 18 1 1
A 24 0 6
A 30 0 12
A 36 0 18

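One option is to group by 'ID' and take the lag of 'Timepoint' (with its 0 replaced by 1 so the new series starts at 1), setting n to the position where 'Flag' is 1, minus 1. Since n must be a single number, this assumes exactly one flagged row per ID.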

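library(dplyr)

df1 %>%
   group_by(ID) %>%
   # replace the 0 time point with 1, then shift the values down
   # by the number of rows that precede the flagged row
   mutate(New_Timepoint = dplyr::lag(replace(Timepoint, Timepoint == 0, 1),
                                     n = which(Flag == 1) - 1)) %>%
   ungroup()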

-output

# A tibble: 7 x 4
#  ID    Timepoint  Flag New_Timepoint
#  <chr>     <int> <int>         <dbl>
#1 A             0     0            NA
#2 A             6     0            NA
#3 A            12     0            NA
#4 A            18     1             1
#5 A            24     0             6
#6 A            30     0            12
#7 A            36     0            18
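
To see why the lag produces this result, the intermediate vectors can be inspected one at a time. The following is a minimal base R sketch on the values of a single ID; the variable names (`shifted_source`, `n_shift`, `new_timepoint`) are illustrative and not part of the answer above, and the manual padding with `NA` only imitates what `dplyr::lag` does.

```r
# Values for a single ID, copied from the sample data
timepoint <- c(0L, 6L, 12L, 18L, 24L, 30L, 36L)
flag      <- c(0L, 0L, 0L, 1L, 0L, 0L, 0L)

# Step 1: turn the leading 0 into 1 so the new series starts at 1
shifted_source <- replace(timepoint, timepoint == 0, 1)

# Step 2: number of rows that come before the flagged row
n_shift <- which(flag == 1) - 1

# Step 3: shift the values down by n_shift rows, padding the top with NA
new_timepoint <- c(rep(NA, n_shift),
                   head(shifted_source, length(shifted_source) - n_shift))

print(new_timepoint)
```

The flagged row lines up with the first element of `shifted_source`, which is why the replacement of 0 by 1 is what makes the new series begin at 1.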

Or use a double cumsum to create the index

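df1 %>%
   group_by(ID) %>%
   # cumsum(cumsum(Flag)) counts 1, 2, 3, ... from the flagged row onward;
   # na_if() turns the 0s before the flag into NA.
   # Timepoint[1] is 0 in the sample data, so 0 is replaced by 1
   # before indexing to match the expected output.
   mutate(New_Timepoint = replace(Timepoint, Timepoint == 0, 1)[
              na_if(cumsum(cumsum(Flag)), 0)]) %>%
   ungroup()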
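
The index built by the double cumsum can be checked step by step. This is a rough base R sketch on one ID's values, with illustrative variable names and a plain assignment standing in for `dplyr::na_if`. Note that the first indexed element is `Timepoint[1]`, which is 0 in the sample data, so the sketch replaces 0 with 1 before indexing in order to match the expected output.

```r
# Values for a single ID, copied from the sample data
timepoint <- c(0L, 6L, 12L, 18L, 24L, 30L, 36L)
flag      <- c(0L, 0L, 0L, 1L, 0L, 0L, 0L)

# First cumsum: 0 before the flag, 1 from the flagged row onward
after_flag <- cumsum(flag)

# Second cumsum: a running counter 1, 2, 3, ... starting at the flagged row
idx <- cumsum(after_flag)

# Rows before the flag get NA instead of 0 (stand-in for dplyr::na_if)
idx[idx == 0] <- NA

# Index into the time points, with the leading 0 replaced by 1
source_values <- replace(timepoint, timepoint == 0, 1)
new_timepoint <- source_values[idx]

print(idx)
print(new_timepoint)
```

Indexing with `NA` returns `NA`, which is what leaves the rows before the flag empty.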

data

df1 <- structure(list(
    ID = c("A", "A", "A", "A", "A", "A", "A"),
    Timepoint = c(0L, 6L, 12L, 18L, 24L, 30L, 36L),
    Flag = c(0L, 0L, 0L, 1L, 0L, 0L, 0L)),
    class = "data.frame", row.names = c(NA, -7L))

Another dplyr option

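df1 %>%
  group_by(ID) %>%
  # subtract the flagged time point from every later row; rows before the
  # flag look up NA, and pmax() lifts the 0 at the flagged row to 1
  mutate(New_Timepoint = pmax(1, Timepoint - c(NA, Timepoint[Flag == 1])[cumsum(Flag) + 1])) %>%
  ungroup()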

gives

  ID    Timepoint  Flag New_Timepoint
  <chr>     <int> <int>         <dbl>
1 A             0     0            NA
2 A             6     0            NA
3 A            12     0            NA
4 A            18     1             1
5 A            24     0             6
6 A            30     0            12
7 A            36     0            18
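
The lookup inside this option can be unpacked in the same way. Below is a small base R sketch on one ID's values; `baseline` and `elapsed` are hypothetical names introduced only for illustration, and the sketch assumes a single flagged row per ID, as in the sample data.

```r
# Values for a single ID, copied from the sample data
timepoint <- c(0L, 6L, 12L, 18L, 24L, 30L, 36L)
flag      <- c(0L, 0L, 0L, 1L, 0L, 0L, 0L)

# Lookup vector: position 1 holds NA, position 2 holds the flagged time point.
# cumsum(flag) + 1 picks position 1 before the flag and position 2 afterwards.
baseline <- c(NA, timepoint[flag == 1])[cumsum(flag) + 1]

# Time elapsed since the flagged row (NA before the flag, 0 at the flag)
elapsed <- timepoint - baseline

# pmax() raises the 0 at the flagged row to 1 and keeps NA where elapsed is NA
new_timepoint <- pmax(1, elapsed)

print(baseline)
print(new_timepoint)
```

Unlike the lag-based option, this one works from differences between time points rather than row positions, so it does not rely on the first time point being 0.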