Relative row index based on a variable value in R
I am wondering whether there is an efficient and concise data.table way to solve the following problem.
Suppose I have the following data.table:
library(data.table)
DT <- data.table(store = c("A", "A", "A", "A", "B", "B", "B", "B"),
time = c(1,2,3,4,1,2,3,4),
treat_time = c(0,0,1,0, 0,1,0,0))
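Here, treat_time marks the time period in which a store receives treatment. Note that stores A and B are treated at different times. I would like to create a column time_rel that gives each time period relative to the period where treat_time == 1, with the treated period itself numbered 1. That is, the data.table should look like this: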
DT_outcome <- data.table(store = c("A", "A", "A", "A", "B", "B", "B", "B"),
time = c(1,2,3,4,1,2,3,4),
treat_time = c(0,0,1,0, 0,1,0,0),
time_rel = c(-1,0,1,2, 0,1,2,3))
    store  time treat_time time_rel
   <char> <num>      <num>    <num>
1:      A     1          0       -1
2:      A     2          0        0
3:      A     3          1        1
4:      A     4          0        2
5:      B     1          0        0
6:      B     2          1        1
7:      B     3          0        2
8:      B     4          0        3
Thanks!
Within each store, we can subtract the position of the treated row from the row sequence and add 1:
library(data.table)
# Row position within each store, minus the position of the treated row, plus 1
DT[, time_rel := seq_len(.N) - seq_len(.N)[treat_time == 1] + 1, by = store]
Output:
> DT
    store  time treat_time time_rel
   <char> <num>      <num>    <num>
1:      A     1          0       -1
2:      A     2          0        0
3:      A     3          1        1
4:      A     4          0        2
5:      B     1          0        0
6:      B     2          1        1
7:      B     3          0        2
8:      B     4          0        3
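The row-position approach above relies on the rows being sorted by time within each store. A minimal sketch of a variant that uses the time column itself instead of the row position, so it does not depend on row order, might look like this. It assumes exactly one row with treat_time == 1 per store; the names DT2 and expected are introduced here only for illustration.

```r
library(data.table)

# Same data as in the question, under a separate (illustrative) name
DT2 <- data.table(store = c("A", "A", "A", "A", "B", "B", "B", "B"),
                  time = c(1, 2, 3, 4, 1, 2, 3, 4),
                  treat_time = c(0, 0, 1, 0, 0, 1, 0, 0))

# Subtract the treated period's time from every time in the store.
# The + 1 follows the question's convention (treated period = 1).
# Assumption: exactly one row with treat_time == 1 per store.
DT2[, time_rel := time - time[treat_time == 1] + 1, by = store]

# Check against the time_rel column requested in the question
expected <- c(-1, 0, 1, 2, 0, 1, 2, 3)
stopifnot(isTRUE(all.equal(DT2$time_rel, expected)))
```

If time has gaps or is not a simple 1, 2, 3, ... sequence, this variant and the row-position version can give different answers, so the choice depends on which notion of "relative period" is wanted.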
Or the same logic in dplyr:
library(dplyr)
DT %>%
  group_by(store) %>%
  mutate(time_rel = row_number() - which(treat_time == 1) + 1) %>%
  ungroup()
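Both versions assume that every store has a treated period. If some store is never treated, which(treat_time == 1) has length zero for that group and the assignment is likely to fail. A hedged sketch of a more defensive variant takes the first match with [1], which yields NA when there is none. The data frame df and its untreated store C are hypothetical, made up for this example.

```r
library(dplyr)

# Hypothetical data: store C never receives treatment
df <- data.frame(store = c("A", "A", "A", "B", "B", "C", "C"),
                 time = c(1, 2, 3, 1, 2, 1, 2),
                 treat_time = c(0, 1, 0, 1, 0, 0, 0))

res <- df %>%
  group_by(store) %>%
  # which(...)[1] is the position of the first treated row in the group,
  # or NA if the group has no treated row, so time_rel is NA for that store
  mutate(time_rel = row_number() - which(treat_time == 1)[1] + 1) %>%
  ungroup()

res
```

Taking the first match also means that a store with several treated rows is measured relative to its earliest one, which may or may not be the intended behaviour.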