Conditional rolling difference or gradient based on a column in an R data.table
I need to compute the gradient of one column based on the condition value of another column in a very large data.table.
> require(data.table)
> DT = data.table( ID = c(rep('A', 8), rep('B', 6)),
Condition = c(0,1,0,0,1,1,0,1,0,0,1,0,0,1),
Value = c(4,3,2,1,4,3,2,1,4,3,2,1,4,3))
I want a rolling gradient of the 'Value' column by ID, computed only for rows where Condition == 1.
> desired_output
ID Condition Value Gradient
1: A 0 4 NA # condition isn't met so no gradient
2: A 1 3 0 # condition is met but there is no predecessor. Gradient set to 0
3: A 0 2 NA # condition isn't met so no gradient
4: A 0 1 NA # condition isn't met so no gradient
5: A 1 4 1 # condition is met and gradient is 4-3=1
6: A 1 3 -1 # condition is met and gradient is 3-4=-1
7: A 0 2 NA # condition isn't met so no gradient
8: A 1 1 -2 # condition is met and gradient is 1-3=-2
9: B 0 4 NA
10: B 0 3 NA
11: B 1 2 0
12: B 0 1 NA
13: B 0 4 NA
14: B 1 3 1
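I would prefer a native data.table solution if possible.
Note: this can be done by subsetting DT[Condition == 1] and then joining the result back. I would like to avoid the subset and rejoin if possible.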
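For reference, the subset-and-rejoin baseline mentioned in the note could look roughly like the sketch below. The helper column `row_id` and the intermediate table `sub` are hypothetical names introduced here for illustration; they are not part of the original question.

```r
library(data.table)

DT <- data.table(ID = c(rep("A", 8), rep("B", 6)),
                 Condition = c(0,1,0,0,1,1,0,1,0,0,1,0,0,1),
                 Value = c(4,3,2,1,4,3,2,1,4,3,2,1,4,3))

# Hypothetical helper column: remember each row's original position
DT[, row_id := .I]

# Subset to the rows that meet the condition and take the per-ID difference;
# filling with the first value makes the first qualifying row's gradient 0
sub <- DT[Condition == 1]
sub[, Gradient := Value - shift(Value, fill = Value[1L]), by = ID]

# Update join: copy Gradient back onto the full table, matching on row_id
DT[sub, on = "row_id", Gradient := i.Gradient]
DT[, row_id := NULL]
```

This materializes a copy of the qualifying rows and needs a join key, which is the overhead the question hopes to avoid on a very large table.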
library(data.table)
library(magrittr)
dt = data.table( ID = c(rep('A', 8), rep('B', 6)),
Condition = c(0,1,0,0,1,1,0,1,0,0,1,0,0,1),
Value = c(4,3,2,1,4,3,2,1,4,3,2,1,4,3))
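# 1: difference from the previous Condition == 1 row within each ID;
#    fill = first(Value) makes the first qualifying row's gradient 0
dt[Condition == 1, Gradient := Value - shift(Value, fill = first(Value)), by = ID][]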
#> ID Condition Value Gradient
#> 1: A 0 4 NA
#> 2: A 1 3 0
#> 3: A 0 2 NA
#> 4: A 0 1 NA
#> 5: A 1 4 1
#> 6: A 1 3 -1
#> 7: A 0 2 NA
#> 8: A 1 1 -2
#> 9: B 0 4 NA
#> 10: B 0 3 NA
#> 11: B 1 2 0
#> 12: B 0 1 NA
#> 13: B 0 4 NA
#> 14: B 1 3 1
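The one-liner above works because the `i` expression restricts `:=` to the rows where Condition == 1, so within each ID `shift()` only ever sees qualifying values, and the remaining rows are left as NA. As a rough cross-check, the same result can probably be obtained with base R `diff()`; the column name `Gradient_diff` is a hypothetical name used only in this sketch.

```r
library(data.table)

dt <- data.table(ID = c(rep("A", 8), rep("B", 6)),
                 Condition = c(0,1,0,0,1,1,0,1,0,0,1,0,0,1),
                 Value = c(4,3,2,1,4,3,2,1,4,3,2,1,4,3))

# The shift()-based one-liner
dt[Condition == 1, Gradient := Value - shift(Value, fill = first(Value)), by = ID]

# Variant: diff() gives successive differences, and the leading 0 stands in
# for the first qualifying row of each ID, which has no predecessor
dt[Condition == 1, Gradient_diff := c(0, diff(Value)), by = ID]

# Compare the two columns, NA rows included
all.equal(dt$Gradient, dt$Gradient_diff)
```

Both forms assign by reference and touch only the qualifying rows, so neither needs a separate subset or a join.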
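# 2: a second reading, where every row gets its difference from the most
#    recent earlier Condition == 1 value within the same ID
# grad is the expected result for this reading, typed in by hand for comparison
dt$grad <- c(NA, NA, -1, -2, 1, -1, -1, -2, NA, NA, NA, -1, 2, 1)
# carry the last qualifying value forward, then lag it by one row, per ID
dt[Condition == 1, Value2 := Value] %>%
  .[, Value2 := shift(nafill(Value2, "locf")), by = ID] %>%
  .[, Gradient2 := Value - Value2] %>%
  .[, Value2 := NULL] %>%
  .[]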
#> ID Condition Value Gradient grad Gradient2
#> 1: A 0 4 NA NA NA
#> 2: A 1 3 0 NA NA
#> 3: A 0 2 NA -1 -1
#> 4: A 0 1 NA -2 -2
#> 5: A 1 4 1 1 1
#> 6: A 1 3 -1 -1 -1
#> 7: A 0 2 NA -1 -1
#> 8: A 1 1 -2 -2 -2
#> 9: B 0 4 NA NA NA
#> 10: B 0 3 NA NA NA
#> 11: B 1 2 0 NA NA
#> 12: B 0 1 NA -1 -1
#> 13: B 0 4 NA 2 2
#> 14: B 1 3 1 1 1
Created on 2021-06-04 by the reprex package (v2.0.0)