How to embed an AND operator inside a data.table function?
Suppose we start with the data frame below, generated by the code that follows it:
> data1
ID Period Values_1 Values_2 State
1 1 1 5 5 X0
2 1 2 0 2 X1
3 1 3 0 0 X2
4 1 4 0 12 X1
5 2 1 1 2 X0
6 2 2 -1 0 X2
7 2 3 0 1 X0
8 2 4 0 0 X0
9 3 1 0 0 X2
10 3 2 0 0 X1
11 3 3 0 0 X9
12 3 4 0 2 X3
13 4 1 1 4 X2
14 4 2 2 5 X1
15 4 3 3 6 X9
16 4 4 0 0 X3
data1 <-
data.frame(
ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4),
Period = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
Values_1 = c(5, 0, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0),
Values_2 = c(5, 2, 0, 12, 2, 0, 1, 0, 0, 0, 0, 2, 4, 5, 6, 0),
State = c("X0","X1","X2","X1","X0","X2","X0","X0", "X2","X1","X9","X3", "X2","X1","X9","X3")
)
I have been using this data.table code to flag each ID in State1 once it no longer generates values in its future periods:
setDT(data1)[, State1 := ifelse(rev(cumsum(rev(Values_1 + Values_2))), State, "END"), ID]
The code above gives these results:
> data1
ID Period Values_1 Values_2 State State1
1: 1 1 5 5 X0 X0
2: 1 2 0 2 X1 X1
3: 1 3 0 0 X2 X2
4: 1 4 0 12 X1 X1
5: 2 1 1 2 X0 X0
6: 2 2 -1 0 X2 END
7: 2 3 0 1 X0 X0
8: 2 4 0 0 X0 END
9: 3 1 0 0 X2 X2
10: 3 2 0 0 X1 X1
11: 3 3 0 0 X9 X9
12: 3 4 0 2 X3 X3
13: 4 1 1 4 X2 X2
14: 4 2 2 5 X1 X1
15: 4 3 3 6 X9 X9
16: 4 4 0 0 X3 END
whereas I would like these results for ID = 2:
> data1
ID Period Values_1 Values_2 State State1
1: 1 1 5 5 X0 X0
2: 1 2 0 2 X1 X1
3: 1 3 0 0 X2 X2
4: 1 4 0 12 X1 X1
5: 2 1 1 2 X0 X0
6: 2 2 -1 0 X2 X2
7: 2 3 0 1 X0 X0
8: 2 4 0 0 X0 END
9: 3 1 0 0 X2 X2
10: 3 2 0 0 X1 X1
11: 3 3 0 0 X9 X9
12: 3 4 0 2 X3 X3
13: 4 1 1 4 X2 X2
14: 4 2 2 5 X1 X1
15: 4 3 3 6 X9 X9
16: 4 4 0 0 X3 END
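To get there, I need to change the data.table code to something like the line below (which does not work): if, for a given ID, the future-period values of both Values_1 AND Values_2 (each evaluated separately) are 0, then State1 for that ID should be flagged as END for all of those future periods. How can this be done in data.table?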
setDT(data1)[, State1 := ifelse(rev(cumsum(rev(Values_1))) & rev(cumsum(rev(Values_2))), State, "END"), ID]
This question follows on from a related, linked post.
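The answer in the linked post seems to assume that Values_1 and Values_2 are non-negative. If there are negative numbers, wrap each column in abs inside the data.table expression or, as in the line below, combine the columns with |, which is TRUE whenever either value is non-zero: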
setDT(data1)[, State1 := ifelse(rev(cumsum(rev(Values_1 | Values_2))), State, "END"), ID]
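To see why the original sum and the attempted & version both go wrong for ID = 2, it may help to run the reverse cumulative sum by hand on that ID's four rows. The following is a minimal base R sketch; the helper name rev_cumsum and the vectors v1 and v2 are illustrative only and do not appear in the original code.

```r
# Values for ID 2 from the example data, ordered by Period 1..4
v1 <- c(1, -1, 0, 0)   # Values_1
v2 <- c(2,  0, 1, 0)   # Values_2

# Reverse cumulative sum: element i is the sum over periods i, i+1, ..., n
# (illustrative helper, not part of the original answer)
rev_cumsum <- function(x) rev(cumsum(rev(x)))

# Original approach: the -1 and the later +1 cancel, so period 2 looks "finished"
rev_cumsum(v1 + v2)                 # 3 0 1 0

# Attempted fix with &: both tails must be non-zero, which is too strict
rev_cumsum(v1) & rev_cumsum(v2)     # FALSE TRUE FALSE FALSE

# Logical OR: counts the periods from i onward that have any non-zero value
rev_cumsum(v1 | v2)                 # 3 2 1 0

# Same idea using absolute values, so negatives cannot cancel positives
rev_cumsum(abs(v1) + abs(v2))       # 5 2 1 0
```

Only the last two variants are zero exactly where every remaining period is zero in both columns, which is the condition the question asks for.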
Maybe something like this:
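# Mark s as "END" while the running total of absolute values is still zero
f <- function(v1, v2, s) {
  s[cumsum(abs(v1) + abs(v2)) == 0] <- "END"
  s
}
# Process each ID from its last period backwards, so the running total covers future periods
setDT(data1)[order(-Period), State1 := f(Values_1, Values_2, State), by = ID]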
Output:
ID Period Values_1 Values_2 State State1
1: 1 1 5 5 X0 X0
2: 1 2 0 2 X1 X1
3: 1 3 0 0 X2 X2
4: 1 4 0 12 X1 X1
5: 2 1 1 2 X0 X0
6: 2 2 -1 0 X2 X2
7: 2 3 0 1 X0 X0
8: 2 4 0 0 X0 END
9: 3 1 0 0 X2 X2
10: 3 2 0 0 X1 X1
11: 3 3 0 0 X9 X9
12: 3 4 0 2 X3 X3
13: 4 1 1 4 X2 X2
14: 4 2 2 5 X1 X1
15: 4 3 3 6 X9 X9
16: 4 4 0 0 X3 END
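As a quick sanity check, the two approaches in this thread can be run side by side on the example data. This is only a sketch: the result columns A and B are hypothetical names chosen for the comparison, and the data is rebuilt with data.table() and rep() instead of the original data.frame() call.

```r
library(data.table)

# Example data from the question, rebuilt directly as a data.table
data1 <- data.table(
  ID       = rep(1:4, each = 4),
  Period   = rep(1:4, times = 4),
  Values_1 = c(5, 0, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0),
  Values_2 = c(5, 2, 0, 12, 2, 0, 1, 0, 0, 0, 0, 2, 4, 5, 6, 0),
  State    = c("X0", "X1", "X2", "X1", "X0", "X2", "X0", "X0",
               "X2", "X1", "X9", "X3", "X2", "X1", "X9", "X3")
)

# Approach 1: reverse cumulative count of periods with any non-zero value
data1[, A := ifelse(rev(cumsum(rev(Values_1 | Values_2))), State, "END"), by = ID]

# Approach 2: walk each ID from its last period to its first
f <- function(v1, v2, s) {
  s[cumsum(abs(v1) + abs(v2)) == 0] <- "END"
  s
}
data1[order(-Period), B := f(Values_1, Values_2, State), by = ID]

identical(data1$A, data1$B)   # TRUE on this data
data1[ID == 2, A]             # "X0" "X2" "X0" "END"
```

Both columns agree on this data set; whether one reads better than the other is mostly a matter of taste.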