Create a sequence in data.table based on the levels of a factor in R
I have a data.table as follows:
library(data.table)

DT <- structure(list(Seq = c(1, 2, 3, 5, 7, 9, 11, 15, 23, 67, 1, 3,
4, 5, 9, 3, 4, 6), Tpe = c("U", "U", "U", "U", "U", "U", "U",
"U", "U", "U", "Y", "Y", "Y", "Y", "Y", "D", "D", "D")), .Names = c("Seq",
"Tpe"), row.names = c(NA, 18L), class = "data.frame")
# Keying on Tpe and Seq sorts the rows by Tpe, then by Seq
DT <- data.table(DT, key = c("Tpe", "Seq"))
DT[, Tpe := as.factor(Tpe)]
DT
    Seq Tpe
 1:   3   D
 2:   4   D
 3:   6   D
 4:   1   U
 5:   2   U
 6:   3   U
 7:   5   U
 8:   7   U
 9:   9   U
10:  11   U
11:  15   U
12:  23   U
13:  67   U
14:   1   Y
15:   3   Y
16:   4   Y
17:   5   Y
18:   9   Y
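I can renumber Seq when all levels of Tpe are considered together. But I want to renumber Seq independently within each level of Tpe, ignoring the missing values.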
DT[,Seq:= as.factor(Seq)]
DT[,Seq:= interaction(DT$Seq, DT$Tpe, drop=TRUE)]
setattr(DT$Seq,"levels", seq(from = 1, to = length(levels(DT$Seq))))
DT
    Seq Tpe
 1:   1   D
 2:   2   D
 3:   3   D
 4:   4   U
 5:   5   U
 6:   6   U
 7:   7   U
 8:   8   U
 9:   9   U
10:  10   U
11:  11   U
12:  12   U
13:  13   U
14:  14   Y
15:  15   Y
16:  16   Y
17:  17   Y
18:  18   Y
The desired output is as follows.
out <- structure(list(Seq = c(1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
1, 2, 3, 4, 5), Tpe = c("D", "D", "D", "U", "U", "U", "U", "U",
"U", "U", "U", "U", "U", "Y", "Y", "Y", "Y", "Y")), .Names = c("Seq",
"Tpe"), row.names = c(NA, 18L), class = "data.frame")
out
   Seq Tpe
1    1   D
2    2   D
3    3   D
4    1   U
5    2   U
6    3   U
7    4   U
8    5   U
9    6   U
10   7   U
11   8   U
12   9   U
13  10   U
14   1   Y
15   2   Y
16   3   Y
17   4   Y
18   5   Y
You can try:
DT[, Seq1:=1:.N , by=Tpe]
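For anyone who wants to run this end to end, here is a minimal sketch of the same idea, assuming the data.table package is installed. It rebuilds the example data, numbers the rows within each Tpe group, and checks the result against the desired output from the question. The Seq2 column, which uses frank() with dense ranking, is my own addition and not part of the answer above; it only matters if a group could contain repeated Seq values, which the example data does not.

```r
library(data.table)

# Same data as in the question; the key sorts rows by Tpe, then by Seq.
DT <- data.table(
  Seq = c(1, 2, 3, 5, 7, 9, 11, 15, 23, 67, 1, 3, 4, 5, 9, 3, 4, 6),
  Tpe = c(rep("U", 10), rep("Y", 5), rep("D", 3)),
  key = c("Tpe", "Seq")
)

# Row counter that restarts at 1 within each Tpe group.
# seq_len(.N) behaves like 1:.N but is also safe for empty groups.
DT[, Seq1 := seq_len(.N), by = Tpe]

# Assumed variant (not in the original answer): dense rank of Seq within
# each group, so repeated Seq values would share the same number.
DT[, Seq2 := frank(Seq, ties.method = "dense"), by = Tpe]

# Desired per-group numbering from the question: D has 3 rows, U 10, Y 5.
expected <- c(1:3, 1:10, 1:5)
print(all(DT$Seq1 == expected))
```

Because the table is keyed on Tpe and Seq, the row order within each group already follows Seq, so the plain row counter and the dense rank agree on this data. To overwrite the original column instead of adding a new one, assign to Seq rather than Seq1.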