R data.table: add new column with values from other columns by referencing
I have a sample data.table as follows:
> dt = data.table("Label" = rep(LETTERS[1:3], 3),
+ "Col_A" = c(2,3,5,0,2,7,6,8,9),
+ "Col_B" = c(1,4,3,5,2,0,7,5,8),
+ "Col_C" = c(2,0,4,1,5,6,7,3,0))
> dt[order(Label)]
Label Col_A Col_B Col_C
1: A 2 1 2
2: A 0 5 1
3: A 6 7 7
4: B 3 4 0
5: B 2 2 5
6: B 8 5 3
7: C 5 3 4
8: C 7 0 6
9: C 9 8 0
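I want to create a new column that takes its values from the existing columns, depending on the Label column. My desired output looks like this: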
Label Col_A Col_B Col_C Newcol
1: A 2 1 2 2
2: A 0 5 1 0
3: A 6 7 7 6
4: B 3 4 0 4
5: B 2 2 5 2
6: B 8 5 3 5
7: C 5 3 4 4
8: C 7 0 6 6
9: C 9 8 0 0
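The logic is that each Newcol value is taken from the column that Label points to. For example, the first 3 rows of the Label column are A, so the first 3 rows of Newcol are the first 3 rows of Col_A.
I tried the code dt[, `:=` ("Newcol" = eval(as.symbol(paste0("Col_", dt$Label))))] but it did not give the desired output.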
We can use the vectorised switch function from the kit package, which, like data.table, is part of the fastverse.
dt[, "Newcol" := kit::vswitch(Label, c("A", "B", "C"), list(Col_A, Col_B, Col_C))]
# or if you want to pass column indices
dt[, "Newcol" := kit::vswitch(Label, c("A", "B", "C"), dt[,2:4])]
dt[order(Label)]
Label Col_A Col_B Col_C Newcol
1: A 2 1 2 2
2: A 0 5 1 0
3: A 6 7 7 6
4: B 3 4 0 4
5: B 2 2 5 2
6: B 8 5 3 5
7: C 5 3 4 4
8: C 7 0 6 6
9: C 9 8 0 0
library(data.table)
dt = data.table("Label" = rep(LETTERS[1:3], 3),
"Col_A" = c(2,3,5,0,2,7,6,8,9),
"Col_B" = c(1,4,3,5,2,0,7,5,8),
"Col_C" = c(2,0,4,1,5,6,7,3,0))
dt[, new := ifelse(Label == "A", Col_A, NA)]
dt[, new := ifelse(Label == "B", Col_B, new)]
dt[, new := ifelse(Label == "C", Col_C, new)]
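The ifelse chain above needs one line per label. As a hedged alternative, here is a minimal sketch that looks the column up once per group instead. It assumes every Label value has a matching column named "Col_" followed by that label, and it uses Newcol from the question as the column name. Inside a grouped j expression Label is a single value, so get() can resolve one column name per group.

```r
library(data.table)

dt <- data.table(Label = rep(LETTERS[1:3], 3),
                 Col_A = c(2, 3, 5, 0, 2, 7, 6, 8, 9),
                 Col_B = c(1, 4, 3, 5, 2, 0, 7, 5, 8),
                 Col_C = c(2, 0, 4, 1, 5, 6, 7, 3, 0))

# Within each Label group, Label has length 1, so paste0() builds a single
# column name (e.g. "Col_A") and get() returns that column's rows for the group.
dt[, Newcol := get(paste0("Col_", Label)), by = Label]

dt[order(Label)]
```

This updates dt by reference and keeps the original row order. If a label has no matching column, get() stops with an error rather than silently returning NA.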
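If you are able to use the dplyr library, I would use its case_when function.
dt$newCol <- dplyr::case_when(dt$Label == 'A' ~ dt$Col_A, dt$Label == 'B' ~ dt$Col_B, dt$Label == 'C' ~ dt$Col_C)
I haven't tested that code, but it should be something like this.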
With fcase:
cols <- unique(dt$Label)
dt[,newCol:=eval(parse(text=paste('fcase(',paste0("Label=='",cols,"',Col_",cols,collapse=','),')')))][]
Label Col_A Col_B Col_C newCol
<char> <num> <num> <num> <num>
1: A 2 1 2 2
2: B 3 4 0 4
3: C 5 3 4 4
4: A 0 5 1 0
5: B 2 2 5 2
6: C 7 0 6 6
7: A 6 7 7 6
8: B 8 5 3 5
9: C 9 8 0 0
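The fcase call above is assembled as a string and run through eval(parse()). As a hedged alternative that avoids building code from text, the sketch below picks one cell per row with base R matrix indexing. The helper names value_cols, m and col_idx are illustrative only, and the sketch assumes all Col_ columns share one type and that every label has a matching column.

```r
library(data.table)

dt <- data.table(Label = rep(LETTERS[1:3], 3),
                 Col_A = c(2, 3, 5, 0, 2, 7, 6, 8, 9),
                 Col_B = c(1, 4, 3, 5, 2, 0, 7, 5, 8),
                 Col_C = c(2, 0, 4, 1, 5, 6, 7, 3, 0))

# Names of the candidate columns, derived from the labels present in the data
value_cols <- paste0("Col_", unique(dt$Label))

# Numeric matrix holding only the candidate columns
m <- as.matrix(dt[, ..value_cols])

# For each row, the position of the column that its Label points to
col_idx <- match(paste0("Col_", dt$Label), value_cols)

# A two-column (row, column) index matrix extracts one cell per row;
# set() then adds the result to dt by reference
set(dt, j = "newCol", value = m[cbind(seq_len(nrow(dt)), col_idx)])

dt
```

Because as.matrix() coerces all selected columns to a common type, this is best suited to cases like the one here, where every candidate column is numeric.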