Applying function row-wise in a data.table; passing column names as a vector
Consider a function foo as below.
foo <- function(a, b, c) {
out <- (sum(a) + sqrt(prod(c))) / sqrt(pi * b)
return(out)
}
I want to apply this function row-wise to a data.table DT, taking the data in its columns as arguments, with one row per value of the unique key column ID.
DT <- structure(list(ID = c("K1L1", "K1L2", "K1L3", "K2L1", "K2L2",
"K2L3", "K3L1", "K3L2", "K3L3", "K4L1", "K4L2", "K4L3", "K5L1",
"K5L2", "K5L3"), K1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L), K2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L), K3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L), K4 = c(0L, 0L, 0L, 1L, 0L, 0L, 2L,
1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L), K5 = c(4L, 3L, 5L, 3L, 4L, 3L,
3L, 3L, 4L, 4L, 3L, 3L, 5L, 4L, 4L), K6 = c(17L, 21L, 21L, 15L,
18L, 20L, 18L, 14L, 19L, 19L, 19L, 21L, 20L, 18L, 17L), K7 = c(10L,
11L, 11L, 13L, 11L, 10L, 9L, 12L, 12L, 12L, 10L, 11L, 12L, 13L,
10L), K8 = c(7L, 7L, 8L, 6L, 7L, 7L, 8L, 6L, 8L, 6L, 8L, 6L,
8L, 6L, 8L), K9 = c(1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L,
1L, 1L, 2L, 1L), K10 = c(0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 0L, 0L, 1L, 1L), keq = c(50, 49, 51, 51, 48, 51, 48,
47, 49, 51, 52, 48, 50, 50, 48), result = c(3.32285019941341,
3.75957814378025, 3.85756018427585, 3.51276824014721, 3.55423728741272,
3.52711899186614, 3.82738634323954, 3.49460484846665, 3.85490005446497,
3.7497752713846, 3.58557114276955, 3.61968872352116, 3.89594481311228,
3.78708738710968, 3.56911326431751)), class = "data.frame", row.names = c(NA,
-15L))
library(data.table)
setDT(DT)
DT
ID K1 K2 K3 K4 K5 K6 K7 K8 K9 K10 keq
1 K1L1 0 0 0 0 4 17 10 7 1 0 50
2 K1L2 0 0 0 0 3 21 11 7 1 1 49
3 K1L3 0 0 0 0 5 21 11 8 1 0 51
4 K2L1 0 0 0 1 3 15 13 6 2 1 51
5 K2L2 0 0 0 0 4 18 11 7 1 0 48
6 K2L3 0 0 0 0 3 20 10 7 1 1 51
7 K3L1 0 0 0 2 3 18 9 8 2 1 48
8 K3L2 0 0 0 1 3 14 12 6 2 1 47
9 K3L3 0 0 0 0 4 19 12 8 1 1 49
10 K4L1 0 0 0 0 4 19 12 6 2 1 51
11 K4L2 0 0 0 1 3 19 10 8 1 1 52
12 K4L3 0 0 0 0 3 21 11 6 1 0 48
13 K5L1 0 0 0 0 5 20 12 8 1 0 50
14 K5L2 0 0 0 0 4 18 13 6 2 1 50
15 K5L3 0 0 0 0 4 17 10 8 1 1 48
I get the desired result with the usual syntax.
DT[, result := foo(a = c(K1, K2, K3, K4, K5, K6, K7, K8, K9, K10),
b = keq, c = c(K8, K9)), by = "ID"]
DT
ID K1 K2 K3 K4 K5 K6 K7 K8 K9 K10 keq result
1: K1L1 0 0 0 0 4 17 10 7 1 0 50 3.322850
2: K1L2 0 0 0 0 3 21 11 7 1 1 49 3.759578
3: K1L3 0 0 0 0 5 21 11 8 1 0 51 3.857560
4: K2L1 0 0 0 1 3 15 13 6 2 1 51 3.512768
5: K2L2 0 0 0 0 4 18 11 7 1 0 48 3.554237
6: K2L3 0 0 0 0 3 20 10 7 1 1 51 3.527119
7: K3L1 0 0 0 2 3 18 9 8 2 1 48 3.827386
8: K3L2 0 0 0 1 3 14 12 6 2 1 47 3.494605
9: K3L3 0 0 0 0 4 19 12 8 1 1 49 3.854900
10: K4L1 0 0 0 0 4 19 12 6 2 1 51 3.749775
11: K4L2 0 0 0 1 3 19 10 8 1 1 52 3.585571
12: K4L3 0 0 0 0 3 21 11 6 1 0 48 3.619689
13: K5L1 0 0 0 0 5 20 12 8 1 0 50 3.895945
14: K5L2 0 0 0 0 4 18 13 6 2 1 50 3.787087
15: K5L3 0 0 0 0 4 17 10 8 1 1 48 3.569113
x <- c(0, 0, 0, 0, 4, 17, 10, 7, 1, 0)
y <- 50
z <- c(7, 1)
foo(x, y, z)
[1] 3.32285
However, when I try to pass the arguments as vectors of column names, I do not get the correct result.
DT[, result := foo(a = get(acol), b = get(bcol), c = get(ccol)), by = "ID"]
DT
ID K1 K2 K3 K4 K5 K6 K7 K8 K9 K10 keq result
1: K1L1 0 0 0 0 4 17 10 7 1 0 50 0.2111004
2: K1L2 0 0 0 0 3 21 11 7 1 1 49 0.2132436
3: K1L3 0 0 0 0 5 21 11 8 1 0 51 0.2234524
4: K2L1 0 0 0 1 3 15 13 6 2 1 51 0.1935154
5: K2L2 0 0 0 0 4 18 11 7 1 0 48 0.2154535
6: K2L3 0 0 0 0 3 20 10 7 1 1 51 0.2090206
7: K3L1 0 0 0 2 3 18 9 8 2 1 48 0.2303294
8: K3L2 0 0 0 1 3 14 12 6 2 1 47 0.2015820
9: K3L3 0 0 0 0 4 19 12 8 1 1 49 0.2279670
10: K4L1 0 0 0 0 4 19 12 6 2 1 51 0.1935154
11: K4L2 0 0 0 1 3 19 10 8 1 1 52 0.2212934
12: K4L3 0 0 0 0 3 21 11 6 1 0 48 0.1994711
13: K5L1 0 0 0 0 5 20 12 8 1 0 50 0.2256758
14: K5L2 0 0 0 0 4 18 13 6 2 1 50 0.1954410
15: K5L3 0 0 0 0 4 17 10 8 1 1 48 0.2303294
Where am I going wrong?
Try this:
DT[, result := foo(a = unlist(mget(acol)),
b = unlist(mget(bcol)),
c = unlist(mget(ccol))), by = "ID"]
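A likely explanation for the difference: get() looks up a single object by name, whereas mget() accepts a character vector of names and returns a named list, which unlist() then flattens into the vector that foo expects. When get() is handed several names, the incorrect values in the question are consistent with only the first name of each vector (K1 and K8) being used; depending on the R version, such a call may instead raise an error. The sketch below illustrates this outside data.table. The environment row_env is a hypothetical stand-in for the per-row variables that data.table makes available inside j, filled with the values of row K1L1.

```r
foo <- function(a, b, c) {
  (sum(a) + sqrt(prod(c))) / sqrt(pi * b)
}

# Hypothetical stand-in for the column variables of row K1L1
row_env <- list2env(list(
  K1 = 0, K2 = 0, K3 = 0, K4 = 0, K5 = 4,
  K6 = 17, K7 = 10, K8 = 7, K9 = 1, K10 = 0, keq = 50
))

acol <- paste0("K", 1:10)
bcol <- "keq"
ccol <- c("K8", "K9")

# Looking up only the first name of each vector: a = K1, c = K8
wrong <- foo(a = get(acol[1], envir = row_env),
             b = get(bcol, envir = row_env),
             c = get(ccol[1], envir = row_env))

# mget() looks up every name and returns a named list;
# unlist() flattens that list into a plain vector
right <- unname(foo(a = unlist(mget(acol, envir = row_env)),
                    b = unlist(mget(bcol, envir = row_env)),
                    c = unlist(mget(ccol, envir = row_env))))

print(wrong)
print(right)
```

Here wrong should come out close to the 0.2111004 reported for K1L1 in the question, and right close to the expected 3.32285.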
Objects used (apart from DT):
acol <- paste0("K", 1:10)
bcol <- "keq"
ccol <- c("K8", "K9")
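Because every ID identifies exactly one row, the same numbers can also be obtained without grouping, by computing each part of the formula for all rows at once. The following is only a sketch of that alternative, not part of the original answer: mini is a hypothetical two-row copy of DT (rows K1L1 and K2L1), and by_group and vectorised are illustrative column names. It relies on rowSums() and Reduce() over .SD restricted with .SDcols, and it would not be equivalent if an ID spanned several rows.

```r
library(data.table)

foo <- function(a, b, c) {
  (sum(a) + sqrt(prod(c))) / sqrt(pi * b)
}

# Two rows copied from DT (IDs K1L1 and K2L1)
mini <- data.table(
  ID = c("K1L1", "K2L1"),
  K1 = c(0L, 0L), K2 = c(0L, 0L), K3 = c(0L, 0L), K4 = c(0L, 1L),
  K5 = c(4L, 3L), K6 = c(17L, 15L), K7 = c(10L, 13L), K8 = c(7L, 6L),
  K9 = c(1L, 2L), K10 = c(0L, 1L), keq = c(50, 51)
)

acol <- paste0("K", 1:10)
bcol <- "keq"
ccol <- c("K8", "K9")

# Grouped approach from the answer: one call to foo() per ID
mini[, by_group := foo(a = unlist(mget(acol)),
                       b = unlist(mget(bcol)),
                       c = unlist(mget(ccol))), by = "ID"]

# Vectorised alternative: row sums of the "a" columns and
# element-wise product of the "c" columns, for all rows at once
a_sum  <- mini[, rowSums(.SD), .SDcols = acol]
c_prod <- mini[, Reduce(`*`, .SD), .SDcols = ccol]
mini[, vectorised := (a_sum + sqrt(c_prod)) / sqrt(pi * mini[[bcol]])]

print(mini[, .(ID, by_group, vectorised)])
```

Both columns should agree with the result values listed for K1L1 and K2L1 in the question (about 3.322850 and 3.512768). On larger tables the vectorised form avoids one function call per group, at the cost of hard-coding the structure of foo.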