R: data.table converts to logical in conditional by group operation (in seemingly random way)

Question

I have the following problem:

library(data.table)

test <- data.table(v = ceiling(runif(20, 0, 5)), g = ceiling(runif(20, 0, 2)))
setorder(test, g)

test[, (paste0("n", 1:5)) := lapply(1:5, function(x) sum(v == x)), by = g]

test[, (paste0("foo", 1:3)) := lapply(1:3, function(x){ifelse(get(paste0("n", x + 1)) != 0,
                                                       get(paste0("n", x))/get(paste0("n", x + 1)), NA)}), by = g]

test

If you run this code several times, sometimes one of the "foo" variables is converted to a logical, which makes no sense.

Thanks for your help!

Answer 1

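The cause is the bare NA, which is NA_logical_ by default. If the condition leaves only NA values, ifelse returns a logical vector and the column becomes logical; otherwise the NA is coerced to the type of the other values. Using the NA_real_ constant fixes this. From ?NA: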

NA is a logical constant of length 1 which contains a missing value indicator. NA can be coerced to any other vector type except raw. There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values: all of these are reserved words in the R language.

test[, (paste0("foo", 1:3)) := 
   lapply(1:3, function(x){
    ifelse(get(paste0("n", x + 1)) != 0,                                          
       get(paste0("n", x))/get(paste0("n", x + 1)), NA_real_)}), by = g]
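The type behaviour described above can be checked in base R without data.table. This is a minimal sketch; the vectors `cond_none`, `cond_some` and `vals` are made up for illustration and are not from the question.

```r
# A bare NA is logical; the typed constants carry their own type.
type_na      <- typeof(NA)         # "logical"
type_na_real <- typeof(NA_real_)   # "double"

cond_none <- c(FALSE, FALSE)   # no element passes the test
cond_some <- c(TRUE, FALSE)    # at least one element passes the test
vals      <- c(3, 0.25)

# Only the 'no' branch is used, so the result keeps the type of NA.
res_all_na <- ifelse(cond_none, vals, NA)
typeof(res_all_na)             # "logical"

# One numeric value is enough to make the whole result double.
res_mixed <- ifelse(cond_some, vals, NA)
typeof(res_mixed)              # "double"

# With NA_real_ the result is double even when every element is NA.
res_typed <- ifelse(cond_none, vals, NA_real_)
typeof(res_typed)              # "double"
```

In the question the n columns are constant within each group, so each per-group ifelse call returns either all values or all NA, which is why the logical result shows up only for some random draws.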

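Besides using ifelse and specifying the NA that matches the column type, another option is case_when (from dplyr) or data.table::fcase, which by default return NA of the appropriate column type when no condition matches.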

test[, paste0("foo", 1:3) := lapply(1:3, 
  function(x) fcase(.SD[[paste0("n", x + 1)]] !=0, 
   .SD[[paste0("n", x)]]/.SD[[paste0("n", x + 1)]])), by = g]
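A small sketch of the fcase default, assuming data.table is installed; the input vectors are made up for illustration. When no condition is TRUE, every element falls through to the default NA, which takes the type of the supplied values.

```r
library(data.table)

# No condition is TRUE, so every element gets the default NA.
x <- fcase(c(FALSE, FALSE), c(1.5, 2.5))
x
typeof(x)   # "double"

# The same input through ifelse() with a bare NA gives a logical vector.
y <- ifelse(c(FALSE, FALSE), c(1.5, 2.5), NA)
typeof(y)   # "logical"
```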

-testing

lst1 <- replicate(10, {
  test <- data.table(v = ceiling(runif(20, 0, 5)),
                     g = ceiling(runif(20, 0, 2)))
  setorder(test, g)
  test[, (paste0("n", 1:5)) := lapply(1:5, function(x) sum(v == x)), by = g]
  test[, paste0("foo", 1:3) := lapply(1:3,
    function(x) fcase(.SD[[paste0("n", x + 1)]] != 0,
      .SD[[paste0("n", x)]] / .SD[[paste0("n", x + 1)]])), by = g]
}, simplify = FALSE)

-checking an element where a column has only NA

> lst1[[9]]
        v     g    n1    n2    n3    n4    n5  foo1  foo2  foo3
    <num> <num> <int> <int> <int> <int> <int> <num> <num> <num>
 1:     4     1     3     1     0     2     4  3.00    NA     0
 2:     5     1     3     1     0     2     4  3.00    NA     0
 3:     1     1     3     1     0     2     4  3.00    NA     0
 4:     4     1     3     1     0     2     4  3.00    NA     0
 5:     5     1     3     1     0     2     4  3.00    NA     0
 6:     1     1     3     1     0     2     4  3.00    NA     0
 7:     5     1     3     1     0     2     4  3.00    NA     0
 8:     2     1     3     1     0     2     4  3.00    NA     0
 9:     1     1     3     1     0     2     4  3.00    NA     0
10:     5     1     3     1     0     2     4  3.00    NA     0
11:     2     2     1     4     0     1     4  0.25    NA     0
12:     1     2     1     4     0     1     4  0.25    NA     0
13:     2     2     1     4     0     1     4  0.25    NA     0
14:     5     2     1     4     0     1     4  0.25    NA     0
15:     5     2     1     4     0     1     4  0.25    NA     0
16:     2     2     1     4     0     1     4  0.25    NA     0
17:     5     2     1     4     0     1     4  0.25    NA     0
18:     4     2     1     4     0     1     4  0.25    NA     0
19:     2     2     1     4     0     1     4  0.25    NA     0
20:     5     2     1     4     0     1     4  0.25    NA     0
        v     g    n1    n2    n3    n4    n5  foo1  foo2  foo3
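
Instead of inspecting one element by eye, the column types can also be checked across all replicates. This is a sketch only: `make_test`, `lst2` and `all_double` are hypothetical names introduced here, and the seed is arbitrary.

```r
library(data.table)

# Hypothetical helper: build one random table the same way as above,
# using the fcase() variant for the foo columns.
make_test <- function() {
  test <- data.table(v = ceiling(runif(20, 0, 5)),
                     g = ceiling(runif(20, 0, 2)))
  setorder(test, g)
  test[, (paste0("n", 1:5)) := lapply(1:5, function(x) sum(v == x)), by = g]
  test[, paste0("foo", 1:3) := lapply(1:3, function(x)
    fcase(.SD[[paste0("n", x + 1)]] != 0,
          .SD[[paste0("n", x)]] / .SD[[paste0("n", x + 1)]])), by = g]
  test[]
}

set.seed(1)  # arbitrary seed, only so the run can be repeated
lst2 <- replicate(10, make_test(), simplify = FALSE)

# One logical per table: TRUE when foo1..foo3 are all double columns.
all_double <- vapply(lst2, function(d)
  all(vapply(d[, paste0("foo", 1:3), with = FALSE], is.double, logical(1))),
  logical(1))
all(all_double)
```

If any table had picked up a logical foo column, the final `all()` would be FALSE.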
