Check "emptiness" of a list containing empty vectors (which R does not recognise as an empty list)

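I have a list that is the result of selecting a row from a data frame. The problem is that sometimes no row is selected, and the result is a list of this form: a non-empty list with no actual content.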

L <- list(combattech = character(0), damage = character(0), bonus = character(0), 
          range = structure(list(close = character(0), medium = character(0), far = character(0)), 
                            row.names = integer(0), class = "data.frame"), 
          ammo = character(0), weight = character(0), name = character(0), 
          price = character(0), sf = character(0))

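I want to verify that I really have a meaningful result, rather than a list whose elements are all empty vectors. But a list of empty vectors is not the same as an empty list: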

length(L) == 0
#> [1] FALSE

does not give me TRUE, because the length is 9, not 0.

Of course, I could simply check length( which(...row selection...) ) before making the selection, and usually I do, but in this case I have no access to the original row indices.

all(sapply(L, length) == 0)
#> [1] FALSE

does not work either (it returns FALSE), because length() of the nested data structure range returns 3.
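The reason is that a data frame is itself a list of columns, so length() counts columns rather than rows. A minimal sketch of the difference, using a small hypothetical data frame df0 that is not part of the original data:

```r
# Hypothetical zero-row data frame with three columns, mirroring `range` in L
df0 <- data.frame(close = character(0), medium = character(0), far = character(0))

length(df0)  # counts columns
#> [1] 3
nrow(df0)    # counts rows
#> [1] 0
```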

Created on 2020-06-28 by the reprex package (v0.3.0)

You can check whether each element of the list is a data frame and, if so, return its number of rows (otherwise its length):

all(sapply(L, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0)
#[1] TRUE

As suggested by @user20650, we can use NROW, which makes this more compact.

all(sapply(L, NROW) == 0)
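This works because NROW() falls back to length() for objects without dimensions, so it handles plain vectors and data frames alike. A short sketch of that behaviour, with made-up inputs:

```r
NROW(character(0))                  # zero-length vector: falls back to length()
#> [1] 0
NROW(c("a", "b"))
#> [1] 2
NROW(data.frame(x = character(0)))  # data frame: number of rows
#> [1] 0
nrow(character(0))                  # nrow() itself returns NULL for a plain vector
#> NULL
```

One caveat, as far as I can tell: for a nested plain list that is not a data frame, NROW() returns the number of list elements, so this check only looks one level deep in that case.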

1) We can use rapply to traverse the structure recursively and return a flat result.

all(rapply(L, length) == 0)
## [1] TRUE
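To see what rapply does here, it may help to look at a smaller structure. The sketch below uses a hypothetical list x (not the original L); rapply descends into the data frame because a data frame is a list, and returns one length per leaf vector:

```r
# Hypothetical nested list, smaller than L but with the same shape
x <- list(a = character(0),
          b = data.frame(p = character(0), q = character(0)))

lens <- rapply(x, length)  # one entry per leaf: a, and the two columns of b
length(lens)
#> [1] 3
all(lens == 0)
#> [1] TRUE
```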

2) Another approach is to unlist it first:

length(unlist(L)) == 0
## [1] TRUE
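Since unlist flattens every level of nesting, a single non-empty leaf anywhere in the structure is enough to make the check fail. A small sketch with two hypothetical lists (x_empty and x_full are illustrative names only):

```r
# Hypothetical nested lists: one with only empty leaves, one with a single value
x_empty <- list(a = character(0), b = list(p = character(0), q = integer(0)))
x_full  <- list(a = character(0), b = list(p = "20", q = integer(0)))

length(unlist(x_empty)) == 0
#> [1] TRUE
length(unlist(x_full)) == 0
#> [1] FALSE
```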

A purrr solution using the basic logic provided by @user20650 and @Ronak Shah:

library(purrr)
every(L, ~ NROW(.) == 0)
#> [1] TRUE

I ran some checks, and all proposed solutions work in the positive case (L is empty)…

L0 <- list(combattech = character(0), damage = character(0), bonus = character(0), 
           range = structure(list(close = character(0), medium = character(0), far = character(0)), 
                             row.names = integer(0), class = "data.frame"), 
           ammo = character(0), weight = character(0), name = character(0), price = character(0), sf = character(0))

all(rapply(L0, length) == 0) # Solution 1
#> [1] TRUE
all(sapply(L0, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0) # Solution 2
#> [1] TRUE
all(sapply(L0, NROW) == 0) # Solution 3
#> [1] TRUE
length(unlist(L0)) == 0 # Solution 4
#> [1] TRUE
require(purrr)
#> Loading required package: purrr
every(L0, ~ NROW(.) == 0) # Solution 5
#> [1] TRUE

... and in the negative case (L has content):

L1 <- list(combattech = "ranged", damage = "1d", bonus = "+3", 
           range = structure(list(close = "20", medium = "40", far = "80"), 
                             row.names = integer(0), class = "data.frame"), 
           ammo = "arrow", weight = "1.5 Stone", name = "Bow", price = "120 silver", sf = "3/5")

all(rapply(L1, length) == 0) # Solution 1
#> [1] FALSE
all(sapply(L1, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0) # Solution 2
#> [1] FALSE
all(sapply(L1, NROW) == 0) # Solution 3
#> [1] FALSE
length(unlist(L1)) == 0 # Solution 4
#> [1] FALSE
every(L1, ~ NROW(.) == 0) # Solution 5
#> [1] FALSE

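Using NROW directly, however, does not work, even if we coerce L1 to a data frame. The coercion itself fails, because range declares zero rows (row.names = integer(0)) while the other elements have length 1: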

NROW(as.data.frame(L1)) == 0 # Solution 6 only works with empty lists
#> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0

I want to choose one approach based on performance, using both the positive and the negative examples.

require(microbenchmark)
#> Loading required package: microbenchmark
L40 <- list(combattech = rep("ranged", 40), damage = rep(paste0(1:2, "d"), each = 20), bonus = paste0("+", 1:40), 
            range = structure(list(close = "20", medium = "40", far = "80"), row.names = integer(0), class = "data.frame"), 
           ammo = rep(c("arrow", "bolt"), 20), weight = paste0(0.5*1:40, " Stone"), name = rep(c("bow", "crossbow"), 20), price = paste(seq(10, 10*40, 10), "silver"), sf = rep("3/5", 40))
microbenchmark(
  unlist   = {length(unlist(L0)) == 0; length(unlist(L1)) == 0; length(unlist(L40)) == 0},
  rapply   = {all(rapply(L0, length) == 0); all(rapply(L1, length) == 0); all(rapply(L40, length) == 0)},
  NROW     = {all(sapply(L0, NROW) == 0); all(sapply(L1, NROW) == 0); all(sapply(L40, NROW) == 0)},
  long.one = {all(sapply(L0, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0); all(sapply(L1, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0); all(sapply(L40, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0)},
  purrr    = {every(L0, ~ NROW(.) == 0); every(L1, ~ NROW(.) == 0); every(L40, ~ NROW(.) == 0)},
  times = 5E3)
#> Unit: microseconds
#>      expr  min    lq      mean median     uq    max neval
#>    unlist 81.5  83.4  84.68564   84.2  84.90 1365.7  5000
#>    rapply 27.9  31.9  36.44792   34.1  35.60 6015.9  5000
#>      NROW 51.3  56.0  60.63962   58.0  60.30 1657.4  5000
#>  long.one 61.1  67.2  72.01368   69.4  71.90 3727.1  5000
#>     purrr 97.7 108.2 116.74834  111.6 114.95 1917.5  5000

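I am glad I finally added an example with 40 rows. Before that, with only 1 row (as in L1), the unlist approach showed the best performance. With 40 rows, the picture changes.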

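So, the final recommendation is:

  • Use the unlist approach when you expect a small number of rows (or none).
  • Use rapply if the list usually contains a large number of rows and you want to filter out the occasional empty list.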
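Either recommendation could be wrapped in a small helper so the check reads clearly at the call site. This is only a sketch: the function names is_empty_unlist and is_empty_rapply and the two test lists are hypothetical and do not come from the answers above.

```r
# Hypothetical helpers; the names are illustrative only
is_empty_unlist <- function(x) length(unlist(x)) == 0       # for few rows (or none)
is_empty_rapply <- function(x) all(rapply(x, length) == 0)  # for many rows

# Made-up inputs: one list with only empty leaves, one with content
empty  <- list(name = character(0), range = data.frame(close = character(0)))
filled <- list(name = "Bow", range = data.frame(close = "20"))

is_empty_unlist(empty)
#> [1] TRUE
is_empty_rapply(empty)
#> [1] TRUE
is_empty_unlist(filled)
#> [1] FALSE
is_empty_rapply(filled)
#> [1] FALSE
```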

Created on 2020-06-28 by the reprex package (v0.3.0)