Unlist an object in R but replace numeric(0) with NAs

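I currently have a data frame in which every column has dimnames. Most of the time this isn't a problem, but I recently added a new column, qgf, and for some reason it is being read in as a list rather than a vector.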

Here is a sample of a few rows:

> wc_results_data[12:20, 23]
$wc_1930_Uruguay
numeric(0)

$wc_1930_USA
numeric(0)

$wc_1934_Argentina
numeric(0)

$wc_1934_Austria
[1] 6

$wc_1934_Belgium
[1] 6

$wc_1934_Brazil
numeric(0)

$`wc_1934_Czech Republic/CSFR`
[1] 2

$wc_1934_Egypt
[1] 11

$wc_1934_France
[1] 6

So, as you can see, wc_results_data[c(12:14, 17), 23] are currently coded as numeric(0) when they should really be NAs (I should clarify that there are genuine 0 values in my data as well).

What is really strange about these values is that if I try to test them like this, I get odd results:

> wc_results_data[12,23]
$wc_1930_Uruguay
numeric(0)
> identical(wc_results_data[12,23], numeric(0))
[1] FALSE
> length(wc_results_data[12,23])
[1] 1

To try to fix this, I tried overwriting the column with the result of unlist:

wc_results_data[23] <- unlist(wc_results_data[23])

But I get this error:

replacement has 368 rows, data has 425

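That makes sense, of course: I have 57 observations (425 - 368) that are numeric(0), but I can't get rid of them. Is there a way to unlist and store those numeric(0) observations as NAs? Can someone tell me what I'm doing wrong?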

As requested in the comments below, here is the dput output for a few rows and columns of my data.frame:

dput( wc_results_data[12:20, 22:24])
structure(list(fgd = structure(c(12, 1, -1, 0, -3, -2, 3, -2, 
-1), .Dim = 9L, .Dimnames = list(c("wc_1930_Uruguay", "wc_1930_USA", 
"wc_1934_Argentina", "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", 
"wc_1934_Czech Republic/CSFR", "wc_1934_Egypt", "wc_1934_France"
))), qgf = structure(list(wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0), 
    wc_1934_Argentina = numeric(0), wc_1934_Austria = 6, wc_1934_Belgium = 6, 
    wc_1934_Brazil = numeric(0), `wc_1934_Czech Republic/CSFR` = 2, 
    wc_1934_Egypt = 11, wc_1934_France = 6), .Dim = 9L, .Dimnames = list(
    c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina", 
    "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR", 
    "wc_1934_Egypt", "wc_1934_France"))), qga = structure(list(
    wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0), wc_1934_Argentina = numeric(0), 
    wc_1934_Austria = 1, wc_1934_Belgium = 8, wc_1934_Brazil = numeric(0), 
    `wc_1934_Czech Republic/CSFR` = 1, wc_1934_Egypt = 2, wc_1934_France = 1), .Dim = 9L, .Dimnames = list(
    c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina", 
    "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR", 
    "wc_1934_Egypt", "wc_1934_France")))), .Names = c("fgd", 
"qgf", "qga"), row.names = 12:20, class = "data.frame")

If I understand you correctly, here is a dplyr solution:

library(tidyverse)
df %>%
    mutate(
        qgf = unlist(ifelse(sapply(qgf, length) == 0, NA, qgf)),
        qga = unlist(ifelse(sapply(qga, length) == 0, NA, qga)))
#  fgd qgf qga
#1  12  NA  NA
#2   1  NA  NA
#3  -1  NA  NA
#4   0   6   1
#5  -3   6   8
#6  -2  NA  NA
#7   3   2   1
#8  -2  11   2
#9  -1   6   1

In fact the only dplyr dependency is mutate, so a base R solution is just as simple:

df$qgf <- unlist(ifelse(sapply(df$qgf, length) == 0, NA, df$qgf))
df$qga <- unlist(ifelse(sapply(df$qga, length) == 0, NA, df$qga))
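As a minimal sketch of why the plain unlist call loses rows and why filling in NA first avoids that, consider a small toy list. The list `x` and its element names are made up for illustration and are not part of the original data; `lengths()` is the base R shortcut for `sapply(x, length)`.

```r
# Toy list standing in for one list column (hypothetical names and values)
x <- list(a = numeric(0), b = 6, c = numeric(0), d = 11)

# unlist() silently drops zero-length elements, hence the row-count mismatch
length(unlist(x))   # 2, not 4

# lengths() gives the length of every element in one call
empty <- lengths(x) == 0

# Give every empty element a single NA so that each element has length 1
x[empty] <- list(NA_real_)

# Now unlist() returns one value per original element
filled <- unlist(x)
print(filled)
```

Because every element now has length 1, the result has the same length as the original list and can be assigned back to a data frame column.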

Sample data

df <- structure(list(fgd = structure(c(12, 1, -1, 0, -3, -2, 3, -2,
-1), .Dim = 9L, .Dimnames = list(c("wc_1930_Uruguay", "wc_1930_USA",
"wc_1934_Argentina", "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil",
"wc_1934_Czech Republic/CSFR", "wc_1934_Egypt", "wc_1934_France"
))), qgf = structure(list(wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0),
    wc_1934_Argentina = numeric(0), wc_1934_Austria = 6, wc_1934_Belgium = 6,
    wc_1934_Brazil = numeric(0), `wc_1934_Czech Republic/CSFR` = 2,
    wc_1934_Egypt = 11, wc_1934_France = 6), .Dim = 9L, .Dimnames = list(
    c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina",
    "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR",
    "wc_1934_Egypt", "wc_1934_France"))), qga = structure(list(
    wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0), wc_1934_Argentina = numeric(0),
    wc_1934_Austria = 1, wc_1934_Belgium = 8, wc_1934_Brazil = numeric(0),
    `wc_1934_Czech Republic/CSFR` = 1, wc_1934_Egypt = 2, wc_1934_France = 1), .Dim = 9L, .Dimnames = list(
    c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina",
    "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR",
    "wc_1934_Egypt", "wc_1934_France")))), .Names = c("fgd",
"qgf", "qga"), row.names = 12:20, class = "data.frame")

I assigned the dput output to the name wc_results_data, and here is how it prints:

wc_results_data
   fgd qgf qga
12  12        
13   1        
14  -1        
15   0   6   1
16  -3   6   8
17  -2        
18   3   2   1
19  -2  11   2
20  -1   6   1

The result of str on the relevant column is:

str(wc_results_data$qgf)
List of 9
 $ wc_1930_Uruguay            : num(0) 
 $ wc_1930_USA                : num(0) 
 $ wc_1934_Argentina          : num(0) 
 $ wc_1934_Austria            : num 6
 $ wc_1934_Belgium            : num 6
 $ wc_1934_Brazil             : num(0) 
 $ wc_1934_Czech Republic/CSFR: num 2
 $ wc_1934_Egypt              : num 11
 $ wc_1934_France             : num 6
 - attr(*, "dim")= int 9
 - attr(*, "dimnames")=List of 1
  ..$ : chr [1:9] "wc_1930_Uruguay" "wc_1930_USA" "wc_1934_Argentina" "wc_1934_Austria" ...

I needed to use sapply on that column to "apply" the length function:

is.na( wc_results_data$qgf) <- sapply( wc_results_data$qgf, length) == 0
> wc_results_data
   fgd qgf qga
12  12  NA    
13   1  NA    
14  -1  NA    
15   0   6   1
16  -3   6   8
17  -2  NA    
18   3   2   1
19  -2  11   2
20  -1   6   1

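You may need to go through all the columns with this method before you can modify the object so that it behaves like a regular data frame. Using unlist alone on those columns failed to produce a dataframe-able result.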
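Going through all the columns could be sketched roughly as follows. The helper `flatten_list_cols` and the toy data frame `toy` are hypothetical names introduced only for this illustration, and the sketch assumes every list element has length 0 or 1; columns that do not meet that assumption are left untouched.

```r
# Hypothetical helper: flatten every list column whose elements have length 0 or 1
flatten_list_cols <- function(d) {
  for (nm in names(d)) {
    col <- d[[nm]]
    if (is.list(col)) {
      n <- lengths(col)
      if (all(n <= 1)) {
        # Fill the empty elements with NA, then collapse to an atomic vector
        col[n == 0] <- list(NA)
        d[[nm]] <- unlist(col, use.names = FALSE)
      }
    }
  }
  d
}

# Toy data frame with two list columns (made-up values)
toy <- data.frame(fgd = c(12, 1, -1))
toy$qgf <- list(numeric(0), 6, numeric(0))
toy$qga <- list(numeric(0), 1, 8)

flat <- flatten_list_cols(toy)
str(flat)
```

Dropping the names with `use.names = FALSE` is a design choice here: it leaves plain vectors in the data frame rather than carrying the per-element names along.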

Here is an option using the tidyverse:

library(tidyverse)
df %>% 
   mutate_at(2:3, funs(map(., ~ .x[1]) ))
#  fgd qgf qga
#1  12  NA  NA
#2   1  NA  NA
#3  -1  NA  NA
#4   0   6   1
#5  -3   6   8
#6  -2  NA  NA
#7   3   2   1
#8  -2  11   2
#9  -1   6   1

The above keeps the columns as lists, but if you need them as regular columns, use map_dbl:

df %>%
    mutate_at(2:3, funs(map_dbl(., ~ .x[1]) ))
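The `.x[1]` trick relies on the fact that indexing a zero-length vector at position 1 returns NA. A base R sketch of the same idea, using a made-up toy list `x` and `vapply` in place of `map_dbl`, might look like this:

```r
# Toy list standing in for one list column (hypothetical names and values)
x <- list(a = numeric(0), b = 6, c = numeric(0), d = 11)

# Indexing past the end of a vector yields NA rather than an error
numeric(0)[1]   # NA

# Take the first element of each entry; vapply() checks that each result
# is a single numeric value, much like map_dbl() does
first <- vapply(x, function(el) el[1], numeric(1))
print(first)
```

Note that this takes only the first value of each element, so it is only appropriate when no element holds more than one value.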