Unlist an object in R but replace numeric(0) with NA
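I currently have a data frame in which every column has dimnames. Most of the time this is not a problem, but I recently added a new column, qgf, and for some reason it is being read in as a list rather than a vector.
Here is a sample of a few rows: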
> wc_results_data[12:20, 23]
$wc_1930_Uruguay
numeric(0)
$wc_1930_USA
numeric(0)
$wc_1934_Argentina
numeric(0)
$wc_1934_Austria
[1] 6
$wc_1934_Belgium
[1] 6
$wc_1934_Brazil
numeric(0)
$`wc_1934_Czech Republic/CSFR`
[1] 2
$wc_1934_Egypt
[1] 11
$wc_1934_France
[1] 6
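So, as you can see, wc_results_data[c(12:14, 17), 23] is currently coded as numeric(0), when those entries should really be NA (I should clarify that there are also actual values of 0 in my data).
What is really odd about these values is that I get strange results when I try to test them like this: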
> wc_results_data[12,23]
$wc_1930_Uruguay
numeric(0)
> identical(wc_results_data[12,23], numeric(0))
[1] FALSE
> length(wc_results_data[12,23])
[1] 1
To try to fix this, I tried saving the column with unlist:
wc_results_data[23] <- unlist(wc_results_data[23])
But I get this error:
replacement has 368 rows, data has 425
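This makes sense, of course: I have 57 observations (425 - 368) that are numeric(0), and unlist drops them, but I cannot get rid of them any other way. Is there a way to unlist and store these numeric(0) observations as NA? Can someone tell me what I am doing wrong?
As requested in the comments below, here is the dput output for a few rows and columns of my data.frame: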
dput( wc_results_data[12:20, 22:24])
structure(list(fgd = structure(c(12, 1, -1, 0, -3, -2, 3, -2,
-1), .Dim = 9L, .Dimnames = list(c("wc_1930_Uruguay", "wc_1930_USA",
"wc_1934_Argentina", "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil",
"wc_1934_Czech Republic/CSFR", "wc_1934_Egypt", "wc_1934_France"
))), qgf = structure(list(wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0),
wc_1934_Argentina = numeric(0), wc_1934_Austria = 6, wc_1934_Belgium = 6,
wc_1934_Brazil = numeric(0), `wc_1934_Czech Republic/CSFR` = 2,
wc_1934_Egypt = 11, wc_1934_France = 6), .Dim = 9L, .Dimnames = list(
c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina",
"wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR",
"wc_1934_Egypt", "wc_1934_France"))), qga = structure(list(
wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0), wc_1934_Argentina = numeric(0),
wc_1934_Austria = 1, wc_1934_Belgium = 8, wc_1934_Brazil = numeric(0),
`wc_1934_Czech Republic/CSFR` = 1, wc_1934_Egypt = 2, wc_1934_France = 1), .Dim = 9L, .Dimnames = list(
c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina",
"wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR",
"wc_1934_Egypt", "wc_1934_France")))), .Names = c("fgd",
"qgf", "qga"), row.names = 12:20, class = "data.frame")
If I have understood you correctly, here is a dplyr solution:
library(tidyverse);
df %>%
mutate(
qgf = unlist(ifelse(sapply(qgf, length) == 0, NA, qgf)),
qga = unlist(ifelse(sapply(qga, length) == 0, NA, qga)))
# fgd qgf qga
#1 12 NA NA
#2 1 NA NA
#3 -1 NA NA
#4 0 6 1
#5 -3 6 8
#6 -2 NA NA
#7 3 2 1
#8 -2 11 2
#9 -1 6 1
In fact the only dplyr dependency is mutate, so a base R solution is just as simple:
df$qgf <- unlist(ifelse(sapply(df$qgf, length) == 0, NA, df$qgf));
df$qga <- unlist(ifelse(sapply(df$qga, length) == 0, NA, df$qga));
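The reason a bare unlist shortens the column is that zero-length elements contribute nothing when the list is concatenated. A minimal sketch of that behaviour on a small made-up list (the names x and y are only for illustration and are not part of the original data), using lengths() to fill the empty slots before unlisting:

```r
# Toy list mimicking the qgf column: two empty elements, two real values.
x <- list(a = numeric(0), b = 6, c = numeric(0), d = 11)

# unlist() alone silently drops the zero-length elements.
n_before <- length(unlist(x))

# Replace the empty elements with NA first, then unlist.
x[lengths(x) == 0] <- NA_real_
y <- unlist(x)
n_after <- length(y)
```

Here n_before is 2 while n_after is 4, which is the same mismatch behind the "replacement has 368 rows, data has 425" error.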
Sample data
df <- structure(list(fgd = structure(c(12, 1, -1, 0, -3, -2, 3, -2,
-1), .Dim = 9L, .Dimnames = list(c("wc_1930_Uruguay", "wc_1930_USA",
"wc_1934_Argentina", "wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil",
"wc_1934_Czech Republic/CSFR", "wc_1934_Egypt", "wc_1934_France"
))), qgf = structure(list(wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0),
wc_1934_Argentina = numeric(0), wc_1934_Austria = 6, wc_1934_Belgium = 6,
wc_1934_Brazil = numeric(0), `wc_1934_Czech Republic/CSFR` = 2,
wc_1934_Egypt = 11, wc_1934_France = 6), .Dim = 9L, .Dimnames = list(
c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina",
"wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR",
"wc_1934_Egypt", "wc_1934_France"))), qga = structure(list(
wc_1930_Uruguay = numeric(0), wc_1930_USA = numeric(0), wc_1934_Argentina = numeric(0),
wc_1934_Austria = 1, wc_1934_Belgium = 8, wc_1934_Brazil = numeric(0),
`wc_1934_Czech Republic/CSFR` = 1, wc_1934_Egypt = 2, wc_1934_France = 1), .Dim = 9L, .Dimnames = list(
c("wc_1930_Uruguay", "wc_1930_USA", "wc_1934_Argentina",
"wc_1934_Austria", "wc_1934_Belgium", "wc_1934_Brazil", "wc_1934_Czech Republic/CSFR",
"wc_1934_Egypt", "wc_1934_France")))), .Names = c("fgd",
"qgf", "qga"), row.names = 12:20, class = "data.frame")
I assigned the dput output to the name wc_results_data, and here is how it prints:
wc_results_data
fgd qgf qga
12 12
13 1
14 -1
15 0 6 1
16 -3 6 8
17 -2
18 3 2 1
19 -2 11 2
20 -1 6 1
The result of str on the relevant column is:
str(wc_results_data$qgf)
List of 9
$ wc_1930_Uruguay : num(0)
$ wc_1930_USA : num(0)
$ wc_1934_Argentina : num(0)
$ wc_1934_Austria : num 6
$ wc_1934_Belgium : num 6
$ wc_1934_Brazil : num(0)
$ wc_1934_Czech Republic/CSFR: num 2
$ wc_1934_Egypt : num 11
$ wc_1934_France : num 6
- attr(*, "dim")= int 9
- attr(*, "dimnames")=List of 1
..$ : chr [1:9] "wc_1930_Uruguay" "wc_1930_USA" "wc_1934_Argentina" "wc_1934_Austria" ...
I needed to use sapply on that column to "apply" the length function:
is.na( wc_results_data$qgf) <- sapply( wc_results_data$qgf, length) == 0
> wc_results_data
fgd qgf qga
12 12 NA
13 1 NA
14 -1 NA
15 0 6 1
16 -3 6 8
17 -2 NA
18 3 2 1
19 -2 11 2
20 -1 6 1
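You will probably need to go through all such columns with this method before the object behaves like a regular data frame. Using unlist on its own on these columns does not produce a data-frame-able result.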
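Going through every list column could be sketched roughly as below. The helper name fix_list_columns and the demo data frame are hypothetical, and the sketch assumes each list element has length 0 or 1; a longer element would make unlist return more values than there are rows.

```r
# Hypothetical helper: turn every list column into an atomic column,
# storing NA where an element was empty (e.g. numeric(0)).
fix_list_columns <- function(df) {
  for (nm in names(df)) {
    col <- df[[nm]]
    if (is.list(col)) {
      col[lengths(col) == 0] <- NA_real_          # fill empty slots
      df[[nm]] <- unlist(col, use.names = FALSE)  # now one value per row
    }
  }
  df
}

# Small made-up data frame with two list columns.
demo <- data.frame(fgd = c(12, 1, 0))
demo$qgf <- list(numeric(0), 6, numeric(0))
demo$qga <- list(numeric(0), 1, 8)

fixed <- fix_list_columns(demo)
```

Columns that are already atomic, such as fgd here, are left untouched.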
Here is an option with tidyverse:
library(tidyverse)
df %>%
mutate(across(2:3, function(col) map(col, function(el) el[1])))
# fgd qgf qga
#1 12 NA NA
#2 1 NA NA
#3 -1 NA NA
#4 0 6 1
#5 -3 6 8
#6 -2 NA NA
#7 3 2 1
#8 -2 11 2
#9 -1 6 1
The above keeps the columns as lists, but if you need them as regular columns, use map_dbl:
df %>%
mutate(across(2:3, function(col) map_dbl(col, function(el) el[1])))
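The trick in this answer is that indexing past the end of a vector returns NA, so taking the first element of numeric(0) yields NA. A base R sketch of the same idea, with vapply standing in for map_dbl (the names x and first_or_na are made up for illustration):

```r
# Toy list with one empty element.
x <- list(a = numeric(0), b = 6, c = 2)

# el[1] is NA for an empty vector and the value itself otherwise;
# vapply() enforces exactly one double per element, much like map_dbl().
first_or_na <- vapply(x, function(el) el[1], numeric(1))
```

Note that this keeps only the first value of each element, so it is only appropriate when no element holds more than one value.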