当某些元素为 NA 时从列表列访问结果

Question

问题：列表列包含一些缺失值

数据

考虑以下包含 2 个模型拟合结果的小标题：

> Model_fits
# A tibble: 4 x 4
    cyl               data model1    model2   
  <dbl> <list<tibble[,2]>> <list>    <list>   
1     2            [5 x 2] <dbl [1]> <dbl [1]>
2     4           [11 x 2] <lm>      <lm>     
3     6            [7 x 2] <lm>      <dbl [1]>
4     8           [14 x 2] <lm>      <lm>

此示例中缺少 cyl==2 的数据。因此，model1 在第一行包含 NA_real_。同样，model2 在第 1 行和第 3 行中包含 NA_real_。

提取模型结果

我想使用 broom::glance 提取模型拟合的结果。但由于缺少值，它不起作用：

> Model_fits %>% 
+   mutate(summary_res = map(model1, broom::glance))
Error: Problem with `mutate()` input `summary_res`.
x No glance method for objects of class numeric
i Input `summary_res` is `map(model1, broom::glance)`.

尝试解决

所以，我尝试使用 purrr::possibly，但这也不起作用：

> Model_fits %>% 
+   mutate(summary_res1 = map(model1, ~ possibly(broom::glance(.x),
+                                              otherwise = NA_real_)))
Error: Problem with `mutate()` input `summary_res1`.
x No glance method for objects of class numeric
i Input `summary_res1` is `map(model1, ~possibly(broom::glance(.x), otherwise = NA_real_))`.

预期结果

我想获得所有 non-missing 值的 broom::glance 结果和所有缺失值的 NA_real_ 结果。请指导我如何获得这些结果？

创建代码`Model_fits`

请注意，我创建了以下示例作为可重现的示例。但这不是我原来的 data/model 结果。

library(tidyverse)

new_data <- tibble(mpg = rep(NA_real_, 5),
       cyl = rep(2, 5),
       disp = rep(NA_real_, 5))

mtcars2 <- mtcars %>% 
  dplyr::select(mpg, cyl, disp)

mt <- bind_rows(mtcars2, 
                new_data)
  
model_res_list <- map(mtcars2 %>% group_split(cyl), ~lm(mpg ~ disp, data = .x))

lizt <- list(NA_real_, model_res_list[[1]], model_res_list[[2]], model_res_list[[3]])

lizt2 <- list(NA_real_, model_res_list[[1]], NA_real_, model_res_list[[3]])


Model_fits <- mt %>% 
  group_nest(cyl) %>% 
  mutate(model1 = lizt,
         model2 = lizt2)

Answer 1

您可以在传递给 map 的包装函数中检查值是否为 NA。

Model_fits %>% 
   mutate(summary_res = map(model1, function(x) if (length(x) == 1 && is.na(x)) NA_real_ else  broom::glance(x)))

Answer 2

您还可以做的一件事是使用 tryCatch 函数，这样您就可以定义在发生错误时函数的输出。在这种情况下，它不会停止函数的执行。

Model_fits %>%
  mutate(mod01 = map(model1, ~ tryCatch(glance(.x), 
                                        error = function(cond) {
                                          NA_real_
                                        }))) %>%
  unnest(mod01)

# A tibble: 4 x 17
    cyl         data model1  model2 mod01 r.squared adj.r.squared sigma statistic  p.value    df
  <dbl> <list<tibbl> <list>  <list> <dbl>     <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl>
1     2      [5 x 2] <dbl [~ <dbl ~    NA   NA             NA     NA      NA      NA          NA
2     4     [11 x 2] <lm>    <lm>      NA    0.648          0.609  2.82   16.6     0.00278     1
3     6      [7 x 2] <lm>    <dbl ~    NA    0.0106        -0.187  1.58    0.0537  0.826       1
4     8     [14 x 2] <lm>    <lm>      NA    0.270          0.209  2.28    4.44    0.0568      1
# ... with 6 more variables: logLik <dbl>, AIC <dbl>, BIC <dbl>, deviance <dbl>,
#   df.residual <int>, nobs <int>

如果我们想使用 possibly 或 safely 而不是 tryCatch 我们应该首先编写一个自定义函数来包装 glance ，然后再应用于我们的数据设置：

poss_glance <- possibly(glance, otherwise = NA_real_)

Model_fits %>%
  mutate(mod01 = map(model1, ~ poss_glance(.x))) %>%
  unnest(mod01)

# A tibble: 4 x 17
    cyl         data model1  model2 mod01 r.squared adj.r.squared sigma statistic  p.value    df
  <dbl> <list<tibbl> <list>  <list> <dbl>     <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl>
1     2      [5 x 2] <dbl [~ <dbl ~    NA   NA             NA     NA      NA      NA          NA
2     4     [11 x 2] <lm>    <lm>      NA    0.648          0.609  2.82   16.6     0.00278     1
3     6      [7 x 2] <lm>    <dbl ~    NA    0.0106        -0.187  1.58    0.0537  0.826       1
4     8     [14 x 2] <lm>    <lm>      NA    0.270          0.209  2.28    4.44    0.0568      1
# ... with 6 more variables: logLik <dbl>, AIC <dbl>, BIC <dbl>, deviance <dbl>,
#   df.residual <int>, nobs <int>

或者甚至我们可以使用 safely 代替 possibly 以便我们的函数 returns 在这种情况下得到增强的输出 NA_real_:

safe_glance <- safely(glance, otherwise = NA_real_)

Model_fits %>%
  mutate(mod01 = map(model1, ~ safe_glance(.x)))

当某些元素为 NA 时从列表列访问结果

Accessing results from a list column when some elements are NA

r

try-catch

purrr

tidyverse

问题：列表列包含一些缺失值

数据

提取模型结果

尝试解决

预期结果

创建代码`Model_fits`

当某些元素为 NA 时从列表列访问结果

Accessing results from a list column when some elements are NA

r

try-catch

purrr

tidyverse

问题：列表列包含一些缺失值

数据

提取模型结果

尝试解决

预期结果

创建代码Model_fits

创建代码`Model_fits`