How to solve Error: Selections can't have missing values

I wrote some code that works fine on its own, but it stops working once I wrap it in a function.

My example data is as follows:

set.seed(34)
children <- data.frame(
  ID = 1:100,
  gender = as.integer(sample(c(1,2),100,replace = TRUE)),
  height = ifelse(children$gender=="1", sample(120:140), sample(110:130)),
  weight = ifelse(children$gender=="1", sample(25:35), sample(15:25)),
  ave_sleep = ifelse(children$gender=="1" & children$height > 130, sample(7:9),
                     ifelse(children$gender=="1" & children$height <= 130, sample(4:6),
                            ifelse(children$gender=="2" & children$height > 120, sample(7:9), sample(4:6)))))
childrenNA <- bind_cols(children[1],missForest::prodNA(children[-1],noNA=0.1))

The code below works as expected.

childrenNA %>%
  gather(-gender, key="key", value="val") %>%
  mutate(missing=is.na(val)) %>%
  mutate(gender=coalesce(gender, 0)) %>%
  filter(missing==TRUE) %>%
  group_by(gender, key, missing) %>%
  ggplot() +
  stat_count(aes(y=key)) +
  facet_wrap(~gender) +
  labs(x='no_missing_values', y="variable") +
  coord_flip()

However, when I put the same code inside a function, I get "Error: Selections can't have missing values". Here is the function I wrote:

miss_group <- function(df, facet) {
  df %>%
    gather(-facet, key="key", value="val") %>%
    mutate(missing=is.na(val)) %>%
    mutate(facet=coalesce(facet, 0)) %>%
    filter(missing==TRUE) %>%
    group_by(facet, key, missing) %>%
    ggplot() +
    stat_count(aes(y=key)) +
    facet_wrap(~facet) +
    labs(x='no_missing_values', y="variable") +
    coord_flip()
}

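Can you help me fix this error?

Your data-generation code does not run as posted: columns such as children$gender cannot be evaluated inside the data.frame() call that is still creating the data frame (and those columns). I updated your code to make it reproducible: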

#packages
library(tidyr)
library(dplyr)
library(ggplot2)

#make data set
set.seed(34)
children <- data.frame(ID = 1:100, gender = as.integer(sample(c(1,2),100, replace = TRUE)),
                        height = NA, weight = NA, ave_sleep = NA)
children$height <- ifelse(children$gender==1, sample(120:140), sample(110:130))
children$weight <- ifelse(children$gender==1, sample(25:35), sample(15:25))
children$ave_sleep <- ifelse(children$gender==1 & children$height > 130, sample(7:9),
                             ifelse(children$gender==1 & children$height <= 130, sample(4:6),
                             ifelse(children$gender==2 & children$height > 120, sample(7:9), sample(4:6))))
childrenNA <- bind_cols(children[1],missForest::prodNA(children[-1],noNA=0.1))

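I could not reproduce your exact error message. I think the problem lies in how you pass the facet argument to your function and how that argument is then used inside it. I assume you want to pass a bare variable name to the function, e.g. miss_group(df, gender). Inside the function, however, that name has to be used to refer to the corresponding column of df. One way to do this is to capture the argument with enquo() and unquote it with !!. I am not sure this is the best way to do it, but it does produce the plot you want.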

#define the plotting function
miss_group <- function(df, facetname) {
  #capture the bare column name as a quosure
  facet <- enquo(facetname)
  df %>%
    #reshape every column except the facet column to long format
    gather(-!!facet, key="key", value="val") %>%
    mutate(missing=is.na(val)) %>%
    #new column "facet": the facet column with NA replaced by 0
    mutate(facet=coalesce(!!facet, 0)) %>%
    filter(missing==TRUE) %>%
    group_by(!!facet, key, missing) %>%
    ggplot() +
    stat_count(aes(y=key)) +
    facet_wrap(~facet) +
    labs(x='no_missing_values', y="variable") +
    coord_flip()
}

#call the function with the bare column name
miss_group(childrenNA, gender)
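
As a possible variation on the enquo()/!! approach, here is a minimal sketch that uses the {{ }} ("embrace") operator and pivot_longer() instead, assuming rlang >= 0.4.0, a dplyr version that supports {{ }}, and tidyr >= 1.0.0. The helper count_missing_by() and the toy data frame are hypothetical names made up for illustration, not part of the original question or answer. It returns the counts as a table rather than a plot, which makes it easier to check that the column is being passed through correctly.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (illustration only): count missing values per variable
# within each level of a grouping column given as a bare name.
count_missing_by <- function(df, group_col) {
  df %>%
    # copy the grouping column into "group", replacing NA with 0
    mutate(group = coalesce({{ group_col }}, 0L)) %>%
    # drop the original grouping column so it is not reshaped
    select(-{{ group_col }}) %>%
    # long format: one row per (group, variable, value)
    pivot_longer(-group, names_to = "key", values_to = "val") %>%
    # keep only the missing cells and count them
    filter(is.na(val)) %>%
    count(group, key, name = "n_missing")
}

# Small toy data set (assumed for illustration; all columns are integer)
toy <- data.frame(
  gender = c(1L, 2L, NA, 1L),
  height = c(120L, NA, 115L, NA),
  weight = c(30L, 20L, NA, NA)
)

res <- count_missing_by(toy, gender)
print(res)
```

The same function should also accept the data from the answer, e.g. count_missing_by(childrenNA, gender), since all of its columns are integer. The resulting table could then be passed to ggplot() with facet_wrap(~group) if a plot is needed.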