Don't understand "cannot coerce type 'closure'" error

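I can see this is a common problem, but I can't work out what to do from reading other posts or from trying to understand functional programming, which is new to me. Are functions closures in R, encapsulating the environment they were created in? My code is: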

# Remove numbers from text
minus_TextNum <- function(df, new.df){
  new.df <- mutate(df, text = gsub(x = text, pattern = "[0-9]+|\\(.*\\)", replacement = "")) %>%  # and/or whatever's in brackets
    unnest_tokens(input = text, output = word) %>% 
    filter(!word %in% c(stop_words$word, "patient")) %>% 
    group_by(id) %>% 
    summarise(text = paste(word, collapse = " "))
  return(new.df)
}

minus_TextNum(TidySymptoms)

The error is as follows:

Error: Problem with mutate() column text.
ℹ text = gsub(x = text, pattern = "[0-9]+|\\(.*\\)", replacement = "").
x cannot coerce type 'closure' to vector of type 'character'

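I don't understand what type 'closure' means. This is a simple function, and it works on a simple dataset I created for testing. The problem appears when I use a real-world dataset.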

Any feedback is appreciated. Reproducible example below:

# Remove numbers and/or anything in brackets

# Test Data
mydata <- data.frame(id = 1:8,
                     text = c("112773 Nissan Micra, Car, (10 pcs)",
                              "112774 Nissan Micra, Car, (10 pcs)",
                              "112775 Nissan Micra, Car, (10 pcs)",
                              "112776 Volkswagon Beetle, Car, (3 pcs)",
                              "112777 Toyota Corolla, Car, (12 pcs)",
                              "112778 Nissan Micra, Car, (10 pcs)",
                              "112779 Toyota Prius, Car, (9 pcs)",
                              "112780 Toyota Corolla, Car, (12 pcs)"),
                     stringsAsFactors = FALSE)

library(dplyr)
library(tidytext)

# remove numbers from text data
data(stop_words)
minus_TextNum <- function(df, new.df){
  new.df <- mutate(df, text = gsub(x = text, pattern = "[0-9]+|\\(.*\\)", replacement = "")) %>%  # and/or whatever's in brackets
    unnest_tokens(input = text, output = word) %>% 
    filter(!word %in% c(stop_words$word, "car")) %>% 
    group_by(id) %>% 
    summarise(text = paste(word, collapse = " "))
  return(new.df)
}


minus_TextNum(mydata)

dput(head(TidySymptoms, n = 10))
structure(list(word = c("epiglottis", "swelled", "hinder", "swallowing",
"pictures", "benadryl", "tylenol", "approximately", "30", "min"
)), row.names = c(NA, 10L), class = "data.frame")

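There is no id column in the TidySymptoms data. Assuming that is an oversight and your real data does have one, you can make the following changes to the function.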

  • There is no need to pass new.df to the function.
  • The column in TidySymptoms is called word, but you use text in the function.
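
As for where 'closure' comes from: when a data frame has no text column, the name text inside mutate() is looked up outside the data, and in a default R session it resolves to the plotting function graphics::text. Ordinary R functions have type "closure", and gsub() cannot turn one into a character vector. A minimal base R sketch of that lookup, assuming a standard session with no variable named text of your own (the names fn_type and err are only for illustration):

```r
# graphics is attached by default; loaded explicitly so the sketch is self-contained
library(graphics)

# With no variable or column called `text`, the bare name finds the function graphics::text
is_fn   <- is.function(text)
fn_type <- typeof(text)   # ordinary R functions report the type "closure"

# gsub() tries to coerce `x` to character, which fails for a function;
# capture the error message instead of stopping
err <- tryCatch(
  gsub(pattern = "[0-9]+", replacement = "", x = text),
  error = function(e) conditionMessage(e)
)

print(fn_type)
print(err)
```

So the message is not really about functional programming; it is a hint that a column name in the call does not exist in the data and an unrelated function of the same name was picked up instead.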

Try this code.

minus_TextNum <- function(df){

  # Clean the existing word column and store the result in a new text column
  df.new <- mutate(df, text = gsub(x = word, pattern = "[0-9]+|\\(.*\\)", replacement = "")) %>%  
    unnest_tokens(input = text, output = word) %>% 
    filter(!word %in% c(stop_words$word, "patient")) %>% 
    group_by(id) %>%  # assumes df has an id column
    summarise(text = paste(word, collapse = " "))
  return(df.new)
}

minus_TextNum(TidySymptoms)
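
Since the function depends on specific column names, it may help to check for them up front so a mismatch fails with a readable message rather than a coercion error. A rough sketch in base R; check_columns is a hypothetical helper, not part of dplyr or tidytext, and the small TidySymptoms data frame here only mimics the shape of the dput() output above:

```r
# Hypothetical helper: stop early if `df` lacks any of the required columns
check_columns <- function(df, required) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  invisible(df)
}

# Stand-in data shaped like the dput() output: only a `word` column, no `id`
TidySymptoms <- data.frame(word = c("epiglottis", "swelled", "hinder"))

# Capture the error message instead of stopping the script
msg <- tryCatch(
  check_columns(TidySymptoms, c("id", "word")),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Calling check_columns(df, c("id", "word")) as the first line of minus_TextNum would be one way to use it.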