grepl in multiple columns in R

I am trying to do a string search and replace across multiple columns in R. My code:

# Get columns of interest
selected_columns <- c(368,370,372,374,376,378,380,382,384,386,388,390,392,394)

# Perform grepl across multiple columns
df[,selected_columns][grepl('apples',df[,selected_columns],ignore.case = TRUE)] <- 'category1'

However, I get this error:

Error: undefined columns selected

Thanks in advance.

grep/grepl work on vectors/matrices, not on a data.frame/list. According to ?grep:

x - a character vector where matches are sought, or an object which can be coerced by as.character to a character vector.
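To illustrate what that coercion does, here is a small sketch on a made-up data frame (df_demo and its column names are hypothetical, not from the question). as.character() turns each column into a single string, so grepl() returns one logical per column rather than one per cell:

```r
# Hypothetical two-column data frame for illustration only
df_demo <- data.frame(fruit1 = c("Apples", "pear"),
                      fruit2 = c("kiwi", "plum"),
                      stringsAsFactors = FALSE)

# grepl() coerces the data frame with as.character(),
# which gives one deparsed string per column
hits <- grepl("apples", df_demo, ignore.case = TRUE)

length(hits)  # 2, one value per column, not one per cell
hits          # TRUE FALSE
```

A per-column result like this cannot be used as a cell-by-cell mask, which is why the columns need to be processed one at a time.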

We can loop over the columns (lapply) and replace the values based on the match:

df[, selected_columns] <- lapply(df[, selected_columns],
     function(x) replace(x, grepl('apples', x, ignore.case = TRUE), 'category1'))
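As a quick check of the lapply approach, here is a minimal sketch on a toy data frame (df_demo, its columns, and the positions in cols are made up for illustration):

```r
# Hypothetical data; columns 2 and 3 stand in for selected_columns
df_demo <- data.frame(id = 1:3,
                      a = c("Green Apples", "pear", NA),
                      b = c("kiwi", "APPLES", "plum"),
                      stringsAsFactors = FALSE)
cols <- c(2, 3)

# Replace the whole cell wherever it contains "apples" (case-insensitive)
df_demo[, cols] <- lapply(df_demo[, cols], function(x)
  replace(x, grepl("apples", x, ignore.case = TRUE), "category1"))

df_demo$a  # "category1" "pear" NA
df_demo$b  # "kiwi" "category1" "plum"
```

Note that grepl() returns FALSE for NA, so missing values are left untouched. If only one column is selected, df_demo[, cols] returns a vector rather than a data frame, so df_demo[cols] is probably the safer form in that case.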

Or with dplyr:

library(dplyr)
library(stringr)
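df %>%
     # regex(ignore_case = TRUE) mirrors ignore.case = TRUE in the grepl() version
     mutate_at(selected_columns, ~ replace(., str_detect(., regex('apples', ignore_case = TRUE)), 'category1'))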

Assuming you want to match part of a cell and replace only the matched text, you can use rapply() with gsub() to replace "apples" with "category1" inside each cell:

df[selected_columns] <- rapply(df[selected_columns], function(x) gsub("apples", "category1", x), how = "replace")
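To show how this differs from replacing the whole cell, here is a small sketch on made-up data (df_demo and cols are hypothetical). It assumes the selected columns are character columns, since gsub() converts other types to character:

```r
# Hypothetical data; column 2 stands in for selected_columns
df_demo <- data.frame(id = 1:2,
                      a = c("green apples", "pear"),
                      stringsAsFactors = FALSE)
cols <- 2

# gsub() swaps only the matched text, so the rest of the cell is kept
df_demo[cols] <- rapply(df_demo[cols],
                        function(x) gsub("apples", "category1", x),
                        how = "replace")

df_demo$a  # "green category1" "pear"
```

With replace() and grepl(), the first cell would instead become just "category1".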

Keep in mind the difference between grepl()/gsub() (with or without word boundaries in the regular expression) and %in%/match() when searching for strings.
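A minimal sketch of that difference, using a made-up character vector x:

```r
# Hypothetical values for illustration
x <- c("apples", "pineapples", "green apples", "Apples")

# Partial match anywhere in the string (case-sensitive by default)
partial <- grepl("apples", x)          # TRUE TRUE TRUE FALSE

# Word boundaries exclude "pineapples"
bounded <- grepl("\\bapples\\b", x)    # TRUE FALSE TRUE FALSE

# %in% matches only whole, identical strings
exact <- x %in% "apples"               # TRUE FALSE FALSE FALSE

# match() returns the position of the first exact match
first_pos <- match("apples", x)        # 1
```

Which one fits depends on whether a cell should count as a hit when "apples" is only part of its content.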