Recode if string (with punctuation) contains certain text

How can I search a character vector and, if the string at a given index contains a pattern, replace the value at that index?

I tried this:

List <- c(1:8)
  Types<-as.character(c(
    "ABC, the (stuff).\n\n\n fun", "meaningful", "relevant", "rewarding", 
    "unpleasant", "enjoyable", "engaging", "disinteresting"))
  for (i in List) {
    if (grepl(Types[i], "fun", fixed = TRUE))
    {Types[i]="1"
    } else if (grepl(Types[i], "meaningful", fixed = TRUE))
    {Types[i]="2"}} 

The code works for "meaningful", but not when the string contains punctuation or other text, as with "fun".

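The first argument of grepl is the pattern, not the string to search. Your call grepl(Types[i], "fun", fixed = TRUE) therefore looks for the whole of Types[i] inside "fun"; it succeeds for "meaningful" only because there the pattern and the string are identical.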
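A quick check of the argument order, as a minimal sketch using the first string from the question (the variable name x is only for illustration):

```r
x <- "ABC, the (stuff).\n\n\n fun"

# pattern first, then the string to search: "fun" is found inside x
right_order <- grepl("fun", x, fixed = TRUE)     # TRUE

# arguments swapped: this asks whether all of x occurs inside "fun"
swapped <- grepl(x, "fun", fixed = TRUE)         # FALSE

# identical pattern and string match in either order, which hides the bug
same <- grepl("meaningful", "meaningful", fixed = TRUE)  # TRUE
```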

This would be a literal fix of your code:

for (i in seq_along(Types)) {
  if (grepl("fun", Types[i], fixed = TRUE)) {
    Types[i] = "1"
  } else if (grepl("meaningful", Types[i], fixed = TRUE)) {
    Types[i] = "2"
  }
}
Types
# [1] "1"              "2"              "relevant"       "rewarding"      "unpleasant"    
# [6] "enjoyable"      "engaging"       "disinteresting"

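BTW, using List is valid, but it is a bit redundant: when you keep a separate index variable like that, it can get out of sync with the vector it indexes. For instance, if you update Types and forget to update List, the loop will either fail or silently skip elements. That is why I used seq_along(Types) instead.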

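BTW: here is a slightly different version that drops the loop and introduces you to the power of vectorization. Note that, as written, it still overwrites Types; assign to a copy first if you want to keep the original vector: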

Types[grepl("fun", Types, fixed = TRUE)] <- "1"
Types[grepl("meaningful", Types, fixed = TRUE)] <- "2"
Types
# [1] "1"              "2"              "relevant"       "rewarding"      "unpleasant"    
# [6] "enjoyable"      "engaging"       "disinteresting"
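If the original vector should stay untouched, one possible variation is sketched below. It assumes you start from the original Types, computes every match before assigning anything, and writes into a copy (the names recoded, is_fun and is_meaningful are hypothetical, not from the question). Computing the matches first also means a later pattern is never tested against an already recoded value.

```r
Types <- c("ABC, the (stuff).\n\n\n fun", "meaningful", "relevant", "rewarding",
           "unpleasant", "enjoyable", "engaging", "disinteresting")

# test every pattern against the original strings
is_fun        <- grepl("fun", Types, fixed = TRUE)
is_meaningful <- grepl("meaningful", Types, fixed = TRUE)

# recode into a copy; Types itself is not modified
recoded <- Types
recoded[is_fun] <- "1"
# mirror the if / else if logic: "2" only where "fun" did not already match
recoded[is_meaningful & !is_fun] <- "2"
```

The `& !is_fun` condition is a design choice that keeps the "first match wins" behaviour of the loop; drop it if the last matching pattern should win instead.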

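The next level (perhaps overly complex?) would be to store the patterns and their recode replacements in a data frame and let Reduce do the work. The pairs are always 1-to-1, so you will never accidentally update one without the other, and the frame can be kept in a CSV if needed. This version returns a new vector instead of assigning into Types: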

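# one row per rule: the pattern to look for and the value to recode to
ptns <- data.frame(ptn = c("fun", "meaningful"), repl = c("1", "2"),
                   stringsAsFactors = FALSE)  # keep columns as character in R < 4.0
# apply each rule in turn to the running result, starting from Types
Reduce(function(txt, i) {
  txt[grepl(ptns$ptn[i], txt, fixed = TRUE)] <- ptns$repl[i]
  txt
}, seq_len(nrow(ptns)), init = Types)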
# [1] "1"              "2"              "relevant"       "rewarding"      "unpleasant"    
# [6] "enjoyable"      "engaging"       "disinteresting"

Try str_replace(string, pattern, replacement) from the stringr package.

You can use str_replace_all:

library(stringr)
pat <- c(fun = '1', meaningful = '2')
str_replace_all(Types, setNames(pat, sprintf('(?s).*%s.*', names(pat))))

[1] "1"              "2"              "relevant"      
[4] "rewarding"      "unpleasant"     "enjoyable"     
[7] "engaging"       "disinteresting"
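The (?s) prefix in that pattern matters for the first string, which contains newlines. The sketch below illustrates the effect with base R's sub and perl = TRUE as a dependency-free stand-in; it assumes, as is the case for PCRE and for the ICU engine behind stringr, that "." does not match a newline unless (?s) is set.

```r
x <- "ABC, the (stuff).\n\n\n fun"

# with (?s), "." also matches newlines, so the pattern spans the whole string
with_s <- sub("(?s).*fun.*", "1", x, perl = TRUE)
# with_s is "1"

# without (?s), ".*" stops at each newline, so only the last line is replaced
without_s <- sub(".*fun.*", "1", x, perl = TRUE)
# without_s still starts with "ABC, the (stuff)."
```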