How do I remove entire strings if they contain a matched pattern in R

Suppose I have the following string:

vector <- "this is a string of text containing stuff. something.com thisthat@co.uk and other stuff with something.anything"

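I want to remove the substrings that contain @ or ., so I want to drop something.com, thisthat@co.uk and something.anything. I don't want to remove stuff. because its period only ends the sentence and is not followed by more characters. Ideally, I'd like to be able to do this with the %>% pipe.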

gsub(" ?\\w+[.@]\\S+", "", vector)

[1] "this is a string of text containing stuff. and other stuff with"
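Since the question asks for a %>% version, here is a minimal sketch of the same substitution inside a pipe. It assumes the magrittr package is installed (that is where %>% comes from), and the name `cleaned` is only illustrative. Because the string is the third argument of gsub, not the first, the `.` placeholder is used to say where the piped value should go.

```r
library(magrittr)  # assumption: magrittr is installed; it provides %>%

vector <- "this is a string of text containing stuff. something.com thisthat@co.uk and other stuff with something.anything"

# The "." placeholder marks where the left-hand side is passed,
# since gsub() takes the input string as its third argument.
cleaned <- vector %>% gsub(" ?\\w+[.@]\\S+", "", .)

cleaned
# [1] "this is a string of text containing stuff. and other stuff with"
```

Note the doubled backslashes: inside an R string literal, \\w and \\S are needed to hand \w and \S to the regex engine.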

An alternative to the (more terse/simple) gsub approach:

gre <- gregexpr("[^ ]+[.@][^ ]+", vector)
regmatches(vector, gre)
# [[1]]
# [1] "something.com"      "thisthat@co.uk"     "something.anything"
regmatches(vector, gre) <- ""
vector
# [1] "this is a string of text containing stuff.   and other stuff with "

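The benefit of this is that you can replace the matches with anything. Granted, here we are just replacing them with "", so it is a bit overkill, but if you need to change the values in some way (alter each substring), this is a more powerful mechanism.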
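To illustrate the "change each substring" case, here is a small sketch that upper-cases every match instead of deleting it. The choice of toupper is only an example transformation, and the names `x` and `m` are illustrative; any function that returns one replacement per match could be used in its place.

```r
x <- "this is a string of text containing stuff. something.com thisthat@co.uk and other stuff with something.anything"

# Locate every space-delimited token that has "." or "@" in its middle.
m <- gregexpr("[^ ]+[.@][^ ]+", x)

# Replace each match with a transformed copy of itself.
# regmatches(x, m) is a list with one character vector per input string,
# so lapply() keeps the shape that the replacement form expects.
regmatches(x, m) <- lapply(regmatches(x, m), toupper)

x
# [1] "this is a string of text containing stuff. SOMETHING.COM THISTHAT@CO.UK and other stuff with SOMETHING.ANYTHING"
```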