How to remove everything before a set of certain characters (e.g., "? - ")?

Question

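I want to remove everything before "? - ", including the characters themselves (question mark, space, hyphen, space). I know how to remove everything before a single specific character, but I don't want to remove everything before the hyphen -, because some of my statements contain other hyphens. I tried the following, but it does not work for hyphenated words.

Example: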

gsub(".*-", "", "To what extent do you disagree or agree with the following statements? - Statistics make me cry.")
gsub(".*-", "", "To what extent do you disagree or agree with the following statements? - T-tests are easy to interpret.")

Output:
" Statistics make me cry."
"tests are easy to interpret."

I want the second statement to come out as "T-tests are easy to interpret."

Answer 1

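sub is sufficient here; the global gsub is not needed because only one replacement is made per string. Change the pattern to match ? (a regex metacharacter, so it is escaped, which inside an R string is written \\?), followed by zero or more whitespace characters (\\s*), then -, then zero or more whitespace characters again, and replace the match with an empty string ("").

sub(".*\\?\\s*-\\s*", "", v1)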
#[1] "Statistics make me cry."        "T-tests are easy to interpret."

Data

v1 <- c("To what extent do you disagree or agree with the following statements? - Statistics make me cry.", 
"To what extent do you disagree or agree with the following statements? - T-tests are easy to interpret."
)
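
Because .* is greedy, the pattern above cuts at the last "? - " in each string. If a statement could itself contain "? - ", a lazy quantifier cuts at the first occurrence instead. The following is only a minimal sketch of that difference; the vector x is made up for illustration and is not part of the original data.

```r
# Hypothetical input: the second element contains "? - " twice.
x <- c("Do you agree? - T-tests are easy to interpret.",
       "Do you agree? - Really? - Yes.")

# Greedy .* matches as much as possible, so everything up to the
# LAST "? - " is removed.
greedy <- sub(".*\\?\\s*-\\s*", "", x)

# Lazy .*? matches as little as possible, so only the text up to the
# FIRST "? - " is removed. perl = TRUE selects the PCRE engine.
lazy <- sub("^.*?\\?\\s*-\\s*", "", x, perl = TRUE)

print(greedy)
print(lazy)
```

For the two statements in the question both variants give the same result, since "? - " occurs only once in each.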

Answer 2

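You could try the stringr package and use str_split. It returns the result as a list, so you have to extract the second element of the first list entry. Alternatively, start from a vector of inputs.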

library(stringr)

str1 <- "To what extent do you disagree or agree with the following statements? - Statistics make me cry."
str2 <- "To what extent do you disagree or agree with the following statements? - T-tests are easy to interpret."

str_split(str1, pattern = "\\? - ")[[1]][2]

[1] "Statistics make me cry."

str_split(str2, pattern = "\\? - ")[[1]][2]

[1] "T-tests are easy to interpret."
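
The same split-and-extract idea can be sketched in base R for a whole vector at once. This is an illustrative alternative, not part of the answer above: strsplit with fixed = TRUE treats "? - " literally, so nothing needs escaping, and vapply pulls the second piece out of each list element. The names statements, pieces and second_piece are made up for this example.

```r
statements <- c(
  "To what extent do you disagree or agree with the following statements? - Statistics make me cry.",
  "To what extent do you disagree or agree with the following statements? - T-tests are easy to interpret."
)

# fixed = TRUE matches "? - " as a literal string, not as a regex.
pieces <- strsplit(statements, "? - ", fixed = TRUE)

# strsplit returns a list with one character vector per input string;
# take the second piece of each (NA if the separator was not found).
second_piece <- vapply(pieces, function(p) p[2], character(1))

print(second_piece)
```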

Answer 3

If you want to use stringr from the tidyverse:

library(stringr)
str <- c("To what extent do you disagree or agree with the following statements? - Statistics make me cry.",
         "To what extent do you disagree or agree with the following statements? - T-tests are easy to interpret.")
str_split(str, ' - ', simplify = TRUE)[, 2]
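
One caveat with splitting on ' - ' alone is that a statement which itself contains " - " would be split into more than two pieces. A hedged sketch of two ways around that, assuming stringr is installed: cap the split at two pieces with n = 2, or remove the prefix directly with str_remove. The vector x is invented for illustration.

```r
library(stringr)

# Hypothetical input: the second statement contains " - " itself.
x <- c("Do you agree? - T-tests are easy to interpret.",
       "Do you agree? - Statistics - the hard kind - make me cry.")

# fixed("? - ") matches the separator literally; n = 2 caps the result
# at two pieces, so the second column keeps the rest of the string intact.
after_split <- str_split(x, fixed("? - "), n = 2, simplify = TRUE)[, 2]

# str_remove() deletes the first regex match: the lazy .*? followed by
# the escaped "? - " covers everything up to and including the separator.
after_remove <- str_remove(x, "^.*?\\? - ")

print(after_split)
print(after_remove)
```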
