In R, gsub & regex lookahead or lookbehind expression to remove everything BEFORE a string pattern?
In R, I have a data frame with one column, in which every row contains repeated text matching a specific pattern that I would like to remove:
x <- c("DOI: 10.5256/f1000research.6541.r7660 The revised article answers most of my remarks and questions in a ... Continue reading The revised article answers most of my remarks and questions in a satisfactory way.",
"DOI: 10.5256/f1000research.6601.r7701 The revision ... Continue reading The revision is approved I have read this",
"DOI: 10.5256/f1000research.6599.r7859 I have read the revised article by Horrell and D'Orazio. They have responded appropriately to ... Continue reading I have read the revised article by Horrell and D'Orazio. They have responded appropriately to the concerns/questions raised")
What function can I use to remove everything before, and including, "... Continue reading" or "Continue reading"?
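Use sub.
To remove everything up to and including Continue reading:
sub(".*Continue reading", "", x)
To remove everything before Continue reading but keep the marker itself (backslashes must be doubled inside an R string literal, and the lookahead requires perl=TRUE):
sub(".*(?=\\bContinue reading)", "", x, perl=TRUE)
or, with a capture group instead of a lookahead:
sub(".*\\b(Continue reading)", "\\1", x)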
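As a quick check of the three calls above, here is a minimal sketch on a shortened input. The string `y` and the result names are made up for illustration and are not part of the question's data; the first pattern also adds `\\s*` so the space after the marker is dropped, which is a small variation on the answer rather than the answer itself.

```r
# Hypothetical short input standing in for one of the longer reviews.
y <- "DOI: 10.1/abc The revision ... Continue reading The revision is approved"

# Remove everything up to and including the marker, plus trailing whitespace.
removed <- sub(".*Continue reading\\s*", "", y)

# Keep the marker: the lookahead matches without consuming it (PCRE only).
kept_lookahead <- sub(".*(?=\\bContinue reading)", "", y, perl = TRUE)

# Keep the marker: capture it and put it back with the \\1 backreference.
kept_group <- sub(".*\\b(Continue reading)", "\\1", y)

print(removed)         # "The revision is approved"
print(kept_lookahead)  # "Continue reading The revision is approved"
print(kept_group)      # "Continue reading The revision is approved"
```

Because `.*` is greedy, each pattern anchors on the last occurrence of the marker in the string, which is harmless here since the marker appears once per row.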
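This should remove everything before Continue reading, including the preceding "...":
sub('.*\\.{3}\\s*(Continue reading.*)$', '\\1', x)
If you only need to remove the characters before "... Continue reading", keeping the ellipsis:
sub('.*(\\.{3}\\s*Continue reading.*)$', '\\1', x)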
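Since the question mentions a data frame rather than a bare vector, the same idea can be applied to a column. This is only a sketch under assumptions: the data frame `reviews` and its columns `text` and `clean` are hypothetical names, and the pattern is a variation that drops the ellipsis, the marker, and the whitespace after it in one pass.

```r
# Hypothetical data frame; the column name `text` is an assumption.
reviews <- data.frame(
  text = c("DOI: 10.1/a First ... Continue reading First review in full",
           "DOI: 10.1/b No teaser in this row"),
  stringsAsFactors = FALSE
)

# sub() is vectorised over the column. Rows where the pattern does not
# match are returned unchanged, so rows without a teaser keep their text.
reviews$clean <- sub(".*\\.{3}\\s*Continue reading\\s*", "", reviews$text)

print(reviews$clean)
```

With this input the first row becomes "First review in full" and the second row is left as it was.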