gsub: replace regex match with regex replacement string

Question

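I need to replace every run of more than 2 consecutive 1s with an equal number of zeros. Currently I can find the matches as shown below, but I don't know how to replace each match with exactly as many zeros as it has 1s.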

ind<-c(1,1,0,0,0,1,1,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,0)
gsub("([1])\\1\\1+", "0", paste0(ind, collapse = ""))

gives

"11000001100011010010101000101000"

because it replaces each match with a single 0, but I need

"11000000001100011010010101000000010100000"

Answer 1

You can use the following gsub replacement:

ind<-c(1,1,0,0,0,1,1,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,0)
gsub("1(?=1{2,})|(?!^)\\G1", "0", paste(ind, collapse = ""), perl = TRUE)

See the IDEONE demo; the result is [1] "11000000001100011010010101000000010100000".

The regex needs perl = TRUE because it uses a lookahead and the \G operator.

The regex matches:

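1 - a literal 1, if...
(?=1{2,}) - it is followed by 2 or more 1s, or...
(?!^)\G1 - any 1 that immediately follows the previous match (the (?!^) lookahead keeps \G from matching at the very start of the string).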
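A minimal sketch of how the two alternatives cooperate, using a short made-up input (the string `s` is an assumption for illustration, not taken from the question). With the lookahead alternative alone, the last two 1s of a long run survive because fewer than two 1s follow them; adding the \G alternative picks those up:

```r
# Hypothetical short input: a run of two 1s, a 0, then a run of four 1s.
s <- "1101111"

# First alternative only: a 1 is replaced when at least two more 1s follow,
# so the last two 1s of the long run are left untouched.
lookahead_only <- gsub("1(?=1{2,})", "0", s, perl = TRUE)

# Both alternatives: \G anchors at the end of the previous match, so the
# remaining 1s of the same run are replaced as well.
full <- gsub("1(?=1{2,})|(?!^)\\G1", "0", s, perl = TRUE)

print(lookahead_only)  # "1100011"
print(full)            # "1100000"
```

The leading run of two 1s is never touched: the lookahead fails there, and \G cannot help because no match ends immediately before it.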

For more details on the \G operator, see "What good is \G in a regular expression?" at perldoc.perl.org and the SO post "When is \G useful application in a regex?".

Answer 2

A solution that uses rle instead of a regex:

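# Run-length encode ind: x$lengths holds the run sizes, x$values the repeated value
x <- rle(ind)
# Turn every run of more than two 1s into a run of 0s of the same length
x$values[x$lengths > 2 & x$values == 1] <- 0
# Expand the runs back into a full-length vector
inverse.rle(x)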

#[1] 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
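To make the run-length step concrete, here is a small sketch that wraps the same rle/inverse.rle idea in a function and applies it to a short vector. The name `zero_long_runs`, the `min_run` parameter, and the sample vector `v` are hypothetical, introduced only for illustration:

```r
# Hypothetical helper: zero out every run of 1s that is at least `min_run` long.
zero_long_runs <- function(v, min_run = 3) {
  runs <- rle(v)                                   # lengths and values of each run
  long_ones <- runs$lengths >= min_run & runs$values == 1
  runs$values[long_ones] <- 0                      # run lengths are left unchanged
  inverse.rle(runs)                                # expand back to a full vector
}

v <- c(1, 1, 0, 1, 1, 1, 0, 1)
r <- rle(v)
print(r$lengths)          # lengths: 2 1 3 1 1
print(r$values)           # values:  1 0 1 0 1
print(zero_long_runs(v))  # result:  1 1 0 0 0 0 0 1
```

Because only the values of the encoded runs change and the lengths do not, the output always has the same length as the input, which is exactly the "equal number of zeros" requirement.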

