Replace duplicate characters in strings
I am trying to remove duplicate characters from strings.
dput(test)
c("APAAAAAAAAAAAPAAPPAPAPAAAAAAAAAAAAAAAAAAAAAAAAPPAPAAAAAAPPAPAAAPAPAAAAP",
"AAA", "P", "P", "A", "P", "P", "APPPPPA", "A", "P", "AA", "PP",
"PPA", "P", "P", "A", "P", "APAP", "P", "PA")
I created a function to sort the characters within each string:
strSort <- function(x)
sapply(lapply(strsplit(x, NULL), sort), paste, collapse="")
Then I used gsub to remove consecutive repeated characters:
gsub("(.)\\1{2,}", "\\1", strSort(test))
This outputs:
gsub("(.)\\1{2,}", "\\1", strSort(test))
[1] "AP" "A" "P" "P" "A" "P" "P" "AAP" "A" "P" "AA" "PP" "APP" "P" "P" "A" "P" "AAPP" "P" "AP"
Each output string should contain only one A and/or one P.
On the strsplit output, we need to apply unique to the sorted elements:
sapply(strsplit(test, ""), function(x)
paste(unique(sort(x)), collapse=""))
#[1] "AP" "A" "P" "P" "A" "P" "P" "AP" "A" "P" "A" "P" "AP" "P" "P" "A" "P" "AP" "P" "AP"
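For reference, the gsub attempt in the question seems to leave "AA" and "PP" behind because the quantifier {2,} only matches a character followed by at least two more copies of itself, that is, runs of three or more. A minimal sketch of the same idea with + instead, assuming the characters are sorted first so that identical ones are adjacent (the vector x is only an illustrative subset of test):

```r
# Sort the characters of each string so duplicates become adjacent
strSort <- function(x)
  sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")

# Illustrative input, not the full test vector
x <- c("APPPPPA", "AA", "PP", "PPA", "APAP")

# "(.)\\1+" matches a character followed by one or more copies of itself,
# so every run of two or more collapses to a single character
res <- gsub("(.)\\1+", "\\1", strSort(x))
res
```

With this change, runs of length two are collapsed as well, which the {2,} version skips.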
Here is another option using utf8ToInt + intToUtf8:
> sapply(test, function(x) intToUtf8(sort(unique(utf8ToInt(x)))), USE.NAMES = FALSE)
[1] "AP" "A" "P" "P" "A" "P" "P" "AP" "A" "P" "A" "P" "AP" "P" "P"
[16] "A" "P" "AP" "P" "AP"
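To make the steps of this option easier to follow, here is a small sketch that applies them to a single string one at a time. The variable names x, codes, uniq and res are only for illustration:

```r
x <- "APPPPPA"

# Convert the string to its integer code points ("A" is 65, "P" is 80)
codes <- utf8ToInt(x)

# Drop repeated code points, then put them in ascending order
uniq <- sort(unique(codes))

# Convert the remaining code points back into one string
res <- intToUtf8(uniq)
res
```

Because the sort happens on code points, the result is ordered by code point rather than by locale collation.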
Using a regular expression, you can do:
gsub('(?:(.)(?=(.*)\\1))', '', test, perl = TRUE)
#[1] "AP" "A" "P" "P" "A" "P" "P" "PA" "A" "P" "A" "P" "PA"
#[14] "P" "P" "A" "P" "AP" "P" "PA"
The regular expression is taken from another source.
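Note that the lookahead removes any character that appears again later in the string, so the last occurrence of each character is the one that survives. That is why some results read "PA" rather than "AP". If the order of first appearance is wanted instead, one possible sketch is to use unique without sort (x is an illustrative subset of test, and last_kept and first_kept are hypothetical names):

```r
# Illustrative input, not the full test vector
x <- c("APPPPPA", "PPA", "APAP", "PA")

# Lookahead version: delete a character whenever the same character
# occurs again later, so the last occurrence of each one is kept
last_kept <- gsub("(.)(?=.*\\1)", "", x, perl = TRUE)

# unique() without sort(): keep the first occurrence of each character
first_kept <- sapply(strsplit(x, ""),
                     function(s) paste(unique(s), collapse = ""))

last_kept
first_kept
```

The two results differ only for strings such as "APPPPPA", where the first and last occurrences of the characters appear in a different order.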