Replacing patterns in a string
I have several strings in this format. The separator is a dash (-), and each "thing" between the dashes is a marker.
string <- "FA-I2-I2-I2-EX-I2-I3-FA-I1-I2-TR-I1-I2-FA-I3-I1-FAFANR-I3-I2-TR-I1-I2-I1-I2-FA-I2-I1-I3-FAQU-I1-I2-I2-I2-NR-I2-I2-NR-I1-I2-I1-NR-I3-QU-I2-I3-QUNR-I2-I1-NRQUQU-I2-I1-EX"
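I want to identify runs of consecutive markers that contain the letter "I" (that is, the markers I1, I2 and I3), and then replace each run with a version that has no separators. For example, the very beginning of the string should become: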
FA-I2I2I2-EX
So basically all I want to do is to remove all the dashes between markers containing "I".
Here is a somewhat convoluted solution:
string1 <- gsub(string, pattern = "I1", replacement = "ZI1Z")
string2 <- gsub(string1, pattern = "I2", replacement = "ZI2Z")
string3 <- gsub(string2, pattern = "I3", replacement = "ZI3Z")
string4 <- gsub(string3, pattern = "Z-Z", replacement = "")
string5 <- gsub(string4, pattern = "Z", replacement = "")
which gives:
"FA-I2I2I2-EX-I2I3-FA-I1I2-TR-I1I2-FA-I3I1-FAFANR-I3I2-TR-I1I2I1I2-FA-I2I1I3-FAQU-I1I2I2I2-NR-I2I2-NR-I1I2I1-NR-I3-QU-I2I3-QUNR-I2I1-NRQUQU-I2I1-EX"
Is there a more elegant way to do this?
So basically all I want to do is to remove all the dashes between markers containing "I".
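If your case is as simple as it sounds, you can use lookaround assertions. A lookbehind and a lookahead let the pattern match only a dash that sits directly between two "I" markers, without consuming the markers themselves.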
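# perl = TRUE is needed because lookarounds are a PCRE feature;
# the backslash is doubled because R string literals reject a bare \d
gsub('(?<=I\\d)-(?=I\\d)', '', string, perl = TRUE)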
# [1] "FA-I2I2I2-EX-I2I3-FA-I1I2-TR-I1I2-FA-I3I1-FAFANR-I3I2-TR-I1I2I1I2-FA-I2I1I3-FAQU-I1I2I2I2-NR-I2I2-NR-I1I2I1-NR-I3-QU-I2I3-QUNR-I2I1-NRQUQU-I2I1-EX"
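To see why lookarounds are used here instead of plain capture groups, here is a minimal sketch on a shortened input taken from the start of the question's string. The variable names (x, with_lookaround, with_groups, many, cleaned) are illustrative only and not part of the original post. Because lookarounds are zero-width, the marker after one dash is still available as context for the next dash; a capture-group pattern consumes both markers and so skips every other dash in a longer run.

```r
# Shortened input: the first five markers of the question's string.
x <- "FA-I2-I2-I2-EX"

# Lookaround version: only the dash is matched and removed.
with_lookaround <- gsub("(?<=I\\d)-(?=I\\d)", "", x, perl = TRUE)

# Capture-group version: each match consumes both neighbouring markers,
# so the second dash in a run of three markers is never matched.
with_groups <- gsub("(I\\d)-(I\\d)", "\\1\\2", x, perl = TRUE)

print(with_lookaround)  # "FA-I2I2I2-EX"
print(with_groups)      # "FA-I2I2-I2-EX"

# gsub() is vectorised, so several strings can be cleaned in one call.
many <- c("I1-I2-TR", "NR-I3-QU", "QU-I2-I3")
cleaned <- gsub("(?<=I\\d)-(?=I\\d)", "", many, perl = TRUE)
print(cleaned)  # "I1I2-TR" "NR-I3-QU" "QU-I2I3"
```

Since the question mentions having several strings in this format, the vectorised call at the end is probably the form you want in practice.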