gsubfn | Replace text using variables in Substitution
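I am trying to strip out the text that surrounds the content I want to keep. I want to hold that surrounding text in variables, because it can be long. Here is an example of what I am trying to do. [The text is not removed.]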
Text<-'This is an example text [] test'
topheader<-'This'
bottomheader<-'test'
gsubfn(".", list(topheader = "", bottomheader = ""), Text)
[1] "This is an example text [] test"
Goal: "is an example text []"
I think this is one solution you are looking for:
# Your data:
Text<-'This is an example text [] test'
topheader<-'This'
bottomheader<-'test'
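# A possible solution function
# (note: this name masks gsubfn::gsubfn if the gsubfn package is attached)
gsubfn <- function(text, th, bh, th.replace = "", bh.replace = "") {
  # Capture everything between the two headers and keep only that group
  answer <- gsub(text,
                 pattern = paste0(th, " (.*) ", bh),
                 replacement = paste0(th.replace, "\\1", bh.replace)
  )
  return(answer)
}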
# Your req'd answer
gsubfn(text=Text,th=topheader,bh=bottomheader)
# Another example
gsubfn(text=Text,th=topheader,bh=bottomheader,th.replace="@@@ ",bh.replace=" ###")
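The pattern above is built by pasting the headers straight into a regular expression, so a header containing characters such as `[` or `.` would be read as regex syntax. A minimal sketch of one way around that, assuming a hypothetical helper `escape_regex` (not part of the answer) that backslash-escapes common metacharacters before the pattern is assembled:

```r
# Hypothetical helper: backslash-escape common regex metacharacters
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

# Illustrative data with headers that contain regex metacharacters
Text2 <- "[start] keep (this) [end]"
th <- "[start]"
bh <- "[end]"

# Build the pattern from the escaped headers, anchored at both ends
pat <- paste0("^", escape_regex(th), " (.*) ", escape_regex(bh), "$")
result <- sub(pat, "\\1", Text2)
result
```

Without the escaping step, `[start]` would be interpreted as a character class rather than as literal text.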
You can collapse the search terms into a single regular-expression string.
Test <- 'This is an example text testing [] test'
top <- "This"
bottom <- "test"
arg <- c(top, bottom)
arg <- paste(arg, collapse="|")
# Wrap each word in \b...\b; a literal backslash in the replacement must be doubled
arg <- gsub("(\\w+)", "\\\\b\\1\\\\b", arg)
Test.c <- gsub(arg, "", Test)
Test.c <- gsub("[ ]+", " ", Test.c)
Test.c <- gsub("^[[:space:]]|[[:space:]]$", "", Test.c)
Test.c
# "is an example text testing []"
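To see what the collapsed pattern looks like and why the `\b` word boundaries matter, here is a small base R sketch. The variable names `pattern`, `without_boundary`, and `with_boundary` are illustrative and do not come from the answer.

```r
# Illustrative sketch; variable names are assumptions, not from the answer
Test <- "This is an example text testing [] test"
words <- c("This", "test")

# Wrap each word in \b...\b and join the alternatives with "|"
pattern <- paste0("\\b", words, "\\b", collapse = "|")

# Without boundaries, "test" is also removed from inside "testing"
without_boundary <- gsub("This|test", "", Test)

# With boundaries, only the standalone words are removed
with_boundary <- gsub(pattern, "", Test)

pattern
without_boundary
with_boundary
```

Both results still carry stray leading and trailing spaces, which is why the answer follows up with the whitespace clean-up steps.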
Or use a magrittr pipe:
library(magrittr)
c(top, bottom) %>%
  paste(collapse = "|") %>%
  gsub("(\\w+)", "\\\\b\\1\\\\b", .) %>%
  gsub(., "", Test) %>%
  gsub("[ ]+", " ", .) %>%
  gsub("^[[:space:]]|[[:space:]]$", "", .) -> Test.c
Test.c
# "is an example text testing []"
Or use a loop:
Test.c <- Test
words <- c(top, bottom)
for (i in words) {
  # Update Test.c (not Test) so that each removal builds on the previous one
  Test.c <- gsub(paste0("\\b", i, "\\b"), "", Test.c)
}
Test.c <- gsub("[ ]+", " ", Test.c)
Test.c <- gsub("^[[:space:]]|[[:space:]]$", "", Test.c)
Test.c
# "is an example text testing []"
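1) gsubfn There are a few problems here:
The regular expression in gsubfn (and gsub) must match the string you want to process, but a dot matches only a single character, so it can never match This or test, which are 4-character strings. Use "\\w+" instead.
In list(a = x), a must be a constant, not a variable. Write out the names explicitly, or use setNames if they are held in variables.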
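The point about list names can be checked directly in base R. The sketch below only inspects the names of the two lists; `by_literal` and `by_value` are illustrative names, not part of the answer.

```r
topheader <- "This"
bottomheader <- "test"

# The names are taken literally from the code, not from the variables' values
by_literal <- list(topheader = "", bottomheader = "")
names(by_literal)

# setNames assigns names from the values held in the variables
by_value <- setNames(list("", ""), c(topheader, bottomheader))
names(by_value)
```

The first list is keyed by the words "topheader" and "bottomheader", which never occur in the text, so nothing in it can be looked up by a matched word.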
To fix the code in the question accordingly:
library(gsubfn)
trimws(gsubfn("\\w+", list(This = "", test = ""), Text))
## [1] "is an example text []"
Or, in terms of the header variables:
L <- setNames(list("", ""), c(topheader, bottomheader))
trimws(gsubfn("\\w+", L, Text))
## [1] "is an example text []"
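Note that this replaces every occurrence of topheader and bottomheader, not just the ones at the start and end; however, it seems to be the closest to your code and is likely to be sufficient.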
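The difference between removing every occurrence and removing only the outer headers shows up when a header word also appears in the middle of the text. The sketch below uses base R `gsub` with word boundaries as a stand-in for the word-by-word lookup (an assumption made so the example runs without the gsubfn package), and an anchored `sub` for comparison; the sample string is made up for illustration.

```r
# Made-up sample in which "test" also appears in the middle
x <- "This is a test of the test"

# Word-by-word removal drops every standalone "This" and "test"
all_removed <- trimws(gsub("\\b(This|test)\\b", "", x))

# An anchored pattern removes only the leading and trailing headers
ends_only <- sub("^This (.*) test$", "\\1", x)

all_removed
ends_only
```

Here `all_removed` loses the inner "test" as well (leaving a doubled space behind), while `ends_only` keeps it.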
2) sub Another possibility is a simple sub:
sub("^This (.*) test$", "\\1", Text)
## [1] "is an example text []"
Or, in terms of the header variables:
pat <- sprintf("^%s (.*) %s$", topheader, bottomheader)
sub(pat, "\\1", Text)
## [1] "is an example text []"
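One thing to keep in mind with the anchored `sub` approach is that `sub` returns its input unchanged when the pattern does not match, so a mistyped header fails silently. A minimal sketch of a guard, assuming a hypothetical wrapper `strip_headers` (not from the answer) that checks the match with `grepl` first:

```r
# Hypothetical helper: strip a leading and trailing header, but fail loudly
# when the headers are not found instead of returning x unchanged
strip_headers <- function(x, top, bottom) {
  pat <- sprintf("^%s (.*) %s$", top, bottom)
  if (!grepl(pat, x)) {
    stop("headers not found at the start and end of the text")
  }
  sub(pat, "\\1", x)
}

Text <- "This is an example text [] test"

# No match: sub hands the original string back untouched
unchanged <- sub("^Intro (.*) test$", "\\1", Text)

# Match: only the part between the headers is kept
stripped <- strip_headers(Text, "This", "test")
stripped
```

Whether an error, a warning, or a silent pass-through is the right behavior depends on how the result is used downstream.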
Update: Fixed (1).