Improving the readability of automated text generation based on a database query

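I'm trying to improve the readability of automated text generation based on a database query.

Is there a neat way to perform these substitutions in one command instead of six?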

x <- c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")
out <- c("Test", "Test", "Test", "Test", "Test,", "Test, ", "Test,")  # desired result

x <- gsub(pattern = "( ", replacement = "(", x, fixed = TRUE)
x <- gsub(pattern = " )", replacement = ")", x, fixed = TRUE)
x <- gsub(pattern = " ,", replacement = ",", x, fixed = TRUE)
x <- gsub(pattern = "()", replacement = "", x, fixed = TRUE)
x <- gsub(pattern = ",,", replacement = ",", x, fixed = TRUE)
x <- gsub(pattern = " ,", replacement = ",", x, fixed = TRUE)

You can use mgsub::mgsub:

a <- c("( ", " )", " ,", "()", ",,")  # patterns
b <- c("(", ")", ",", "", ",")        # replacements
x <- c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")

mgsub::mgsub(x, a, b, fixed = TRUE)
#[1] "Te()st"  "Test"    "Test "   "Test ()" "Test,,"  "Test, "  "Test, " 

You may want to add further patterns to get the desired output.

You can use

x<-c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")
gsub("\\(\\s*\\)|\\s+(?=[,)])|(?<=\\()\\s+|(,),+", "\\1", x, perl=TRUE)
# => [1] "Test"   "Test"   "Test "  "Test "  "Test,"  "Test, " "Test, "

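See the R demo online and the regex demo. Details:

  • \(\s*\)| - a (, zero or more whitespace characters, then a ), or
  • \s+(?=[,)])| - one or more whitespace characters that are followed by a , or a ), or
  • (?<=\()\s+| - one or more whitespace characters immediately preceded by a ( character, or
  • (,),+ - Group 1 captures a comma, which is then followed by one or more further commas.

The replacement is the Group 1 value: a single comma if Group 1 matched, otherwise an empty string. Inside an R string literal every backslash of the regex has to be doubled, which is why the call above uses \\( and \\s.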
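
To make the structure of that alternation easier to follow, here is a small sketch that assembles the same pattern from its four parts and applies it. The names in `parts` are only illustrative labels chosen for this example; they are not part of any API.

```r
x <- c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")

# The four alternatives, written as R string literals (backslashes doubled).
parts <- c(
  empty_parens   = "\\(\\s*\\)",     # "(", optional whitespace, ")"
  space_before   = "\\s+(?=[,)])",   # whitespace followed by "," or ")"
  space_after    = "(?<=\\()\\s+",   # whitespace preceded by "("
  repeated_comma = "(,),+"           # run of commas; group 1 keeps the first
)

# Join the alternatives with "|" to rebuild the full pattern.
pattern <- paste(parts, collapse = "|")

# "\\1" inserts group 1: a comma for repeated_comma, nothing for the rest.
result <- gsub(pattern, "\\1", x, perl = TRUE)
print(result)
# [1] "Test"   "Test"   "Test "  "Test "  "Test,"  "Test, " "Test, "

# Optional: a logical matrix showing which alternative occurs in which input.
hits <- sapply(parts, grepl, x = x, perl = TRUE)
```

Keeping the parts separate makes it straightforward to add or drop a rule, but note that lookarounds require `perl = TRUE`.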

You can use the multigsub function, which is a wrapper around R's gsub function. See its documentation for details.

The code looks like this:

multigsub(c("( ", " )", " ,", "()", ",,", " ,"), c("(", ")", ",", "", ",", ","), x, fixed = TRUE)
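
If you would rather avoid an extra package, the same idea can be sketched in base R by folding gsub over the pattern/replacement pairs with Reduce. This is only a minimal sketch: `seq_gsub` is a hypothetical helper written for this example, and it assumes the replacements should be applied one after another in the given order, exactly like the six separate calls in the question.

```r
# Hypothetical helper (not from any package): apply fixed-string
# replacements sequentially, in the order given.
seq_gsub <- function(x, patterns, replacements) {
  stopifnot(length(patterns) == length(replacements))
  Reduce(
    function(acc, i) gsub(patterns[i], replacements[i], acc, fixed = TRUE),
    seq_along(patterns),
    init = x  # start from the original strings
  )
}

x <- c("Te( )st", "Test()", "Test ()", "Test ( )", "Test ,,", "Test,, ", "Test , ")
pats <- c("( ", " )", " ,", "()", ",,", " ,")
reps <- c("(",  ")",  ",",  "",   ",",  ",")

cleaned <- seq_gsub(x, pats, reps)
print(cleaned)
# [1] "Test"   "Test"   "Test "  "Test "  "Test,"  "Test, " "Test, "
```

Because each step sees the output of the previous one, the order of the pairs matters: "( " has to be collapsed to "(" before "()" can be removed.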