How to switch smoothly from qdap::mgsub() to textclean::mgsub()?

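Because of an R version issue, I need to switch between qdap::mgsub() and textclean::mgsub(). The two functions are almost identical, except for the order of their arguments: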

qdap::mgsub(pattern,replacement,x)
textclean::mgsub(x,pattern,replacement)

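I have a lot of code that uses qdap::mgsub(). Unfortunately, I did not name the arguments when I passed them to the function, so I would have to reorder all of them to be able to use textclean::mgsub().

Is there an elegant (programmatic) way to switch between these two functions without changing the argument order?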

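You can use a regular expression to replace every occurrence of the old function call in the text of each file that uses it, with a function like the following: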

replace_mgsub <- function(path) {
    file_text <- readr::read_file(path)
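    # Capture the three positional arguments and emit them as (x, pattern, replacement).
    # Backslashes must be doubled inside R string literals.
    file_text <- gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^, )]+) *\\)",
                      "textclean::mgsub(\\3, \\1, \\2)", file_text)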
    readr::write_file(file_text, path)
}

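You would then call it on each relevant path (I assume you know the list of files that need it; if not, comment below and I can add something). Here is a demonstration of the gsub() part of the function: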

file_text <- "qdap::mgsub(pattern,replacement,x)"
cat(gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^, )]+) *\\)",
         "textclean::mgsub(\\3, \\1, \\2)", file_text))
#> textclean::mgsub(x, pattern, replacement)
file_text <- "# I'll have in this part some irrelevant code
# to show it won't interfere with that
y = rnorm(1000)
qdap::mgsub(pattern,replacement,x)
z = rnorm(10)
# And also demonstrate multiple occurrences of the function
# as well as illustrate that it doesn't matter if you have spaces
# between comma separated arguments
qdap::mgsub(pattern, replacement, x)"
cat(gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^, )]+) *\\)",
         "textclean::mgsub(\\3, \\1, \\2)", file_text))
#> # I'll have in this part some irrelevant code
#> # to show it won't interfere with that
#> y = rnorm(1000)
#> textclean::mgsub(x, pattern, replacement)
#> z = rnorm(10)
#> # And also demonstrate multiple occurrences of the function
#> # as well as illustrate that it doesn't matter if you have spaces
#> # between comma separated arguments
#> textclean::mgsub(x, pattern, replacement)
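If the list of files is not known in advance, the same substitution could be applied to every R script under a directory. The following is only a sketch under some assumptions: `rewrite_mgsub_calls()`, `replace_mgsub_in_dir()`, and `project_dir` are hypothetical names, base R's readLines()/writeLines() stand in for readr, and because the text is processed line by line, a call that spans several lines or has arguments containing commas or spaces will not be matched.

```r
# Hypothetical helper: swap the argument order in simple qdap::mgsub() calls.
rewrite_mgsub_calls <- function(text) {
    gsub("qdap::mgsub\\(([^, ]+) *, *([^, ]+) *, *([^, )]+) *\\)",
         "textclean::mgsub(\\3, \\1, \\2)", text)
}

# Hypothetical helper: rewrite every .R file found under `project_dir`.
replace_mgsub_in_dir <- function(project_dir) {
    r_files <- list.files(project_dir, pattern = "\\.[Rr]$",
                          recursive = TRUE, full.names = TRUE)
    for (path in r_files) {
        old_lines <- readLines(path, warn = FALSE)
        new_lines <- rewrite_mgsub_calls(old_lines)
        # Only write back files whose content actually changed
        if (!identical(old_lines, new_lines)) {
            writeLines(new_lines, path)
        }
    }
    invisible(r_files)
}
```

It is probably worth running this on a copy of the project, or under version control, so the changes can be reviewed before they are kept.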

Thinking about @duckmayr's answer, I came up with another solution to my problem:

First, run this function:

reorder_mgsub <- function(pattern,replacement,x){
  output <- textclean::mgsub(x,pattern,replacement)
  return(output)
}

Second, find qdap::mgsub and replace it with reorder_mgsub.

This solution is perhaps less elegant, because I have to do step 2 manually, but it works well for me.
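A slightly more general version of this wrapper idea also forwards any extra arguments through `...`. The sketch below assumes nothing beyond base R: `as_qdap_order()` and `toy_mgsub()` are hypothetical names, and `toy_mgsub()` is only a stand-in with the same (x, pattern, replacement) argument order as textclean::mgsub(), so the example runs without either package installed.

```r
# Hypothetical helper: wrap a function that takes (x, pattern, replacement)
# so that it can be called as (pattern, replacement, x).
as_qdap_order <- function(f) {
    function(pattern, replacement, x, ...) {
        f(x, pattern, replacement, ...)
    }
}

# Stand-in with textclean's argument order, built on base R only
toy_mgsub <- function(x, pattern, replacement, ...) {
    for (i in seq_along(pattern)) {
        x <- gsub(pattern[i], replacement[i], x, fixed = TRUE, ...)
    }
    x
}

reorder_toy <- as_qdap_order(toy_mgsub)
reorder_toy(c("a", "b"), c("1", "2"), "a b a")
#> [1] "1 2 1"

# With textclean installed, the same wrapper could presumably be used as:
# reorder_mgsub <- as_qdap_order(textclean::mgsub)
```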

Well, you could also reassign the original function in the package to suit your code.

That is, using the source code of mgsub,

reorder_mgsub <- function(pattern,replacement,x, leadspace = FALSE, trailspace = FALSE, 
fixed = TRUE, trim = FALSE, order.pattern = fixed, safe = FALSE, 
...){
    if (!is.null(list(...)$ignore.case) & fixed) {
        warning(paste0("`ignore.case = TRUE` can't be used with `fixed = TRUE`.\n", 
            "Do you want to set `fixed = FALSE`?"), call. = FALSE)
    }
    if (safe) {
        return(mgsub_regex_safe(x = x, pattern = pattern, replacement = replacement, 
            ...))
    }
    if (leadspace | trailspace) {
        replacement <- spaste(replacement, trailing = trailspace, 
            leading = leadspace)
    }
    if (fixed && order.pattern) {
        ord <- rev(order(nchar(pattern)))
        pattern <- pattern[ord]
        if (length(replacement) != 1) 
            replacement <- replacement[ord]
    }
    if (length(replacement) == 1) {
        replacement <- rep(replacement, length(pattern))
    }
    if (any(!nzchar(pattern))) {
        good_apples <- which(nzchar(pattern))
        pattern <- pattern[good_apples]
        replacement <- replacement[good_apples]
        warning(paste0("Empty pattern found (i.e., `pattern = \"\"`).\n", 
            "This pattern and replacement have been removed."), 
            call. = FALSE)
    }
    for (i in seq_along(pattern)) {
        x <- gsub(pattern[i], replacement[i], x, fixed = fixed, 
            ...)
    }
    if (trim) {
        x <- gsub("\\s+", " ", gsub("^\\s+|\\s+$", "", x, perl = TRUE), 
            perl = TRUE)
    }
    x
}

followed by

assignInNamespace('mgsub', reorder_mgsub, 'textclean')

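This should assign the updated function to the namespace of the textclean package, and any code that uses textclean::mgsub will now use the updated function. That way, there is no need to change all of the code.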