How to use string manipulation functions inside .names argument in dplyr::across

Question

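I searched to check whether this is a duplicate, but I could not find a similar question. (There is a similar one, but it is somewhat different from what I need.)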

My question is whether we can use string manipulation functions such as substr or stringr::str_remove inside the .names argument of dplyr::across. As a reproducible example, consider this:

library(dplyr)
iris %>%
  summarise(across(starts_with('Sepal'), mean, .names = '{.col}_mean'))

  Sepal.Length_mean Sepal.Width_mean
1          5.843333         3.057333

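Now I want to rename the output columns with something like str_remove(.col, 'Sepal') so that my output column names are just Length.mean and Width.mean. I am asking because the description of this argument states: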

.names
A glue specification that describes how to name the output columns. This can use {.col} to stand for the selected column name, and {.fn} to stand for the name of the function being applied. The default (NULL) is equivalent to "{.col}" for the single function case and "{.col}_{.fn}" for the case where a list is used for .fns.

I have tried many possibilities, including the following, but none of them work:

library(tidyverse)
library(glue)
iris %>%
  summarise(across(starts_with('Sepal'), mean, 
                   .names = glue('{xx}_mean', xx = str_remove(.col, 'Sepal'))))

Error: Problem with `summarise()` input `..1`.
x argument `str` should be a character vector (or an object coercible to)
i Input `..1` is `(function (.cols = everything(), .fns = NULL, ..., .names = NULL) ...`.
Run `rlang::last_error()` to see where the error occurred.


#OR
iris %>%
  summarise(across(starts_with('Sepal'), mean, 
                   .names = glue('{xx}_mean', xx = str_remove(glue('{.col}'), 'Sepal'))))

I know this can be solved by adding another step with rename_with, so I am not looking for that answer.

Answer 1

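This works, but there may be some caveats. You can call functions inside the glue specification, so you can clean up the strings that way. However, I got an error when I tried to escape the ".", which I think has to do with how across parses the string. If you need something more dynamic, you may want to dig into the source code at that point.

To use the {.fn} helper, at least when building the glue string dynamically like this, the function needs a name; otherwise you get a number for the function's position in the .fns argument. I tested this with a second function and used lst for automatic naming.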

library(dplyr)
iris %>%
  summarise(across(starts_with('Sepal'), .fns = lst(mean, max), 
                   .names = '{stringr::str_remove(.col, "^[A-Za-z]+.")}_{.fn}'))
#>   Length_mean Length_max Width_mean Width_max
#> 1    5.843333        7.9   3.057333       4.4
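
The two caveats above can be checked with a small sketch. It is not part of the original answer and rests on two assumptions: that base R's sub() with fixed = TRUE is an acceptable stand-in for stringr::str_remove, and that the escaping error comes from the backslash having to survive both R's string parsing and glue's parsing of the embedded expression. Treating the pattern as a literal string sidesteps the backslash altogether. The second part uses a plain, unnamed list() to show what {.fn} produces when the functions have no names.

```r
library(dplyr)

# Part 1: strip the literal "Sepal." prefix without any backslash escape.
# fixed = TRUE makes sub() match the pattern as plain text, so the "."
# is not a regex wildcard and needs no escaping inside the glue string.
named <- iris %>%
  summarise(across(starts_with("Sepal"), .fns = list(mean = mean, max = max),
                   .names = "{sub('Sepal.', '', .col, fixed = TRUE)}_{.fn}"))
names(named)
#> [1] "Length_mean" "Length_max"  "Width_mean"  "Width_max"

# Part 2: with an unnamed list, {.fn} falls back to each function's
# position in .fns instead of a name.
unnamed <- iris %>%
  summarise(across(starts_with("Sepal"), .fns = list(mean, max),
                   .names = "{.col}_{.fn}"))
names(unnamed)
#> [1] "Sepal.Length_1" "Sepal.Length_2" "Sepal.Width_1"  "Sepal.Width_2"
```

Naming the list elements explicitly, as in Part 1, gives the same column names as lst(mean, max); lst only saves typing the names.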
