在映射函数的公式中将变量传递给 .x$

Question

第一个示例有效，但我想按照示例 2 的思路将更多重复代码移动到函数中。我还没有找到正确的 rlang 方法来处理 .x$x .

library(tidyverse)

# This produces the desired result:
slope <- function(y, x){
  coef(lm(y ~ x))[2]
}

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = map_dbl(data, ~ slope(.x$y, .x$x1)),
    slope2 = map_dbl(data, ~ slope(.x$y, .x$x2))
  )
#> # A tibble: 2 × 4
#>   group data             slope1 slope2
#>   <chr> <list>            <dbl>  <dbl>
#> 1 a     <tibble [3 × 3]>    1     1   
#> 2 b     <tibble [3 × 3]>    0.5   0.25

# I would like to use rlang to further simplify with something like this,
# but I can't see how to combine the variable with .x$:
slope <- function(x) {
  map_dbl(data, ~ coef(lm(.x$y ~ .x$x))[2])
}

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = slope(x1),
    slope2 = slope(x2)
  )
#> Error in `mutate()`:
#> ! Problem while computing `slope1 = slope(x1)`.
#> Caused by error in `purrr:::stop_bad_type()`:
#> ! `.x` must be a vector, not a function

^{由 reprex package (v2.0.1)}

于 2022-05-24 创建

Answer 1

要使代码在第二个版本中运行，您的 slope 函数将需要一个数据参数以及您希望回归的列的名称。列名然后需要将其放入 lm 调用的公式中，这可以通过几种方式完成，也许最简单的方法是使用一些字符串解析和 as.formula。为了简单起见，我还会使用 lm 的 data 参数。

slope <- function(data, var) {
  
  f <- as.formula(paste('y', deparse(substitute(var)), sep = "~"))
  map_dbl(data, ~ coef(lm(f, data = .x))[2])
}

所以你有：

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = slope(data, x1),
    slope2 = slope(data, x2)
  )

#> # A tibble: 2 x 4
#>   group data             slope1 slope2
#>   <chr> <list>            <dbl>  <dbl>
#> 1 a     <tibble [3 x 3]>    1     1   
#> 2 b     <tibble [3 x 3]>    0.5   0.25

就我个人而言，我认为更稳健和通用的方法是传递一个公式：

slope <- function(data, f) map_dbl(data, ~ coef(lm(f, data = .x))[2])

tribble(
  ~group, ~y, ~x1, ~x2,
  "a", 1, 1, 4,
  "a", 2, 2, 5,
  "a", 3, 3, 6,
  "b", 1, 4, 8,
  "b", 2, 6, 12,
  "b", 3, 8, 16
) |> 
  nest(data = -group) |>
  mutate(
    slope1 = slope(data, y ~ x1),
    slope2 = slope(data, y ~ x2)
  )

#> # A tibble: 2 x 4
#>   group data             slope1 slope2
#>   <chr> <list>            <dbl>  <dbl>
#> 1 a     <tibble [3 x 3]>    1     1   
#> 2 b     <tibble [3 x 3]>    0.5   0.25

在映射函数的公式中将变量传递给 .x$

Pass a variable to .x$ in a formula of a map function

r

tidyverse

rlang