Error in pmap: Error in UseMethod("filter_") : no applicable method for 'filter_' applied to an object of class "character"

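I wrote a function that works well when I pass the input variables as data frames. But when I use pmap to pass the inputs as a list of data frames, I get the following error: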

Error in UseMethod("filter_") : no applicable method for 'filter_' applied to an object of class "character"

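Here are the data and the first part of the function, which is the part that triggers the error. The y and a arguments are used in the part of the function that I am not showing here: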

library(tidyverse)  # provides %>%, filter(), group_by(), nest() and pmap()

x <- tibble::tibble(x1 = sample(0:1, 8, replace = TRUE),
                    x2 = sample(0:25, 8, replace = FALSE),
                    x3 = sample(1:3, 8, replace = TRUE),
                    strata =c("a", "b", "c", "d", "a", "b", "c", "d"))
y <- tibble::tibble(rate = sample(0:1, 8, replace = TRUE),
                    strata =c("a", "b", "c", "d", "a", "b", "c", "d") )

a <- tibble::tibble(sample(10:80, 4, replace = FALSE))
example <- function(x, y, a , d){

  CR <- x %>% filter(x1, x2>0) %>%
    group_by(x3) %>%
    summarise(avg_revenue = mean(x2), revenue = sum(x2))
  return(CR)
}

example(x,y,a, d = 0.1)

But when I call pmap with this function:

df <- tibble::tibble(x = x %>% group_by(strata) %>% nest(),
                     y = y %>% group_by(strata) %>% nest(),
                     a = a)
pmap(df, example, d= 0.1)

I get the error mentioned above.

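I don't believe df is creating the data frame you expect it to create. I believe the code below does what you want, if I understand the question correctly. However, y isn't used anywhere in your function, so its purpose isn't clear to me. I'm sure there is also a better way to do this using map and nest, but again, I'm not sure what you are trying to do.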

library(tidyverse)
x <- tibble::tibble(x1 = sample(0:1, 8, replace = TRUE),
                    x2 = sample(0:25, 8, replace = FALSE),
                    x3 = sample(1:3, 8, replace = TRUE),
                    strata =c("a", "b", "c", "d", "a", "b", "c", "d"))
y <- tibble::tibble(rate = sample(0:1, 8, replace = TRUE),
                    strata =c("a", "b", "c", "d", "a", "b", "c", "d") )

a <- tibble::tibble(a = sample(10:80, 4, replace = FALSE))

example <- function(x, y, a , d){
  CR <- x %>% filter(x1, x2>0) %>%
    group_by(x3) %>%
    summarise(avg_revenue = mean(x2), revenue = sum(x2))
  return(CR)
}

example(x,y,a, d = 0.1)
#> # A tibble: 1 x 3
#>      x3 avg_revenue revenue
#>   <int>       <dbl>   <int>
#> 1     1           5      10
df <- bind_cols(x, select(y, rate)) %>% 
  group_by(strata) %>% 
  nest(x = c(x1, x2, x3), 
       y = c(rate)) %>% 
  bind_cols(a) %>% ungroup()
pmap(select(df, -strata), example)
#> [[1]]
#> # A tibble: 0 x 3
#> # … with 3 variables: x3 <int>, avg_revenue <dbl>, revenue <int>
#> 
#> [[2]]
#> # A tibble: 0 x 3
#> # … with 3 variables: x3 <int>, avg_revenue <dbl>, revenue <int>
#> 
#> [[3]]
#> # A tibble: 1 x 3
#>      x3 avg_revenue revenue
#>   <int>       <dbl>   <int>
#> 1     1           4       4
#> 
#> [[4]]
#> # A tibble: 1 x 3
#>      x3 avg_revenue revenue
#>   <int>       <dbl>   <int>
#> 1     1           6       6
pmap_dfr(select(df, -strata), example, d = 0.1, .id = 'strata')
#> # A tibble: 2 x 4
#>   strata    x3 avg_revenue revenue
#>   <chr>  <int>       <dbl>   <int>
#> 1 3          1           4       4
#> 2 4          1           6       6

Created on 2019-12-17 by the reprex package (v0.3.0)
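
To see why the df from the question is not shaped the way pmap needs, it can help to inspect it directly. The following is only a minimal sketch: the data values and the names x_nested and df_bad are made up for illustration, and it assumes a tidyr version (1.0 or later) where nest() on grouped data yields the columns strata and data.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Fixed, hypothetical values so the result is reproducible
x <- tibble(x1 = c(1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L),
            x2 = c(5L, 3L, 0L, 7L, 2L, 9L, 4L, 1L),
            x3 = c(1L, 2L, 1L, 3L, 2L, 1L, 3L, 2L),
            strata = rep(c("a", "b", "c", "d"), 2))

# Same nesting step as in the question, ungrouped afterwards
x_nested <- x %>% group_by(strata) %>% nest() %>% ungroup()

# tibble(x = <a tibble>) stores the whole nested tibble as ONE
# data-frame column; it does not create a list-column of per-strata data
df_bad <- tibble(x = x_nested)

is.data.frame(df_bad$x)   # the column `x` is itself a data frame
names(df_bad$x)           # it still carries `strata` and `data`
class(df_bad$x[[1]])      # its first element is the character `strata` vector
```

Since the first element of df_bad$x is a character vector, this is presumably how a character object ends up being passed to filter() inside example(). Building df so that x and y are list-columns of per-strata tibbles, as above, avoids that.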

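As CLedbetter also pointed out in their helpful answer, this error occurs when the data frame df passed to pmap is not in the right shape. pmap passes every column of df to the function as a named argument, so df should contain only columns that the function knows about. To get there, I rebuilt df with inner_join, which still leaves the column strata, which example() does not know about. As mentioned in the R help page for pmap, to make pmap ignore columns that example() does not use, I added "..." to the definition of example(). That way pmap can skip strata, the first column of the data frame, which the function does not use.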

So the updated code would be:

library(tidyverse)  # provides %>%, filter(), inner_join(), nest() and pmap()

x <- tibble::tibble(x1 = sample(0:1, 8, replace = TRUE),
                    x2 = sample(0:25, 8, replace = FALSE),
                    x3 = sample(1:3, 8, replace = TRUE),
                    strata =c("a", "b", "c", "d", "a", "b", "c", "d"))
y <- tibble::tibble(rate = sample(0:1, 8, replace = TRUE),
                    strata =c("a", "b", "c", "d", "a", "b", "c", "d") )

a <- tibble::tibble(sample(10:80, 4, replace = FALSE))

# Note the addition of the "..." to the function input definition

example <- function(x, y, a , d, ...){

  CR <- x %>% filter(x1, x2>0) %>%
    group_by(x3) %>%
    summarise(avg_revenue = mean(x2), revenue = sum(x2))
  return(CR)
}

example(x,y,a, d = 0.1)

# Note the change in the reformatting of df with an inner_join

df <- inner_join(x %>% group_by(strata) %>% nest(),
                 y %>% group_by(strata) %>% nest(), 
                 by = "strata") %>% rename(x = data.x, y = data.y )

# with these changes pmap produces the output 
pmap(df, example, d= 0.1)
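
The effect of "..." can also be seen in isolation. This is just a toy sketch, not the real function: the column id stands in for strata, and add_strict and add_loose are hypothetical names used only for this illustration.

```r
library(purrr)
library(tibble)

# Hypothetical toy data: `id` plays the role of `strata`
df <- tibble(id = c("a", "b"), x = c(1, 2), y = c(10, 20))

# Without `...`, pmap passes `id` as a named argument that the
# function does not accept, so the call fails
add_strict <- function(x, y) x + y
strict_result <- tryCatch(pmap_dbl(df, add_strict),
                          error = function(e) "error")

# With `...`, the unmatched `id` column is absorbed and ignored
add_loose <- function(x, y, ...) x + y
loose_result <- pmap_dbl(df, add_loose)

strict_result  # "error"
loose_result   # 11 22
```

An alternative that avoids changing the function is to drop the extra column before the call, for example with select(df, -id), which is the approach taken in the other answer.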