How to combine lapply with dplyr in a function

Below is a sample data frame I created, along with the code that produces the expected output.

library(dplyr)

df = data.frame(color = c("Yellow", "Blue", "Green", "Red", "Magenta"),
                values = c(24, 24, 34, 45, 49),
                Quarter = c("Period1","Period2" , "Period3", "Period3", "Period1"),
                Market = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace"))


dfXQuarter = df %>% group_by(Quarter) %>% summarise(values = sum(values)) %>%
  mutate(cut = "Quarter") %>% data.frame()

colnames(dfXQuarter)[1] = "Grouping"

dfXMarket = df %>% group_by(Market) %>% summarise(values = sum(values)) %>% 
  mutate(cut = "Market")%>% data.frame()
colnames(dfXMarket)[1] = "Grouping"


df_all = rbind(dfXQuarter, dfXMarket)

Now, for brevity, I want to wrap this in a function and use lapply. Here is my attempt:

list = c("Market", "Quarter")


df_all <- do.call(rbind, lapply(list, function(x){
  df_l= df %>% group_by(x) %>% 
    summarise(values = sum(values)) %>% 
    mutate(cut= x) %>% 
    data.frame()
   colnames(df_l)[df_l$x] = "Grouping"
  df_l
}))

This code throws an error.

I need the output to be an exact copy of 'df_all' so I can process it further.

How do I write this function correctly?

Not pretty, but it works and doesn't require any tidyverse functions:

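groupwise_summation <- function(df, grouping_vecs){

  # Split, apply, combine: one row per group, holding the sum of `values`

  tmpdf <- do.call(rbind, lapply(split(df, df[,grouping_vecs]), function(x){sum(x$values)}))

  # Clean up the df: build the columns directly so `values` stays numeric
  # (cbind() would coerce both columns to character), and use the column
  # names of the expected df_all

  data.frame(Grouping = row.names(tmpdf), values = as.numeric(tmpdf),
             cut = paste(grouping_vecs, collapse = "."), row.names = NULL)

}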


# Apply and combine:

df_all <- rbind(groupwise_summation(df, c("Quarter")), groupwise_summation(df, c("Market")))


# Note inside the c(), you can use multiple grouping variables.
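
To illustrate the note about multiple grouping variables, here is a minimal base R sketch of what split() produces when it is given two grouping columns. The names `pieces` and `sums` and the `drop = TRUE` argument are my additions for illustration and are not part of the answer above.

```r
# Sample data from the question
df <- data.frame(
  color   = c("Yellow", "Blue", "Green", "Red", "Magenta"),
  values  = c(24, 24, 34, 45, 49),
  Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
  Market  = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace")
)

# Two grouping columns: split() builds one piece per Quarter/Market combination.
# drop = TRUE (an addition here) skips combinations that never occur in the data.
pieces <- split(df, df[, c("Quarter", "Market")], drop = TRUE)

# Sum `values` within each piece; the names look like "Period1.Camden"
sums <- sapply(pieces, function(x) sum(x$values))
sums
```

Without `drop = TRUE`, combinations that do not occur in the data would also appear, each with a sum of 0.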

We can use purrr::map_dfr:

library(dplyr)
library(purrr) 
#Don't use the name of an R built-in (e.g. list) as a variable name
lst <- c("Market", "Quarter")
#Use map() instead if you need the output as a list
map_dfr(lst, ~df %>% group_by("Grouping"=!!sym(.x)) %>%
                     summarise(values = sum(values)) %>%
                     mutate(cut = .x) %>%
                     #To avoid the warning message from bind_rows
                     mutate_if(is.factor, as.character))

# A tibble: 6 x 3
  Grouping      values cut    
  <chr>          <dbl> <chr>  
1 Camden            24 Market 
2 DansFireplace     83 Market 
3 StreetA           69 Market 
4 Period1           73 Quarter
5 Period2           24 Quarter
6 Period3           79 Quarter
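
As a variation on the same idea, the sketch below uses map() to keep one summary per grouping column in a list and only binds them afterwards. It also swaps `!!sym(.x)` for the `.data[[col]]` pronoun, which is another way to refer to a column by a string. The names `summaries`, `combined` and `col` are hypothetical, and strings are kept as characters up front so no factor conversion is needed.

```r
library(dplyr)
library(purrr)

# Sample data from the question, with strings kept as characters
df <- data.frame(
  color   = c("Yellow", "Blue", "Green", "Red", "Magenta"),
  values  = c(24, 24, 34, 45, 49),
  Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
  Market  = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace"),
  stringsAsFactors = FALSE
)

lst <- c("Market", "Quarter")

# map() returns a list; set_names() names each element after its grouping column
summaries <- map(set_names(lst), function(col) {
  df %>%
    group_by(Grouping = .data[[col]]) %>%  # look the column up by its string name
    summarise(values = sum(values)) %>%
    mutate(cut = col)
})

# Individual pieces stay accessible, e.g. summaries$Market
combined <- bind_rows(summaries)
combined
```

Keeping the list around is handy if each summary needs separate handling before the rows are combined.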

We can fix the original attempt by:
  1. Changing group_by(x) to group_by_at(x), since x is a string here.
  2. Using colnames(df_l)[colnames(df_l)==x] <- "Grouping" to rename the grouping variable.
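
Putting those two changes together, a corrected version of the lapply call might look like the sketch below. The vector is named `groupings` here (my choice, to avoid shadowing the built-in `list`), and the columns are listed Quarter first so the row order matches the rbind(dfXQuarter, dfXMarket) in the question.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  color   = c("Yellow", "Blue", "Green", "Red", "Magenta"),
  values  = c(24, 24, 34, 45, 49),
  Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
  Market  = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace")
)

groupings <- c("Quarter", "Market")

df_all <- do.call(rbind, lapply(groupings, function(x) {
  df_l <- df %>%
    group_by_at(x) %>%                  # change 1: x is a string, so use group_by_at()
    summarise(values = sum(values)) %>%
    mutate(cut = x) %>%
    data.frame()
  # change 2: rename the grouping column by matching its name
  colnames(df_l)[colnames(df_l) == x] <- "Grouping"
  df_l
}))

df_all
```

Note that group_by_at() is marked as superseded in recent dplyr releases, although it still works; group_by(across(all_of(x))) is the newer spelling of the same step.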