Is it possible to create two data frames in a single workflow with magrittr?

I started using the magrittr pipe operators and was curious whether two data frames could be created in a single flow. For example, it would be helpful to produce a non-aggregated data frame for plotting and an aggregated data frame for ordering factors (aggregate ordering example).

Here is a fairly contrived example that illustrates the problem:

library(dplyr)
library(tidyr)
library(magrittr)
library(ggplot2) # msleep

vore_count <- 
  na.exclude(msleep) %>%
  group_by(vore, order) %>%
  summarise(count = n()) %>%
  ungroup()

agg <- vore_count %>% 
  spread(vore, count)

Can vore_count and agg be produced in the same flow?

I tried the following (as well as using %T>%), but it obviously doesn't work.

vore_count <- 
  na.exclude(msleep) %>%
  group_by(vore, order) %>%
  summarise(count = n()) %>%
  ungroup() %>%
      agg <- spread(vore, count)

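You can use list() in the pipe, and then concatenate agg once the first data.frame has been computed. Here I just use mtcars. The result is a named list of two data frames.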

library(dplyr)
library(tidyr)

na.exclude(mtcars) %>%
    group_by(cyl, disp) %>%
    summarise(count = n()) %>%
    ungroup %>%
    list(cyl_count = .) %>%
    c(list(agg = spread(.$cyl_count, cyl, count)))

If you want to assign these to the global environment, you can add the following line at the end of the pipe:

... %>%
    list2env(globalenv())

ls(pattern = "agg|cyl_count")
# [1] "agg"       "cyl_count"
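If you would rather not write into the global environment, the named list can simply be kept and its elements accessed by name. A minimal sketch of that variation, where the name res is only an illustrative choice and is not part of the answer above:

```r
library(dplyr)
library(tidyr)

# Same pipeline as above, but the resulting named list is stored in `res`
# (a hypothetical name) instead of being exported with list2env().
res <- na.exclude(mtcars) %>%
    group_by(cyl, disp) %>%
    summarise(count = n()) %>%
    ungroup() %>%
    list(cyl_count = .) %>%
    c(list(agg = spread(.$cyl_count, cyl, count)))

names(res)
# [1] "cyl_count" "agg"

# Each element is an ordinary data frame
head(res$cyl_count)
head(res$agg)
```

This keeps both results together in one object, at the cost of having to write res$cyl_count and res$agg wherever they are used.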

Side-effect assignment is easier with pipeR.
library(pipeR)
library(dplyr)
library(ggplot2) 
library(tidyr)
na.exclude(msleep) %>>%
  group_by(vore, order) %>>%
  summarise(count = n()) %>>%
  ungroup() %>>%
  (~ vore_count) %>>% 
  spread(vore, count) %>>%
  (~ agg)
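A comparable side assignment can probably be sketched with magrittr alone, using the tee operator %T>% mentioned in the question. This is only a sketch under a few assumptions: it is not taken from the pipeR answer above, it assumes that writing into globalenv() with assign() is acceptable, and it reuses mtcars and the names cyl_count and agg from the earlier list() answer.

```r
library(dplyr)
library(tidyr)
library(magrittr)

# %T>% evaluates its right-hand side for the side effect only and passes
# the left-hand side along unchanged, so the un-aggregated data frame can
# be stored mid-pipeline before it is reshaped.
agg <- na.exclude(mtcars) %>%
  group_by(cyl, disp) %>%
  summarise(count = n()) %>%
  ungroup() %T>%
  assign(x = "cyl_count", value = ., envir = globalenv()) %>%
  spread(cyl, count)

# cyl_count now holds the long counts, agg the wide table
head(cyl_count)
head(agg)
```

Because the dot is passed explicitly as the value argument, magrittr does not also insert the data frame as the first argument of assign().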

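While I can understand the temptation, IMO a single workflow/pipeline should produce only one assignment. That is cleaner, easier to read, and better practice. Ideally, each pipeline should have a single purpose: one input, one output.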