Group by two sets of variables in a function

Question

I am working with the following sample dataset:

mytable <- read.table(text=
                        "group team num  ID
1   a   x    1    9
2   a   x    2    4
3   a   y    3    5
4   a   y    4    9
5   b   x    1    7
6   b   y    4    4
7   b   x    3    9
8   b   y    2    8",
                      header = TRUE, stringsAsFactors = FALSE)

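I want to create a separate data frame for each set of grouping variables, and I also want to be able to group by two variables at once, but I'm not sure how to do that. For example, I want a separate data frame that groups the data by team and ID. How can I do this? Here is my attempt so far: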

library(dplyr)

lapply(c("group","team","ID",c("team","ID")), function(x){
  group_by(mytable,across(c(x,num)))%>%summarise(Count = n()) %>% mutate(new=x)%>% as.data.frame()
})

Answer 1

Does this tidyverse-based approach do what you need?

library(tidyverse)

mytable %>% 
  group_by(team, ID) %>% 
  group_split()
<list_of<
  tbl_df<
    group: character
    team : character
    num  : integer
    ID   : integer
  >
>[7]>
[[1]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     x         2     4

[[2]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 b     x         1     7

[[3]]
# A tibble: 2 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     x         1     9
2 b     x         3     9

[[4]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 b     y         4     4

[[5]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     y         3     5

[[6]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 b     y         2     8

[[7]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     y         4     9
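
If the goal is one such split for every set of grouping variables, not only team and ID, the same idea can probably be wrapped in lapply(). This is a minimal sketch, assuming dplyr 1.0.0 or later (for across() and all_of()); the names group_sets and splits are made up for illustration and do not come from the question.

```r
library(dplyr)

# Same sample data as in the question, without the row-name column.
mytable <- read.table(text = "group team num ID
a x 1 9
a x 2 4
a y 3 5
a y 4 9
b x 1 7
b y 4 4
b x 3 9
b y 2 8", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical name: each list element is one set of grouping columns.
# list() keeps c("team", "ID") together as a single element.
group_sets <- list("group", "team", "ID", c("team", "ID"))

# For every set of columns, split mytable into one tibble per group.
splits <- lapply(group_sets, function(cols) {
  mytable %>%
    group_by(across(all_of(cols))) %>%
    group_split()
})

# Label each result with the columns it was grouped by.
names(splits) <- vapply(group_sets, toString, character(1))
```

With this, splits[["team, ID"]] should hold the same seven tibbles as the output above, and splits[["group"]] the two tibbles for groups a and b.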

Answer 2

See if this is what you want.

library(dplyr)

cols <- list("group","team","ID", c("team","ID"))

lapply(cols, function(x, dat = mytable){
  dat2 <- dat %>%
    group_by(across(all_of(x))) %>% 
    summarise(Count = n()) %>% 
    mutate(new = toString(x)) %>% 
    as.data.frame()
  return(dat2)
})

# `summarise()` has grouped output by 'team'. You can override using the `.groups` argument.
# [[1]]
#   group Count   new
# 1     a     4 group
# 2     b     4 group
# 
# [[2]]
#   team Count  new
# 1    x     4 team
# 2    y     4 team
# 
# [[3]]
#   ID Count new
# 1  4     2  ID
# 2  5     1  ID
# 3  7     1  ID
# 4  8     1  ID
# 5  9     3  ID
# 
# [[4]]
#   team ID Count      new
# 1    x  4     1 team, ID
# 2    x  7     1 team, ID
# 3    x  9     2 team, ID
# 4    y  4     1 team, ID
# 5    y  5     1 team, ID
# 6    y  8     1 team, ID
# 7    y  9     1 team, ID
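
The reason cols is built with list() rather than c(): c() flattens its arguments, so the vector in the question ends up as five separate column names and the team/ID pair never reaches the function as a single unit. A small base R illustration (flat and nested are illustrative names only):

```r
# c() flattens the inner vector, so the pair is lost.
flat <- c("group", "team", "ID", c("team", "ID"))

# list() keeps the inner vector as one element.
nested <- list("group", "team", "ID", c("team", "ID"))

length(flat)     # 5
length(nested)   # 4
nested[[4]]      # "team" "ID"
```

Iterating over nested therefore calls the function four times, and the last call receives both column names at once.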

Tags: for-loop, r, lapply, dplyr