Add column for distinct values and count to same tibble in R

I want to combine two tibbles. They are grouped by the same variable, but I'd like to see them in the same table. The first is:

df %>%
    filter(Cancelled == FALSE) %>%
    count(School)

This gives me the number of rows for each "School":

School count
Comm 42
IR 52
Business 34
Nursing 23

The next is:

df %>%
    filter(Cancelled == FALSE) %>%
    group_by(School) %>%
    summarise(n_distinct(ID))

This gives me the count of unique "ID" values within each "School":

School unique
Comm 17
IR 18
Business 14
Nursing 12

Basically, I'd like the count as one column and the count of unique values as a second column:

School count unique
Comm 42 17
IR 52 18
Business 34 14
Nursing 23 12

Thanks in advance!

*Edit: to better describe the original data

dput(data)
structure(list(ID = c(1986, 3707, 2467, 3087, 2155, 3133, 2531, 
3112, 2042, 2912, 1305, 1519, 2411, 3630, 2015, 2943, 2873, 1591, 
3127, 3733, 3492, 3156, 3907, 3877, 2050, 2956, 1280, 3544, 1465, 
1410, 3946, 2868, 2288, 3722, 1611, 3188, 3609, 2847, 1803, 2580, 
1928, 1775, 2774, 1259, 3851, 2135, 3046, 1480, 2480, 2240, 3279, 
3983, 2042, 3754, 1851, 3528, 3161, 2547, 3068, 2739, 3936, 3290, 
2465, 2839, 2139, 2635, 1655, 3903, 2333, 1787, 2913, 2764, 2791, 
1501, 2101, 3312, 3428, 3502, 1826, 3823, 3064, 2705, 1917, 1427, 
1627, 1519, 3811, 3661, 3034, 1977, 2502, 3240, 1575, 2882, 3651, 
2065, 2366, 2016, 2991, 1996), School = c("Nursing", "Business", 
"Comm", "Nursing", "Business", "Nursing", "Nursing", "Nursing", 
"Nursing", "Nursing", "IR", "Comm", "Nursing", "IR", "Nursing", 
"Comm", "Business", "Business", "Business", "Nursing", "Nursing", 
"Nursing", "Comm", "Nursing", "Business", "Nursing", "Comm", 
"Business", "IR", "IR", "Nursing", "Business", "Business", "IR", 
"Business", "Business", "Business", "Comm", "Nursing", "Comm", 
"IR", "Nursing", "Nursing", "Nursing", "Nursing", "Comm", "Nursing", 
"Business", "IR", "Comm", "Comm", "Business", "IR", "Nursing", 
"Nursing", "IR", "Comm", "Business", "IR", "IR", "Nursing", "IR", 
"Nursing", "Nursing", "Nursing", "Business", "Comm", "Nursing", 
"IR", "IR", "Business", "Comm", "IR", "Nursing", "Nursing", "Business", 
"Nursing", "Comm", "Business", "Business", "Nursing", "Nursing", 
"Nursing", "Nursing", "Nursing", "Nursing", "Comm", "Nursing", 
"IR", "Business", "Nursing", "Comm", "Nursing", "Comm", "Nursing", 
"Nursing", "IR", "Business", "Nursing", "Comm")), row.names = c(NA, 
-100L), class = c("tbl_df", "tbl", "data.frame"))

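We can use left_join, assuming df and df1 hold the two summary tables from the question (with the columns named count and unique):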

library(dplyr)
left_join(df, df1, by="School")
    School count unique
1     Comm    42     17
2       IR    52     18
3 Business    34     14
4  Nursing    23     12
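
For completeness, here is a self-contained sketch of that join, under the assumption that `df` and `df1` are the two summaries from the question. The `toy` data frame and the helper name `kept` are made up for illustration and are not the asker's data:

```r
library(dplyr)

# Hypothetical toy data: ID 1 appears twice in Comm, ID 3 twice in IR,
# and the row for ID 4 is cancelled
toy <- tibble(
  ID        = c(1, 1, 2, 3, 3, 4),
  School    = c("Comm", "Comm", "Comm", "IR", "IR", "IR"),
  Cancelled = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)
)

# Drop cancelled rows once so both summaries use the same input
kept <- filter(toy, !Cancelled)

# Number of rows per school
df <- count(kept, School, name = "count")

# Number of distinct IDs per school
df1 <- kept %>%
  group_by(School) %>%
  summarise(unique = n_distinct(ID))

# One row per school with both columns
joined <- left_join(df, df1, by = "School")
joined
# Comm: count 3, unique 2; IR: count 2, unique 1
```

Naming the columns up front (`name = "count"`, `unique = ...`) keeps the joined table free of default names such as `n` or `n_distinct(ID)`.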

You can do it all in a single pipe, though it doesn't necessarily look any cleaner:

library(tidyverse)
data %>%
  count(School, name = 'count') %>%
  left_join(., data %>%
                 group_by(School) %>%
                 summarize(unique = n_distinct(ID)),
            by = 'School')

For your example data, this gives:

# A tibble: 4 x 3
  School   count unique
  <chr>    <int>  <int>
1 Business    22     22
2 Comm        18     18
3 IR          17     17
4 Nursing     43     43

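I guess/assume it is just a coincidence of your example data that no school has a repeated ID, which is why the count and unique values are identical.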
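
A join may not be strictly necessary here. As a sketch, both numbers can also be computed in a single `summarise()` call, using `n()` for the row count and `n_distinct()` for the unique IDs. The `toy` data below is made up, with repeated IDs inside a school so that the two columns actually differ:

```r
library(dplyr)

# Hypothetical toy data (not the asker's): repeated IDs within a school
toy <- tibble(
  ID        = c(1, 1, 2, 3, 3, 4),
  School    = c("Comm", "Comm", "Comm", "IR", "IR", "IR"),
  Cancelled = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)
)

result <- toy %>%
  filter(!Cancelled) %>%          # keep only non-cancelled rows
  group_by(School) %>%
  summarise(
    count  = n(),                 # rows per school
    unique = n_distinct(ID)       # distinct IDs per school
  )
result
# Comm: count 3, unique 2; IR: count 2, unique 1
```

Because both summaries come from the same grouped data, there is no second table to keep in sync and no join key to specify.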