How do I summarize the unique values of a group in one column using dplyr?
Currently I have the following code:
categories <- df %>%  # this is a very large df, but that should not matter for my question
  group_by(category, subcategory, IV_type) %>%
  summarise(n = n())
This produces the following df:
category <- c('a','a','a','a','b','b','b','c','c')
subcategory <- c(1,1,2,3,4,4,5,6,7)
N <- c(21,13,7,9,11,17,19,23,27)
type <- c('nom', 'ord', 'nom', 'scale', 'nom', 'scale', 'nom', 'scale', 'scale')
categories <- data.frame(category, subcategory, N, type)
However, I would like to obtain this data frame:
category1 <- c('a','a','a','b','b','c','c')
subcategory1 <- c(1,2,3,4,5,6,7)
N1 <- c(34,7,9,28,19,23,27)
type1 <- c('nom, ord', 'nom', 'scale', 'nom, scale', 'nom', 'scale', 'scale')
categories1 <- data.frame(category1, subcategory1, N1, type1)
My attempt:
categories <- df %>%
  group_by(category, subcategory) %>%
  summarise(n = n(), unique_types = unique(type))
Unfortunately, this throws an error. Does anyone know how I can do this?
You can use the following:
categories %>%
  group_by(category, subcategory) %>%
  summarise(N = sum(N), type = toString(unique(type)), .groups = 'drop')
  category subcategory     N type
  <chr>          <dbl> <dbl> <chr>
1 a                  1    34 nom, ord
2 a                  2     7 nom
3 a                  3     9 scale
4 b                  4    28 nom, scale
5 b                  5    19 nom
6 c                  6    23 scale
7 c                  7    27 scale
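For context, a likely reason the original attempt fails is that unique(type) can return more than one value per group, whereas summarise() is meant to produce one value per group; the exact behavior (error, extra rows, or warning) depends on the dplyr version. toString() collapses the unique values into a single comma-separated string, which is the same as paste(..., collapse = ", "). Below is a minimal, self-contained sketch using the sample data from the question; the names types_a1 and result are only illustrative.

```r
library(dplyr)

# Sample data copied from the question
categories <- data.frame(
  category    = c('a','a','a','a','b','b','b','c','c'),
  subcategory = c(1,1,2,3,4,4,5,6,7),
  N           = c(21,13,7,9,11,17,19,23,27),
  type        = c('nom','ord','nom','scale','nom','scale','nom','scale','scale')
)

# For group (a, 1), unique() yields two values, not one
types_a1 <- unique(categories$type[categories$category == 'a' &
                                   categories$subcategory == 1])
length(types_a1)                  # 2
toString(types_a1)                # one string: "nom, ord"
paste(types_a1, collapse = ", ")  # equivalent to toString()

# Sum N and collapse the unique types into one string per group
result <- categories %>%
  group_by(category, subcategory) %>%
  summarise(N = sum(N), type = toString(unique(type)), .groups = 'drop')

print(result)
```

If a different separator is wanted, paste(unique(type), collapse = "; ") can be swapped in for toString(unique(type)).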