R - After grouping, how do I get the maximum number of times a value is repeated?

Question

Suppose I have a dataset like this:

 id <- c(1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3)
 foo <- c('a', 'b', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'a', 'a')
 dat <- data.frame(id, foo)

For each id, how do I get the maximum number of times any single value of foo is repeated? That is, the desired output is:

   id  max_repeat
1   1   1
2   2   3
3   3   2

For example, id 2 has a max_repeat of 3 because one of its foo values (b) is repeated 3 times.

Answer 1

Using tidyverse:

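library(dplyr) # part of the tidyverse; provides %>%, group_by(), tally(), summarise()

dat %>%
 group_by(id, foo) %>% # group by id and foo
 tally() %>% # count the rows in each (id, foo) group; the count column is n
 group_by(id) %>%
 summarise(res = max(n)) # keep the largest count per id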

     id   res
  <dbl> <dbl>
1    1.    1.
2    2.    3.
3    3.    2.

Answer 2

dplyr

library(tidyverse)

# tabulate() needs a factor or integer vector; factor() also covers a character foo
dat %>% 
  group_by(id) %>% 
  summarise(max_repeat = max(tabulate(factor(foo))))

# # A tibble: 3 x 2
#      id max_repeat
#   <dbl>      <int>
# 1     1          1
# 2     2          3
# 3     3          2

data.table

library(data.table)
setDT(dat)

# factor() lets tabulate() accept a character foo as well as a factor
dat[, .(max_repeat = max(tabulate(factor(foo)))), by = id]

#    id max_repeat
# 1:  1          1
# 2:  2          3
# 3:  3          2

base (use setNames to rename the result column if needed)

# factor() lets tabulate() accept a character foo as well as a factor
aggregate(foo ~ id, dat, function(x) max(tabulate(factor(x))))
#   id foo
# 1  1   1
# 2  2   3
# 3  3   2
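The renaming with setNames mentioned for the base approach could look like the minimal sketch below. It rebuilds the example data with foo stored as a factor (an assumption, so that tabulate() accepts it directly), and the intermediate name `res` is only illustrative.

```r
# Example data from the question; foo is a factor so tabulate() accepts it
dat <- data.frame(
  id  = c(1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3),
  foo = factor(c("a", "b", "a", "a", "b", "b", "b", "c", "c", "a", "a"))
)

# Count each foo value within an id and keep the largest count
res <- aggregate(foo ~ id, dat, function(x) max(tabulate(x)))

# Rename the columns so the count column is called max_repeat instead of foo
res <- setNames(res, c("id", "max_repeat"))
res
```

setNames() returns a copy with the new names, so it can also be wrapped directly around the aggregate() call.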

Answer 3

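Without any packages, you can combine two aggregate() calls: one with FUN=length to count each (id, foo) pair, and one with FUN=max to keep the largest count per id.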

x1 <- with(dat, aggregate(list(count=id), list(id=id, foo=foo), FUN=length))
x2 <- with(x1, aggregate(list(max_repeat=count), list(id=id), FUN=max))

Yields:

> x2
  id max_repeat
1  1          1
2  2          3
3  3          2

Data:

dat <- structure(list(id = c(1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3), foo = structure(c(1L, 
2L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 1L, 1L), .Label = c("a", "b", 
"c"), class = "factor")), class = "data.frame", row.names = c(NA, 
-11L))
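As a different base R route (not the two-aggregate() method above, but a cross-tabulation swapped in for illustration), the count-then-maximum idea can be sketched with table() and apply(). The names `counts`, `max_repeat` and `res` are illustrative, and the data is rebuilt here so the block runs on its own.

```r
# Example data from the question
dat <- data.frame(
  id  = c(1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3),
  foo = c("a", "b", "a", "a", "b", "b", "b", "c", "c", "a", "a")
)

# Cross-tabulate: one row per id, one column per foo value
counts <- table(dat$id, dat$foo)

# The largest count in each row is the maximum repeat for that id
max_repeat <- apply(counts, 1, max)

# Assemble the result; the row names of the table carry the id values
res <- data.frame(
  id         = as.numeric(names(max_repeat)),
  max_repeat = unname(max_repeat)
)
res
```

table() works on both character and factor columns, so this sketch does not depend on how foo is stored.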
