How do I widen this data frame by removing the duplicates and adding frequencies of occurrences instead in R?

Question

I tried the code below, but the frequency column only gives me 0s and 1s. I want the actual counts.

data2 <- as.data.frame(table(unique.data.frame(data)))

The data frame originally looks like this (but it is much larger):

ID    Rating
12    Good
12    Good
16    Good
16    Bad
16    Very Bad
34    Very Good

I want this:

ID    Rating    Freq
12    Good      2
16    Good      1
16    Bad       1
16    Very Bad  1
34    Very Good 1

Answer 1

You can do this in dplyr:

library(dplyr)
df %>% group_by(ID, Rating) %>% tally()

And to sort by count automatically:

df %>% group_by(ID, Rating) %>% tally(sort = TRUE)
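For a reproducible check, here is a minimal sketch that rebuilds the sample data from the question and runs the same pipeline. The name `df` and the column types are assumptions. Note that tally() names the count column `n`, not `Freq`; the rename() step is added here only to match the output the question asks for.

```r
library(dplyr)

# Sample data copied from the question; `df` is an assumed name.
df <- data.frame(
  ID = c(12, 12, 16, 16, 16, 34),
  Rating = c("Good", "Good", "Good", "Bad", "Very Bad", "Very Good"),
  stringsAsFactors = FALSE
)

# Count rows per (ID, Rating) pair; tally() stores the count in column `n`.
counts <- df %>%
  group_by(ID, Rating) %>%
  tally() %>%
  ungroup() %>%
  rename(Freq = n)

print(counts)
```

The counts always sum to the number of rows in the input, which is a quick sanity check on a large data frame.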

Answer 2

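You can use the count() function to count by the combination of ID and Rating. In dplyr the columns are passed unquoted (the character-vector form c("ID", "Rating") is plyr::count() syntax), and name = "Freq" renames the default count column n. The result is sorted by ID, then Rating:

> library(dplyr)
> data_count <- count(data, ID, Rating, name = "Freq")
> data_count
  ID    Rating    Freq
  12    Good      2
  16    Bad       1
  16    Good      1
  16    Very Bad  1
  34    Very Good 1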

Answer 3

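The unique in your code returns only the unique rows of the dataset, so the table output can only be 1 or 0, depending on whether a combination exists. Instead, apply table to the whole dataset and subset the rows where "Freq" is not 0: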

 subset(as.data.frame(table(df1)), Freq!=0)
 #   ID    Rating Freq
 #2  16       Bad    1
 #4  12      Good    2
 #5  16      Good    1
 #8  16  Very Bad    1
 #12 34 Very Good    1
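
To see the difference on the question's sample data, the sketch below compares the two calls. Here `df1` is rebuilt by hand, so the column types are an assumption. It also shows aggregate() as a base R alternative that never creates the zero-count combinations; that variant is not part of the answer above.

```r
# Sample data copied from the question; `df1` matches the name used above.
df1 <- data.frame(
  ID = c(12, 12, 16, 16, 16, 34),
  Rating = c("Good", "Good", "Good", "Bad", "Very Bad", "Very Good"),
  stringsAsFactors = FALSE
)

# After unique(), every remaining row occurs once, so counts are only 0 or 1.
after_unique <- as.data.frame(table(unique(df1)))
print(sort(unique(after_unique$Freq)))

# Tabulating the full data keeps the real counts; drop the empty combinations.
full_counts <- subset(as.data.frame(table(df1)), Freq != 0)
print(full_counts)

# Alternative: add a column of ones and sum it per (ID, Rating) group.
agg_counts <- aggregate(Freq ~ ID + Rating,
                        data = cbind(df1, Freq = 1L),
                        FUN = sum)
print(agg_counts)
```

On a large data frame with many distinct IDs and ratings, table() allocates a cell for every possible combination, so the aggregate() form may be the lighter option.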

r

data-manipulation

bigdata