Convert a data frame column into a frequency distribution in R

Question

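I recently started working on some statistics problems in R and I have a question. I usually code in Python and find `collections.Counter` very useful. However, I have not found an equivalent in R, which is surprising because frequencies are used so heavily in statistics.

For example, I have this table (data frame):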

df ->

c1          c2
reading1    2
reading2    3
reading3    1
reading4    3
reading5    2
reading6    4
reading7    1
reading8    2
reading9    4
reading10   5

I want to get this in R:

value    frequency
    1    2
    2    3
    3    2
    4    2
    5    1

I hope this illustrates what I am trying to do. Any help is appreciated.

For illustration, in Python I can do this:

import collections

df_c2 = [2, 3, 1, 3, 2, 4, 1, 2, 4, 5]
# Count how many times each value occurs in the list
counter = collections.Counter(df_c2)
print(counter)

and get this - Counter({2: 3, 1: 2, 3: 2, 4: 2, 5: 1})
which I can manipulate using loops.

Answer 1

The simplest way is to use table(), which returns a named vector of counts (strictly, an object of class "table"):

> table(df$c2)

1 2 3 4 5 
2 3 2 2 1
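
Since the question mentions manipulating the Python Counter with loops, here is a minimal sketch of doing the same with the result of table(). The vector `c2` is rebuilt from the question's example data, and the variable names are only illustrative.

```r
# Values copied from the c2 column in the question
c2 <- c(2, 3, 1, 3, 2, 4, 1, 2, 4, 5)
counts <- table(c2)

# Look up the count for a single value by name (names are character strings)
counts[["2"]]  # 3

# Iterate over value/count pairs, similar to looping over Counter.items()
for (value in names(counts)) {
  cat(value, ":", counts[[value]], "\n")
}
```

Note that the values become character names, so convert them with as.integer() or as.numeric() if you need to compute with them.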

You can get a data.frame back like this:

> data.frame(table(df$c2))
  Var1 Freq
1    1    2
2    2    3
3    3    2
4    4    2
5    5    1
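
To match the `value` / `frequency` layout asked for in the question, one possible approach is to rename the columns and convert the first column back to numbers (by default it comes back as a factor). This is a sketch under the assumption that the values are integers, as in the example data; `freq` is just an illustrative name.

```r
# Values copied from the c2 column in the question
c2 <- c(2, 3, 1, 3, 2, 4, 1, 2, 4, 5)

# stringsAsFactors = FALSE keeps the values as character rather than factor
freq <- as.data.frame(table(c2), stringsAsFactors = FALSE)
names(freq) <- c("value", "frequency")

# Convert the character labels back to integers for further computation
freq$value <- as.integer(freq$value)
freq
```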

Of course, you can also use a package collection such as the tidyverse.

library(tidyverse)
df %>% 
  select(c2) %>% 
  group_by(c2) %>% 
  summarise(freq = n())
# # A tibble: 5 x 2
#      c2  freq
#   <int> <int>
# 1     1     2
# 2     2     3
# 3     3     2
# 4     4     2
# 5     5     1
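
As a possible shortcut, dplyr's count() combines the grouping and summarising steps into one call. The sketch below rebuilds `df` from the question's example, assumes dplyr is installed, and uses the `name` argument to pick the output column name.

```r
library(dplyr)

# Example data frame rebuilt from the question
df <- data.frame(
  c1 = paste0("reading", 1:10),
  c2 = c(2L, 3L, 1L, 3L, 2L, 4L, 1L, 2L, 4L, 5L)
)

# count() groups by c2 and tallies the rows in each group
freq <- df %>% count(c2, name = "freq")
freq
```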

