Frequency table for multiple variables in R

Question

I want to cross-tabulate the item variables against cat as frequency tables.

df1 <- data.frame(cat =   c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4),
                  item1 = c(0,0,1,0,1,1,0,0,0,1,0,1,0,0,1,0,0,1),
                  item2 = c(1,1,0,1,0,1,1,0,0,0,1,0,1,1,0,0,1,0),
                  item3 = c(0,0,1,0,1,0,0,0,1,0,1,1,1,0,0,1,0,1))

> table(df1$cat, df1$item1)
   
    0 1
  1 3 1
  2 3 2
  3 3 2
  4 2 2

Is there a way to print the frequency tables of all the item variables by cat together?

Thanks

Answer 1

You can try this:

# Cross-tabulate cat against every other column, one table per item
List <- list()
for(i in 2:dim(df1)[2])
{
  List[[i-1]] <- table(df1$cat, df1[,i])
}
List

[[1]]
   
    0 1
  1 3 1
  2 3 2
  3 3 2
  4 2 2

[[2]]
   
    0 1
  1 1 3
  2 3 2
  3 2 3
  4 3 1

[[3]]
   
    0 1
  1 3 1
  2 3 2
  3 2 3
  4 2 2
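
As a possible variation on the loop above, here is a minimal sketch that builds the same tables with lapply() and keeps the item names on the list, so each table can be looked up by name. The names items and tabs are only illustrative and not part of the original answer.

```r
df1 <- data.frame(cat   = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4),
                  item1 = c(0,0,1,0,1,1,0,0,0,1,0,1,0,0,1,0,0,1),
                  item2 = c(1,1,0,1,0,1,1,0,0,0,1,0,1,1,0,0,1,0),
                  item3 = c(0,0,1,0,1,0,0,0,1,0,1,1,1,0,0,1,0,1))

# Every column except cat is treated as an item (an assumption about the data layout)
items <- setdiff(names(df1), "cat")

# One cat-by-value table per item; lapply() carries the column names over
tabs <- lapply(df1[items], function(x) table(cat = df1$cat, value = x))

tabs$item2   # look a single table up by item name
```

Because the list is named, printing tabs labels each table with $item1, $item2 and $item3 instead of [[1]], [[2]] and [[3]].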

Answer 2

You can use tally() to get the frequency of each combination of groups.

library(tidyverse)
df1 <- data.frame(cat = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4),
                  item1 = c(0,0,1,0,1,1,0,0,0,1,0,1,0,0,1,0,0,1),
                  item2 = c(1,1,0,1,0,1,1,0,0,0,1,0,1,1,0,0,1,0),
                  item3 = c(0,0,1,0,1,0,0,0,1,0,1,1,1,0,0,1,0,1)) 

# Convert every column to a factor so that empty combinations can be kept
df1 %>% mutate_if(is.numeric, as.factor) %>% 
  group_by(cat, item1, item2, item3, .drop = FALSE) %>% 
  tally()

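First convert your variables to factors; then group_by() with the .drop argument set to FALSE, followed by tally(), counts every combination of the variables, including those with zero frequency. Remove the .drop argument to omit the zero-frequency combinations.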

   cat item1 item2 item3 n
1    1     0     0     0 0
2    1     0     0     1 0
3    1     0     1     0 3
4    1     0     1     1 0
5    1     1     0     0 0
6    1     1     0     1 1
7    1     1     1     0 0
8    1     1     1     1 0
9    2     0     0     0 1
10   2     0     0     1 1
11   2     0     1     0 1
12   2     0     1     1 0
13   2     1     0     0 0
14   2     1     0     1 1
15   2     1     1     0 1
16   2     1     1     1 0
17   3     0     0     0 0
18   3     0     0     1 0
19   3     0     1     0 1
20   3     0     1     1 2
21   3     1     0     0 1
22   3     1     0     1 1
23   3     1     1     0 0
24   3     1     1     1 0
25   4     0     0     0 0
26   4     0     0     1 1
27   4     0     1     0 1
28   4     0     1     1 0
29   4     1     0     0 1
30   4     1     0     1 1
31   4     1     1     0 0
32   4     1     1     1 0
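
The tally above counts joint combinations of all three items. If what you want is one count per item and value within each cat, as in the question, a rough sketch is to reshape to long format first and then count. This assumes the tidyr and dplyr packages are installed; the names long and wide are only illustrative.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(cat   = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4),
                  item1 = c(0,0,1,0,1,1,0,0,0,1,0,1,0,0,1,0,0,1),
                  item2 = c(1,1,0,1,0,1,1,0,0,0,1,0,1,1,0,0,1,0),
                  item3 = c(0,0,1,0,1,0,0,0,1,0,1,1,1,0,0,1,0,1))

# Stack the item columns: one row per (cat, item name, item value)
long <- df1 %>%
  pivot_longer(-cat, names_to = "item", values_to = "value")

# Count each item/value pair within cat, then spread back to one row per cat
wide <- long %>%
  count(cat, item, value) %>%
  pivot_wider(names_from = c(item, value), values_from = n, values_fill = 0)

wide
```

The values_fill = 0 argument only matters when some item/value pair never occurs within a cat; with this data every pair occurs at least once.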

Alternatively, if that is too unwieldy, you can also try table1() from library(table1).

library(tidyverse)
library(table1)
df1 <- data.frame(cat = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4),
                  item1 = c(0,0,1,0,1,1,0,0,0,1,0,1,0,0,1,0,0,1),
                  item2 = c(1,1,0,1,0,1,1,0,0,0,1,0,1,1,0,0,1,0),
                  item3 = c(0,0,1,0,1,0,0,0,1,0,1,1,1,0,0,1,0,1)) 

df1 <- df1 %>% mutate_if(is.numeric, as.factor)

table1(~ item1 + item2 + item3 | cat, data=df1)

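This gives a table of frequencies and percentages. The first row is your cat variable.

table1() is great for producing HTML frequency tables. Highly recommended. You can do a lot of formatting and labeling to make the table presentable. Here is a tutorial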

Answer 3

Here is a quick solution in base R:

aggregate(.~ cat, df1, table)

  cat item1.0 item1.1 item2.0 item2.1 item3.0 item3.1
1   1       3       1       1       3       3       1
2   2       3       2       3       2       3       2
3   3       3       2       2       3       2       3
4   4       2       2       3       1       2       2
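
One thing to be aware of: although the printout looks like seven columns, aggregate() stores each item's counts as a matrix column, so the result has only four columns. A minimal sketch of flattening it into ordinary columns follows; the names res and flat are only illustrative. It relies on every cat containing both values of every item, which holds for this data. If a value were missing in some group, converting the items to factors first would keep the counts aligned.

```r
df1 <- data.frame(cat   = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4),
                  item1 = c(0,0,1,0,1,1,0,0,0,1,0,1,0,0,1,0,0,1),
                  item2 = c(1,1,0,1,0,1,1,0,0,0,1,0,1,1,0,0,1,0),
                  item3 = c(0,0,1,0,1,0,0,0,1,0,1,1,1,0,0,1,0,1))

res <- aggregate(. ~ cat, df1, table)
dim(res)    # 4 rows, 4 columns: cat plus one matrix column per item

# Passing the columns back through data.frame() splits each matrix
# into separate plain columns
flat <- do.call(data.frame, res)
names(flat)
```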

Answer 4

Here is another approach using ftable and stack from base R:

x <- ftable(cbind(cat = df1[, 1], stack(df1[-1])), row.vars = 1, col.vars = c(3, 2))
x
#     ind    item1   item2   item3  
#     values     0 1     0 1     0 1
# cat                               
# 1              3 1     1 3     3 1
# 2              3 2     3 2     3 2
# 3              3 2     2 3     2 3
# 4              2 2     3 1     2 2

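One (arguable) downside to this approach is that the default data.table or data.frame methods for converting ftables to more usable objects convert the output to a long format. However, if you want to keep the wide format, you can grab SOfun and use ftable2dt.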

library(SOfun)
ftable2dt(x)
#    cat item1_0 item1_1 item2_0 item2_1 item3_0 item3_1
# 1:   1       3       1       1       3       3       1
# 2:   2       3       2       3       2       3       2
# 3:   3       3       2       2       3       2       3
# 4:   4       2       2       3       1       2       2
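
To illustrate the long format mentioned above, here is a small sketch using only base R: as.data.frame() on the ftable returns one row per combination of cat, item and value, with the count in a Freq column. The name long is only illustrative.

```r
df1 <- data.frame(cat   = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4),
                  item1 = c(0,0,1,0,1,1,0,0,0,1,0,1,0,0,1,0,0,1),
                  item2 = c(1,1,0,1,0,1,1,0,0,0,1,0,1,1,0,0,1,0),
                  item3 = c(0,0,1,0,1,0,0,0,1,0,1,1,1,0,0,1,0,1))

x <- ftable(cbind(cat = df1[, 1], stack(df1[-1])), row.vars = 1, col.vars = c(3, 2))

# Default conversion: 4 cats x 3 items x 2 values = 24 rows
long <- as.data.frame(x)
head(long)
```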
