Table of column counts for a data frame

Question

I have a data frame containing columns of categorical data supplied as strings. The categories are the same for every column, e.g.:

myDF <- data.frame(col1 = sample(c("a","b","c"), 10, replace = TRUE),
                   col2 = sample(c("a","b","c"), 10, replace = TRUE),
                   col3 = sample(c("a","b","c"), 10, replace = TRUE))

I want to generate a table of counts per category, column by column.

When every column contains all of the categories, this can be done with apply and the function table, e.g.:

> myDF
   col1 col2 col3
1     a    c    a
2     b    b    b
3     a    a    b
4     b    b    a
5     c    c    a
6     a    a    a
7     a    c    c
8     a    a    c
9     c    a    a
10    a    a    b
> apply(myDF,2,table)
  col1 col2 col3
a    6    5    5
b    2    2    3
c    2    3    2

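However, this does not work if a column is missing some categories, because table does not know which categories to expect. The per-column results then differ in length, so apply returns a list instead of a matrix: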

myDF <- data.frame(col1 = sample(c("a","b","c"), 10, replace = TRUE),
                   col2 = sample(c("a","b","c"), 10, replace = TRUE),
                   col3 = sample(c("a","b"), 10, replace = TRUE))

This gives:

> myDF
   col1 col2 col3
1     c    a    a
2     a    a    b
3     b    a    a
4     c    c    a
5     c    a    a
6     c    c    a
7     c    b    a
8     c    b    b
9     a    a    a
10    b    b    a
> apply(myDF,2,table)    
$col1

a b c 
2 2 6 

$col2

a b c 
5 3 2 

$col3

a b 
8 2

How can I generate a table that looks like the first one, with 0 for any missing categories?

Answer 1

You can collect all the factor levels and use them inside apply:

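# collect the levels from the whole data.frame; as.character() makes this work
# whether the columns are factors or (the default since R 4.0) character vectors
all_levels <- sort(unique(unlist(lapply(myDF, as.character))))

# convert each column to a factor using the levels from above,
# then use table (which will return a zero for any missing levels)
apply(myDF, 2, function(x) table(factor(x, levels = all_levels)))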

Output, for the myDF printed below it:

  col1 col2 col3
a    1    4    7
b    5    2    3
c    4    4    0

> myDF
   col1 col2 col3
1     b    a    a
2     c    b    a
3     c    c    b
4     b    a    b
5     b    c    a
6     c    c    a
7     c    b    a
8     b    a    b
9     a    c    a
10    b    a    a
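
To make the effect of fixing the levels easy to check, here is a minimal sketch that types in the second myDF from the question by hand, so no random sampling is involved. Two things here are my own assumptions rather than part of the answer above: the levels are written out explicitly as c("a", "b", "c"), and sapply is used instead of apply so that the columns are iterated over directly without first coercing the data frame to a matrix.

```r
# Second example from the question, entered by hand so the result is reproducible
myDF <- data.frame(
  col1 = c("c", "a", "b", "c", "c", "c", "c", "c", "a", "b"),
  col2 = c("a", "a", "a", "c", "a", "c", "b", "b", "a", "b"),
  col3 = c("a", "b", "a", "a", "a", "a", "a", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# Without explicit levels, table() only reports the values that actually occur,
# so col3 yields two entries while col1 and col2 yield three
table(myDF$col3)

# With explicit levels (assumed known here), absent categories are counted as 0,
# every column yields a result of the same length, and sapply returns a matrix
all_levels <- c("a", "b", "c")
counts <- sapply(myDF, function(x) table(factor(x, levels = all_levels)))
counts
#   col1 col2 col3
# a    2    5    8
# b    2    3    2
# c    6    2    0
```

Because every per-column result now has length three, the simplification to a matrix succeeds, which is exactly what failed in the question's second example.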

Answer 2

We can use mtabulate from the qdapTools package:

library(qdapTools)
t(mtabulate(myDF))
#    col1 col2 col3
#a    2    5    8
#b    2    3    2
#c    6    2    0

It works for both cases mentioned in the OP's post.
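
If installing an extra package is not an option, a similar result can probably be had in base R with stack and table. This is a sketch of an alternative technique, not what mtabulate does internally. It assumes the columns are character vectors (the default for data.frame since R 4.0; stringsAsFactors = FALSE is set explicitly below), because stack only picks up vector columns and skips factors.

```r
# Same hand-typed data as the second example in the question
myDF <- data.frame(
  col1 = c("c", "a", "b", "c", "c", "c", "c", "c", "a", "b"),
  col2 = c("a", "a", "a", "c", "a", "c", "b", "b", "a", "b"),
  col3 = c("a", "b", "a", "a", "a", "a", "a", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# stack() reshapes to long format: one row per cell, with the cell value in
# `values` and the originating column name in `ind`
long <- stack(myDF)

# Cross-tabulating value against column fills in 0 for combinations that
# never occur, such as category "c" in col3
tab <- table(long)
tab
```

The rows of tab are the categories and the columns are the original column names, matching the layout asked for in the question.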
