How to create frequency tables with xtabs
> data(infert, package = "datasets")
> tt = xtabs(~education + induced + spontaneous, data = infert)
> ftable(tt)
                  spontaneous  0  1  2
education induced
0-5yrs    0                    2  1  1
          1                    1  0  1
          2                    6  0  0
6-11yrs   0                   46 19 13
          1                   15  9  3
          2                   10  5  0
12+ yrs   0                   19 27 15
          1                   29  7  3
          2                   13  3  0
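xtabs generates nice tables, but I would like to know whether it can also show row and column totals. Could it also show some kind of frequency, i.e. N / row total and N / column total?
I tried the CrossTable function from the gmodels package, which works well. However, it seems to handle only 2 variables, and I want to compare more than 2 variables at once.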
> library(gmodels)
> CrossTable(infert$education, infert$induced, expected = TRUE)
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|

Total Observations in Table:  248

                 | infert$induced
infert$education |         0 |         1 |         2 | Row Total |
-----------------|-----------|-----------|-----------|-----------|
          0-5yrs |         4 |         2 |         6 |        12 |
                 |     6.919 |     3.290 |     1.790 |           |
                 |     1.232 |     0.506 |     9.898 |           |
                 |     0.333 |     0.167 |     0.500 |     0.048 |
                 |     0.028 |     0.029 |     0.162 |           |
                 |     0.016 |     0.008 |     0.024 |           |
-----------------|-----------|-----------|-----------|-----------|
         6-11yrs |        78 |        27 |        15 |       120 |
                 |    69.194 |    32.903 |    17.903 |           |
                 |     1.121 |     1.059 |     0.471 |           |
                 |     0.650 |     0.225 |     0.125 |     0.484 |
                 |     0.545 |     0.397 |     0.405 |           |
                 |     0.315 |     0.109 |     0.060 |           |
-----------------|-----------|-----------|-----------|-----------|
         12+ yrs |        61 |        39 |        16 |       116 |
                 |    66.887 |    31.806 |    17.306 |           |
                 |     0.518 |     1.627 |     0.099 |           |
                 |     0.526 |     0.336 |     0.138 |     0.468 |
                 |     0.427 |     0.574 |     0.432 |           |
                 |     0.246 |     0.157 |     0.065 |           |
-----------------|-----------|-----------|-----------|-----------|
    Column Total |       143 |        68 |        37 |       248 |
                 |     0.577 |     0.274 |     0.149 |           |
-----------------|-----------|-----------|-----------|-----------|

Statistics for All Table Factors

Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 =  16.53059     d.f. =  4     p =  0.002383898
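You can use prop.table to generate the frequency table and addmargins to add the margins. Apply prop.table first and addmargins second; in the other order the margin totals are counted as extra cells and every proportion comes out too small.
data(infert, package='datasets')
addmargins(prop.table(xtabs(~education + induced + spontaneous, data=infert)))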
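For the N / row total and N / column total figures asked about in the question, prop.table() accepts a margin argument that picks which totals to divide by. The following is only a minimal sketch, assuming the wanted breakdowns are "within each education level" and "within each education-by-induced row"; the names tt, by_edu and by_row are illustrative and not from the thread.

```r
data(infert, package = "datasets")
tt <- xtabs(~ education + induced + spontaneous, data = infert)

# Counts with "Sum" totals on every dimension, flattened for display
ftable(addmargins(tt))

# Proportions within each education level:
# all cells that share an education level sum to 1
by_edu <- prop.table(tt, margin = 1)

# Proportions within each education x induced combination:
# each row of the flat table sums to 1 across spontaneous
by_row <- prop.table(tt, margin = c(1, 2))
ftable(round(by_row, 3))
```

Wrapping the result in ftable() keeps all three variables in a single flat table, which is the layout CrossTable cannot produce for more than 2 variables.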
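You and @ajerneck might like the following solution:
printProp.xtab <- function(xtab, fmt = '%s (%1.2f%%)', big.mark = ',', na.print = "NA", ...) {
  ## PURPOSE: print an xtab with percentages in
  ## parentheses in addition to counts at every value.
  ## Arguments in ... are passed on to prop.table().
  ## TODO: align the percentages at the decimal point.
  xtab.am <- addmargins(xtab)                      # counts plus "Sum" margins
  xtab.pt.am <- addmargins(prop.table(xtab, ...))  # proportions plus margins
  ## Paste each formatted count together with its percentage
  res <- sprintf(fmt, format(xtab.am, big.mark = big.mark), 100 * xtab.pt.am)
  attributes(res) <- attributes(xtab.am)           # restore dim, dimnames, class
  print(res, quote = FALSE, na.print = na.print)
}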
Using it on the example data:
printProp.xtab(xtabs(~education + induced + spontaneous, data=infert))
, , spontaneous = 0

          induced
education 0             1             2             Sum
  0-5yrs    2 (0.81%)     1 (0.40%)     6 (2.42%)     9 (3.63%)
  6-11yrs  46 (18.55%)   15 (6.05%)    10 (4.03%)    71 (28.63%)
  12+ yrs  19 (7.66%)    29 (11.69%)   13 (5.24%)    61 (24.60%)
  Sum      67 (27.02%)   45 (18.15%)   29 (11.69%)  141 (56.85%)

, , spontaneous = 1

          induced
education 0             1             2             Sum
  0-5yrs    1 (0.40%)     0 (0.00%)     0 (0.00%)     1 (0.40%)
  6-11yrs  19 (7.66%)     9 (3.63%)     5 (2.02%)    33 (13.31%)
  12+ yrs  27 (10.89%)    7 (2.82%)     3 (1.21%)    37 (14.92%)
  Sum      47 (18.95%)   16 (6.45%)     8 (3.23%)    71 (28.63%)

, , spontaneous = 2

          induced
education 0             1             2             Sum
  0-5yrs    1 (0.40%)     1 (0.40%)     0 (0.00%)     2 (0.81%)
  6-11yrs  13 (5.24%)     3 (1.21%)     0 (0.00%)    16 (6.45%)
  12+ yrs  15 (6.05%)     3 (1.21%)     0 (0.00%)    18 (7.26%)
  Sum      29 (11.69%)    7 (2.82%)     0 (0.00%)    36 (14.52%)

, , spontaneous = Sum

          induced
education 0             1             2             Sum
  0-5yrs    4 (1.61%)     2 (0.81%)     6 (2.42%)    12 (4.84%)
  6-11yrs  78 (31.45%)   27 (10.89%)   15 (6.05%)   120 (48.39%)
  12+ yrs  61 (24.60%)   39 (15.73%)   16 (6.45%)   116 (46.77%)
  Sum     143 (57.66%)   68 (27.42%)   37 (14.92%)  248 (100.00%)
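The percentages printed above are shares of the whole table (248 observations). If shares within each spontaneous stratum are wanted instead, one possible approach with base functions only is sketched below. It assumes that the third dimension is the stratifying variable; the names pct and within_slab are illustrative.

```r
data(infert, package = "datasets")
tt <- xtabs(~ education + induced + spontaneous, data = infert)

# Percentages within each level of spontaneous (dimension 3):
# every slab sums to 100
pct <- 100 * prop.table(tt, margin = 3)

# Add totals over education and induced only, so that no "Sum" slab
# is created by adding percentages across different strata
within_slab <- addmargins(pct, margin = c(1, 2))
round(within_slab, 2)
```

Restricting addmargins() to dimensions 1 and 2 is a deliberate choice here: a total across strata would add up percentages that have different denominators.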