R to LaTeX summary table for categorical variables by year

year <- c(2000,2000,2000,2001,2001,2001)
gender <- c("F","M","M","F","F","M")
grade <- c("A","B","C","C","B","A")
df <- data.frame(year,gender,grade)

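I want to build a summary table with as little manual code as possible and automate the process as far as I can. My project has 170 variables to summarize. I tried tidyverse group_by but did not get the result I wanted. I will use xtable to move the table into a LaTeX file. (I tried add.to.row but could not add "Gender" in the first row.)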

This is the result I want.

Please help me build this table. I need the variable names to appear in the table.

You can use pivot_longer and summarise to generate the summary values.

library(tidyverse)

df %>% 
  pivot_longer(-year) %>% 
  group_by(year, name, value) %>% 
  summarise(n = n()) %>% 
  mutate(prop = round(n / sum(n), 3) * 100)

# A tibble: 10 x 5
# Groups:   year, name [4]
    year name   value     n  prop
   <dbl> <chr>  <chr> <int> <dbl>
 1  2000 gender F         1  33.3
 2  2000 gender M         2  66.7
 3  2000 grade  A         1  33.3
 4  2000 grade  B         1  33.3
 5  2000 grade  C         1  33.3
 6  2001 gender F         2  66.7
 7  2001 gender M         1  33.3
 8  2001 grade  A         1  33.3
 9  2001 grade  B         1  33.3
10  2001 grade  C         1  33.3
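
One caveat if you scale this up to many columns: pivot_longer() needs the gathered columns to share a common type, so a mix of character, logical, and numeric columns will raise an error. Below is a minimal sketch, assuming a hypothetical data frame df_mixed (the passed and level columns are invented for illustration and are not from the question), that converts everything except year to character before reshaping:

```r
library(dplyr)
library(tidyr)

# Hypothetical data with mixed column types (not from the question)
df_mixed <- data.frame(
  year   = c(2000, 2000, 2001, 2001),
  gender = c("F", "M", "F", "M"),
  passed = c(TRUE, FALSE, TRUE, TRUE),
  level  = c(1L, 2L, 2L, 1L)
)

long <- df_mixed %>%
  mutate(across(-year, as.character)) %>%  # give every summarised column one type
  pivot_longer(-year)

# Count each level and compute its share within year and variable
summary_long <- long %>%
  count(year, name, value) %>%
  group_by(year, name) %>%
  mutate(prop = round(100 * n / sum(n), 1)) %>%
  ungroup()

summary_long
```

Here count() stands in for the group_by() plus summarise(n = n()) step; the explicit group_by(year, name) makes the denominator of the percentage visible.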

You can also get closer to the table you want by combining the values into a formatted string and then using pivot_wider:

df %>% 
  pivot_longer(-year) %>% 
  group_by(year, name, value) %>% 
  summarise(n = n()) %>% 
  mutate(prop = round(n / sum(n), 3) * 100,
         summary_str = glue::glue("{n}({prop}%)")) %>% 
  pivot_wider(id_cols = c(name, value), names_from = "year", 
              values_from = "summary_str") 

  name   value `2000`   `2001`  
  <chr>  <chr> <glue>   <glue>  
1 gender F     1(33.3%) 2(66.7%)
2 gender M     2(66.7%) 1(33.3%)
3 grade  A     1(33.3%) 1(33.3%)
4 grade  B     1(33.3%) 1(33.3%)
5 grade  C     1(33.3%) 1(33.3%)
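
Since the question mentions xtable, here is a rough sketch of how the wide table could be handed to it. It uses paste0() rather than glue so that the columns are plain character vectors, and converts the tibble to a data frame first; both choices are my own assumptions about what keeps xtable happy, not something the question requires. print.xtable() sanitizes text by default, so the percent signs should come out escaped for LaTeX.

```r
library(dplyr)
library(tidyr)
library(xtable)

df <- data.frame(
  year   = c(2000, 2000, 2000, 2001, 2001, 2001),
  gender = c("F", "M", "M", "F", "F", "M"),
  grade  = c("A", "B", "C", "C", "B", "A")
)

wide <- df %>%
  pivot_longer(-year) %>%
  count(year, name, value) %>%
  group_by(year, name) %>%
  mutate(summary_str = paste0(n, " (", round(100 * n / sum(n), 1), "%)")) %>%
  ungroup() %>%
  pivot_wider(id_cols = c(name, value), names_from = year,
              values_from = summary_str) %>%
  as.data.frame()

# print.results = FALSE returns the LaTeX code as a string instead of printing it
tex <- print(xtable(wide), include.rownames = FALSE, print.results = FALSE)
cat(tex)
```

The string in tex can then be written to a .tex file with writeLines(), or the file argument of print.xtable() can be used directly.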

I mentioned in a comment that you can do this with the tables package. Here is an example:

year <- c(2000,2000,2000,2001,2001,2001)
gender <- c("F","M","M","F","F","M")
grade <- c("A","B","C","C","B","A")

# Our table treats the columns as factors, so save them that way
# I'll change the names to the way we'd like them to appear.

df <- data.frame(Year = factor(year), 
                 Gender = factor(gender),
                 Grade = factor(grade))

library(tables)
# write a small function to format the percent values the way you want.
# The backslash is doubled so that the R string contains a literal "\%".
fmtPercent <- function(x, digits = 1) paste0("(", format(x, digits = digits), "\\%)")

# Calculate the table object.
tab <- tabular(Gender + Grade ~ Year * Heading()*(1 + Percent("col")*Format(fmtPercent())),
               data = df)

# Print it as text.
tab
#>                                    
#>           Year                     
#>           2000         2001        
#>  Gender F 1    (33\%) 2    (67\%)
#>         M 2    (67\%) 1    (33\%)
#>  Grade  A 1    (33\%) 1    (33\%)
#>         B 1    (33\%) 1    (33\%)
#>         C 1    (33\%) 1    (33\%)

Created on 2021-07-31 by the reprex package (v2.0.0)
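
To get LaTeX code rather than the text rendering, recent versions of the tables package provide a toLatex() method for tabular objects (older versions used latex() instead, so check which one your installed version offers). A minimal sketch, repeating the table definition so that it runs on its own:

```r
library(tables)

df <- data.frame(Year   = factor(c(2000, 2000, 2000, 2001, 2001, 2001)),
                 Gender = factor(c("F", "M", "M", "F", "F", "M")),
                 Grade  = factor(c("A", "B", "C", "C", "B", "A")))

# Doubled backslash: the string holds "\%", which LaTeX reads as a literal percent sign
fmtPercent <- function(x, digits = 1) paste0("(", format(x, digits = digits), "\\%)")

tab <- tabular(Gender + Grade ~ Year * Heading()*(1 + Percent("col")*Format(fmtPercent())),
               data = df)

# Emit the table as a LaTeX tabular environment
toLatex(tab)
```

In an R Markdown document targeting PDF, a chunk like this would normally need the chunk option results = "asis" so that the LaTeX is passed through rather than shown as verbatim output.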

I added the escape before the percent sign so that it prints correctly in LaTeX. In the PDF output of an R Markdown document, it looks like this: