R to LaTeX summary table for categorical variables by year
year <- c(2000,2000,2000,2001,2001,2001)
gender <- c("F","M","M","F","F","M")
grade <- c("A","B","C","C","B","A")
df <- data.frame(year,gender,grade)
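I want to build a summary table while keeping manual code to a minimum and automating the process as much as possible. My project has 170 variables to summarise.
I tried tidyverse group_by but did not get the result I wanted.
I will use xtable to export the table to a LaTeX file. (I tried add.to.row but could not add "Gender" in the first row.)
This is the result I want.
Please help me build the table. I need the variable names in the table.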
You can use pivot_longer and summarise to generate the summary values.
library(tidyverse)
df %>%
pivot_longer(-year) %>%
group_by(year, name, value) %>%
summarise(n = n()) %>%
mutate(prop = round(n / sum(n), 3) * 100)
# A tibble: 10 x 5
# Groups: year, name [4]
year name value n prop
<dbl> <chr> <chr> <int> <dbl>
1 2000 gender F 1 33.3
2 2000 gender M 2 66.7
3 2000 grade A 1 33.3
4 2000 grade B 1 33.3
5 2000 grade C 1 33.3
6 2001 gender F 2 66.7
7 2001 gender M 1 33.3
8 2001 grade A 1 33.3
9 2001 grade B 1 33.3
10 2001 grade C 1 33.3
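With 170 variables, some columns may not be stored as character. pivot_longer stacks all selected columns into one value column and can stop with a type error when they are of incompatible types. The following is a minimal sketch of one way around that, assuming a hypothetical numeric-coded column `passed` that is not in the original data; it converts every value to character before stacking.

```r
library(dplyr)
library(tidyr)

# `passed` is a hypothetical numeric-coded categorical column, added only
# to illustrate mixed column types; it is not part of the original data.
df_mixed <- data.frame(
  year   = c(2000, 2000, 2000, 2001, 2001, 2001),
  gender = c("F", "M", "M", "F", "F", "M"),
  passed = c(1, 0, 1, 1, 1, 0)
)

summary_long <- df_mixed %>%
  # Convert every stacked value to character so mixed types can be combined
  pivot_longer(-year, values_transform = list(value = as.character)) %>%
  # One row per year / variable / level, with its count in column n
  count(year, name, value) %>%
  # Percentages are computed within each year and variable
  group_by(year, name) %>%
  mutate(prop = round(100 * n / sum(n), 1)) %>%
  ungroup()

print(summary_long)
```

Nothing in the pipeline names individual variables, so the same code should apply unchanged to a wider data frame.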
You can also get closer to the table you want by combining the values into a formatted string and then using pivot_wider:
df %>%
pivot_longer(-year) %>%
group_by(year, name, value) %>%
summarise(n = n()) %>%
mutate(prop = round(n / sum(n), 3) * 100,
summary_str = glue::glue("{n}({prop}%)")) %>%
pivot_wider(id_cols = c(name, value), names_from = "year",
values_from = "summary_str")
name value `2000` `2001`
<chr> <chr> <glue> <glue>
1 gender F 1(33.3%) 2(66.7%)
2 gender M 2(66.7%) 1(33.3%)
3 grade A 1(33.3%) 1(33.3%)
4 grade B 1(33.3%) 1(33.3%)
5 grade C 1(33.3%) 1(33.3%)
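Since the question mentions xtable, here is a sketch of how a wide table like this might be sent to LaTeX. It rebuilds the table with sprintf instead of glue so the columns are plain character vectors, and it blanks repeated variable names so each one appears only on its first row. The object names `wide` and `latex_lines` are illustrative. It relies on print.xtable's default sanitising to escape the percent signs.

```r
library(dplyr)
library(tidyr)
library(xtable)

df <- data.frame(
  year   = c(2000, 2000, 2000, 2001, 2001, 2001),
  gender = c("F", "M", "M", "F", "F", "M"),
  grade  = c("A", "B", "C", "C", "B", "A")
)

wide <- df %>%
  pivot_longer(-year) %>%
  count(year, name, value) %>%
  group_by(year, name) %>%
  # "n (xx.x%)" as a plain character string
  mutate(summary_str = sprintf("%d (%.1f%%)", n, 100 * n / sum(n))) %>%
  ungroup() %>%
  pivot_wider(id_cols = c(name, value), names_from = year,
              values_from = summary_str) %>%
  # Show each variable name only on its first row
  mutate(name = ifelse(duplicated(name), "", name))

# The default sanitize function of print.xtable escapes "%" as "\%"
latex_lines <- capture.output(
  print(xtable(as.data.frame(wide)), include.rownames = FALSE)
)
cat(latex_lines, sep = "\n")
```

To write the table to a file, print.xtable's `file` argument can be used instead of capturing the output.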
I mentioned in the comments that you can do this with the tables package. Here is an example:
year <- c(2000,2000,2000,2001,2001,2001)
gender <- c("F","M","M","F","F","M")
grade <- c("A","B","C","C","B","A")
# Our table treats the columns as factors, so save them that way
# I'll change the names to the way we'd like them to appear.
df <- data.frame(Year = factor(year),
Gender = factor(gender),
Grade = factor(grade))
library(tables)
# write a small function to format the percent values the way you want.
# The backslash is doubled so the R string holds a literal "\%" for LaTeX.
fmtPercent <- function(x, digits = 1) paste0("(", format(x, digits = digits), "\\%)")
# Calculate the table object.
tab <- tabular(Gender + Grade ~ Year * Heading()*(1 + Percent("col")*Format(fmtPercent())),
data = df)
# Print it as text.
tab
#>
#> Year
#> 2000 2001
#> Gender F 1 (33\%) 2 (67\%)
#> M 2 (67\%) 1 (33\%)
#> Grade A 1 (33\%) 1 (33\%)
#> B 1 (33\%) 1 (33\%)
#> C 1 (33\%) 1 (33\%)
Created on 2021-07-31 by the reprex package (v2.0.0)
I added the escape before the percent sign so that it prints correctly in LaTeX. In the PDF output of an R Markdown document, it looks like this:
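To get the LaTeX source itself rather than the text printout, something like the following sketch may work. It assumes an installed version of tables that provides a toLatex() method for tabular objects (older releases used latex() instead). The name `latex_code` is illustrative.

```r
library(tables)

df <- data.frame(
  Year   = factor(c(2000, 2000, 2000, 2001, 2001, 2001)),
  Gender = factor(c("F", "M", "M", "F", "F", "M")),
  Grade  = factor(c("A", "B", "C", "C", "B", "A"))
)

# Doubled backslash: the R string then contains a literal "\%" for LaTeX
fmtPercent <- function(x, digits = 1) {
  paste0("(", format(x, digits = digits), "\\%)")
}

tab <- tabular(
  Gender + Grade ~ Year * Heading() * (1 + Percent("col") * Format(fmtPercent())),
  data = df
)

# Assumption: toLatex() is available for tabular objects in this version
latex_code <- capture.output(toLatex(tab))
cat(latex_code, sep = "\n")
```

In an R Markdown document with PDF output, the same call would typically go in a chunk with `results = 'asis'` so the LaTeX is passed through unchanged.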