How to decimal-align regression coefficients in LaTeX table output in an rmarkdown document

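In an rmarkdown document, I'm creating a LaTeX table of regression coefficients and standard errors in order to compare several regression models in a single table. I'd like to vertically align the coefficients for each model so that their decimal points line up down the column.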

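I'm using texreg to create the table. By default the coefficients are not decimal-aligned (instead, each string is centered within its column), and I'm looking for a way to get them decimal-aligned. I'm not wedded to texreg, so if you have a solution using xtable, pander, stargazer, or any other method, I'd be interested in that as well. Ideally, I'd like a solution that can be implemented programmatically within the rmarkdown document, rather than tweaking the LaTeX markup after rendering the document to a .tex file.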

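As a bonus, I'd also like to be able to add line breaks in the table headers. For example, in texreg you can use the custom.model.names argument to set the column name for each regression model. In the example below, I'd like to break "Add Horsepower and AM" across two lines so that the column doesn't need to be so wide. I tried "Add Horsepower \newline and AM", but that just added "ewline" to the last column header, and the "\n" was ignored.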
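The "ewline" symptom is consistent with how R parses string literals: `\n` is the newline escape, so a literal backslash has to be written as `\\`. The following is a minimal base-R sketch of that escaping rule only. Whether the resulting `\newline` actually breaks the header line in the rendered table depends on the LaTeX column type, which this sketch does not address.

```r
# "\n" inside an R string literal is the newline escape, so "\newline"
# is read as a newline character followed by the letters "ewline".
bad <- "Add Horsepower \newline and AM"

# Doubling the backslash yields one literal backslash in the string.
good <- "Add Horsepower \\newline and AM"

writeLines(bad)   # two lines of output; the second starts with "ewline"
writeLines(good)  # one line of output containing \newline
```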

Here's a reproducible example:

---
title: "Regression Table"
author: "eipi10"
date: "August 15, 2016"
header-includes:
    - \usepackage{dcolumn}
output: pdf_document
---

```{r, echo=FALSE, message=FALSE, results="asis"}
library(texreg)

m1 = glm(mpg ~ wt + factor(cyl), data=mtcars)
m2 = glm(mpg ~ wt + factor(cyl) + hp + factor(am), data=mtcars)

texreg(list(m1,m2),
       single.row=TRUE, 
       custom.model.names=c("Base Model", "Add Horsepower and AM"),
       custom.coef.names=c("Intercept", "Weight","Cyl: 6", "Cyl: 8", "Horsepower","AM: 1"))
```

Here's what the output table looks like:

Here's an attempt using broom. You'll still need to clean up the labels, though.

library(broom)
library(dplyr)
library(pander)
library(tidyr)

m1 = glm(mpg ~ wt + factor(cyl), data=mtcars)
m2 = glm(mpg ~ wt + factor(cyl) + hp + factor(am), data=mtcars)
# Coefficient estimates for each model, tagged with the model name
base <- tidy(m1) %>% select(term, estimate) %>% mutate(type = "base_model")
with_am_hp <- tidy(m2) %>% select(term, estimate) %>% mutate(type = "Add_Horsepower_and_AM")
models <- bind_rows(base, with_am_hp)
# One row per term, one column per model
formatted_models <- models %>% spread(type, estimate)

# Model-level summary statistics (AIC, BIC, ...), one column per model
glance_table <- data.frame("Add_Horsepower_and_AM" = unlist(glance(m2)), "base_model" = unlist(glance(m1))) %>% mutate(term = row.names(.))

# Stack the coefficient rows on top of the summary-statistic rows
full_results <- bind_rows(formatted_models, glance_table)
pandoc.table(full_results, justify = "left")
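For comparison, the reshaping step (one row per term, one column per model) can be sketched without any add-on packages. This is only an illustration in base R, not part of the answer above; the column names `base_model` and `add_hp_am` are chosen here for the example.

```r
m1 <- glm(mpg ~ wt + factor(cyl), data = mtcars)
m2 <- glm(mpg ~ wt + factor(cyl) + hp + factor(am), data = mtcars)

# One data frame per model: term name plus coefficient estimate
est1 <- data.frame(term = names(coef(m1)), base_model = unname(coef(m1)))
est2 <- data.frame(term = names(coef(m2)), add_hp_am = unname(coef(m2)))

# Full outer join on term; terms absent from a model become NA
wide <- merge(est1, est2, by = "term", all = TRUE)
print(wide, digits = 3)
```

Terms that appear only in the larger model show up as NA in the `base_model` column, which a table-printing function can then render as an empty cell.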

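This took quite a bit of wrangling, but I think it gets you close to what you want. I used xtable. The main idea is to create two columns for each model: one right-aligned (the coefficients) and one left-aligned (the standard errors). So for a table with two models, we have five columns. Headers and summary statistics are displayed in cells that span two columns.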
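The reason the right-aligned/left-aligned pair lines up the decimal points is that every coefficient is rounded to the same number of decimals, so right-justifying puts each decimal point the same distance from the right edge. A minimal plain-text sketch of that idea, using base R's `formatC` and an arbitrary field width of 8 characters (an assumption for this illustration only):

```r
m1 <- glm(mpg ~ wt + factor(cyl), data = mtcars)
coefs <- summary(m1)$coefficients

# Fixed number of decimals, right-justified in an 8-character field:
# the decimal point lands at the same position in every row
coef_col <- formatC(coefs[, "Estimate"], format = "f", digits = 2, width = 8)

# Standard errors in parentheses, left-justified (flag "-")
se_col <- formatC(sprintf("(%0.2f)", coefs[, "Std. Error"]), width = 8, flag = "-")

writeLines(paste(coef_col, se_col))
```

In the LaTeX table the same effect comes from the column types rather than from padding with spaces, and it relies on the font using equal-width digits.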

First, we have header.tex, which draws on p. 27 of the xtable vignette:

\usepackage{array}
\usepackage{tabularx}
\newcolumntype{L}[1]{>{\raggedright\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{C}[1]{>{\centering\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{R}[1]{>{\raggedleft\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{P}[1]{>{\raggedright\tabularxbackslash}p{#1}}

Then the .Rmd file. I learned about add.to.row from this answer.

---
title: "Regression Table"
author: "eipi10"
date: "August 15, 2016"
header-includes:
    - \usepackage{dcolumn}
output: 
  pdf_document:
    includes:
      in_header: header.tex
---

```{r, echo=FALSE, message=FALSE, results="asis"}
library(xtable)
library(broom)   

m1 = glm(mpg ~ wt + factor(cyl), data=mtcars)
m2 = glm(mpg ~ wt + factor(cyl) + hp + factor(am), data=mtcars)

p_val <- c(0, 0.001, 0.01, 0.05, 1)
stars <- sapply(3:0, function(x) paste0(rep("*", x), collapse=""))

make_tbl <- function(model) {
  coefs <- summary(model)$coefficients
  coef_col <- round(coefs[,1], 2)
  se_col <- round(coefs[,2], 2)
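  # rightmost.closed = TRUE keeps a p-value of exactly 1 in the last interval
  star_col <- stars[findInterval(coefs[,4], p_val, rightmost.closed = TRUE)]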
  tbl <- data.frame(coef=coef_col)
  tbl$se <- sprintf("(%0.2f)%s", se_col, star_col)
  tbl
}

make_addtorow <- function(row.name, terms) {
  # xtable allows the addition of custom rows. This function
  # makes a row with one leading column (used for the row
  # names of the model statistics),
  # followed by cells that each span two columns.
  # LaTeX backslashes must be doubled inside R strings.
  paste0(row.name, 
  paste0('& \\multicolumn{2}{C{3cm}}{', 
         terms, 
         '}', 
        collapse=''), 
  '\\\\')
}

tbl1 <- make_tbl(m1)
tbl2 <- make_tbl(m2)
combo <- merge(tbl1, tbl2, by = "row.names", all = TRUE)[,-1]
rownames(combo) <- c("Intercept", "AM: 1", "Cyl: 6", "Cyl: 8", "Horsepower", "Weight")
sum_stats <- round(rbind(glance(m1), glance(m2)), 2)

addtorow <- list()
addtorow$pos <- list(0, 6, 6, 6, 6, 6)
addtorow$command <- c(
  make_addtorow("", c("Base model", "Add Horsepower and AM")),
  make_addtorow("\\hline AIC", sum_stats$AIC), # Draw a line after coefficients
  make_addtorow("BIC", sum_stats$BIC),
  make_addtorow("Log Likelihood", sum_stats$logLik),
  make_addtorow("Deviance", sum_stats$deviance),
  make_addtorow("Num. obs.", sum_stats$df.null + 1)
  )

# add.to.row, include.colnames and comment are print.xtable options,
# so they are passed to print() below rather than to xtable()
xtbl <- xtable(combo)
# Specify column alignment for tabularx environment
# We're using the custom column types we created in header.tex
# \hskip specifies the width between columns
align(xtbl) <- c("L{2.5cm}", "R{1.5cm}@{\\hskip 0.1cm}", "L{1.5cm}", 
                           "R{1.5cm}@{\\hskip 0.1cm}","L{1.5cm}")

print(xtbl, 
      tabular.environment = "tabularx", # tabularx takes two arguments
      width = ".60\\textwidth",        # width, and alignment (specified above)
      add.to.row = addtorow, 
      include.colnames = FALSE,
      comment = FALSE)
```
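To see what one of the custom rows looks like before it reaches LaTeX, the helper can be run on its own. This is a standalone sketch of `make_addtorow` with every LaTeX backslash doubled, as R string literals require; it assumes the `C{...}` column type defined in header.tex.

```r
# Build one table row: a leading label cell, then one two-column-wide
# centered cell per model, terminated by the LaTeX row ending "\\".
make_addtorow <- function(row.name, terms) {
  paste0(row.name,
         paste0("& \\multicolumn{2}{C{3cm}}{", terms, "}", collapse = ""),
         "\\\\")
}

header_row <- make_addtorow("", c("Base model", "Add Horsepower and AM"))
writeLines(header_row)
```

Because the header cells use a fixed-width `C{3cm}` column, LaTeX can wrap a long model name such as "Add Horsepower and AM" within the cell instead of widening the column.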