Different behavior between pdf_document and bookdown::pdf_document2 when using compareGroups
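I'm running into a problem when knitting a document with bookdown::pdf_document2 that does not occur with the standard pdf_document.
Specifically, I'm using the compareGroups library and its export2md function to output comparison tables.
This works when I use output: pdf_document. However, when I use output: bookdown::pdf_document2, the table is not created correctly.
The tex files are clearly different, and I can manually copy the table from the tex produced by pdf_document into the one produced by pdf_document2. Does anyone have an idea how to get bookdown to create the table correctly? I've created a repo that reproduces the bug, with more details: https://github.com/vitallish/bookdown-bug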
Overview
bookdown::pdf_document2() differs from rmarkdown::pdf_document() in that the former sets $knitr$opts_knit$kable.force.latex to TRUE, while the latter leaves it at its default (FALSE).
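Check the .md files
I figured the step from .md to .tex should be the same in both cases, so the difference between the .tex files probably comes from a difference between the .md files. So I ran the code below to keep the intermediate .md files.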
rmarkdown::render('pdf_document.Rmd', clean = FALSE)
file.remove('pdf_document.utf8.md');
rmarkdown::render('pdf_document2.Rmd', clean = FALSE)
file.remove('pdf_document2.utf8.md');
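Before reading the two intermediate files by eye, a quick programmatic check can tell whether a given .knit.md file already contains a raw LaTeX table. The following is only a minimal sketch: has_latex_table is a hypothetical helper, not part of rmarkdown or bookdown, and the two temporary files merely stand in for the real outputs.

```r
# Hypothetical helper (not part of rmarkdown or bookdown): report whether an
# intermediate Markdown file contains a raw LaTeX tabular environment.
has_latex_table <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # fixed = TRUE matches the literal text \begin{tabular}, not a regex
  any(grepl("\\begin{tabular}", lines, fixed = TRUE))
}

# Two small temporary files standing in for the real .knit.md outputs.
md_plain <- tempfile(fileext = ".md")
md_latex <- tempfile(fileext = ".md")
writeLines(c("Var  Male  Female", "---  ----  ------", "Age  54.8  54.7"), md_plain)
writeLines(c("\\begin{table}", "\\begin{tabular}[t]{l|c|c}",
             "\\end{tabular}", "\\end{table}"), md_latex)

has_latex_table(md_plain)  # Markdown-only file
has_latex_table(md_latex)  # file with a raw LaTeX table
```

On the real files this would be called as has_latex_table('pdf_document.knit.md') and has_latex_table('pdf_document2.knit.md').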
pdf_document.knit.md
Table: Summary descriptives table by groups of `Sex'
Var                                               Male N=1101     Female N=1193    p.overall
----------------------------------------------- --------------- ----------------- -----------
Recruitment year:                                                                    0.506
1995                                              206 (18.7%)      225 (18.9%)
2000                                              390 (35.4%)      396 (33.2%)
2005                                              505 (45.9%)      572 (47.9%)
Age                                               54.8 (11.1)      54.7 (11.0)       0.840
Smoking status:                                                                     <0.001
Never smoker                                      301 (28.1%)      900 (77.5%)
Current or former < 1y                            410 (38.3%)      183 (15.7%)
Former >= 1y                                      360 (33.6%)      79 (6.80%)
Systolic blood pressure                           134 (18.9)       129 (21.2)       <0.001
Diastolic blood pressure                          81.7 (10.2)      77.8 (10.5)      <0.001
pdf_document2.knit.md
\begin{table}
\caption{(\#tab:md-output)Summary descriptives table by groups of `Sex'}
\centering
\begin{tabular}[t]{l|c|c|c}
\hline
Var & Male N=1101 & Female N=1193 & p.overall\\
\hline
Recruitment year: & & & 0.506\\
\hline
\ \ \ \ 1995 & 206 (18.7\%) & 225 (18.9\%) & \\
\hline
\ \ \ \ 2000 & 390 (35.4\%) & 396 (33.2\%) & \\
\hline
\ \ \ \ 2005 & 505 (45.9\%) & 572 (47.9\%) & \\
\hline
Age & 54.8 (11.1) & 54.7 (11.0) & 0.840\\
\hline
Smoking status: & & & <0.001\\
\hline
\ \ \ \ Never smoker & 301 (28.1\%) & 900 (77.5\%) & \\
\hline
\ \ \ \ Current or former < 1y & 410 (38.3\%) & 183 (15.7\%) & \\
\hline
\ \ \ \ Former >= 1y & 360 (33.6\%) & 79 (6.80\%) & \\
\hline
Systolic blood pressure & 134 (18.9) & 129 (21.2) & <0.001\\
\hline
Diastolic blood pressure & 81.7 (10.2) & 77.8 (10.5) & <0.001\\
\hline
\end{tabular}
\end{table}
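This explains why the table looks different in the PDF output: pdf_document hands Pandoc a Markdown table, while pdf_document2 hands it a ready-made LaTeX table.
Exploration
To dig further into the cause, I compared the two output format objects:
> pdf1 <- rmarkdown::pdf_document()
> pdf2 <- bookdown::pdf_document2()
> all.equal(pdf1, pdf2)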
[1] "Length mismatch: comparison on first 11 components"
[2] "Component “knitr”: Component “opts_knit”: target is NULL, current is list"
[3] "Component “pandoc”: Component “args”: Lengths (8, 12) differ (string compare on first 8)"
[4] "Component “pandoc”: Component “args”: 8 string mismatches"
[5] "Component “pandoc”: Component “ext”: target is NULL, current is character"
[6] "Component “pre_processor”: target, current do not match when deparsed"
[7] "Component “post_processor”: target is NULL, current is function"
Since knitr is what converts R Markdown into Pandoc Markdown, I guessed that the $knitr component is what causes the difference between the .md files.
> all.equal(pdf1$knitr, pdf2$knitr)
[1] "Component “opts_knit”: target is NULL, current is list"
> pdf2$knitr$opts_knit
$bookdown.internal.label
[1] TRUE
$kable.force.latex
[1] TRUE
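kable is the function that outputs the table, so $knitr$opts_knit$kable.force.latex is probably the root cause.
Verification
To verify my hypothesis, I rendered with a copy of the bookdown format in which that option is switched off: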
pdf3 <- pdf2
pdf3$knitr$opts_knit$kable.force.latex = FALSE
rmarkdown::render('pdf_document3.Rmd', clean = FALSE, output_format = pdf3)
file.remove('pdf_document3.utf8.md')
pdf_document3.knit.md
Var                                               Male N=1101     Female N=1193    p.overall
----------------------------------------------- --------------- ----------------- -----------
Recruitment year:                                                                    0.506
1995                                              206 (18.7%)      225 (18.9%)
2000                                              390 (35.4%)      396 (33.2%)
2005                                              505 (45.9%)      572 (47.9%)
Age                                               54.8 (11.1)      54.7 (11.0)       0.840
Smoking status:                                                                     <0.001
Never smoker                                      301 (28.1%)      900 (77.5%)
Current or former < 1y                            410 (38.3%)      183 (15.7%)
Former >= 1y                                      360 (33.6%)      79 (6.80%)
Systolic blood pressure                           134 (18.9)       129 (21.2)       <0.001
Diastolic blood pressure                          81.7 (10.2)      77.8 (10.5)      <0.001
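Voilà! With kable.force.latex set to FALSE, the intermediate .md file contains a Pandoc Markdown table again.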
Advanced
In fact, compareGroups::export2md uses knitr::kable as its workhorse:
> compareGroups::export2md
function (x, which.table = "descr", nmax = TRUE, header.labels = c(),
    caption = NULL, ...)
{
    if (!inherits(x, "createTable"))
        stop("x must be of class 'createTable'")
    ...
    if (ww %in% c(1)) {
        ...
        table1 <- table1[-1, , drop = FALSE]
        return(knitr::kable(table1, align = align, row.names = FALSE,
            caption = caption[1]))
    }
    if (ww %in% c(2)) {
        table2 <- prepare(x, nmax = nmax, c())[[2]]
        ...
        return(knitr::kable(table2, align = align, row.names = FALSE,
            caption = caption[2]))
    }
}
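knitr::kable, in turn, uses kable.force.latex as an internal option to decide its output format. If you browse knitr's GitHub repository, you can find the following code in the R/utils.R file: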
kable = function(
  x, format, digits = getOption('digits'), row.names = NA, col.names = NA,
  align, caption = NULL, format.args = list(), escape = TRUE, ...
) {
  # determine the table format
  if (missing(format) || is.null(format)) format = getOption('knitr.table.format')
  if (is.null(format)) format = if (is.null(pandoc_to())) switch(
    out_format() %n% 'markdown',
    latex = 'latex', listings = 'latex', sweave = 'latex',
    html = 'html', markdown = 'markdown', rst = 'rst',
    stop('table format not implemented yet!')
  ) else if (isTRUE(opts_knit$get('kable.force.latex')) && is_latex_output()) {
    # force LaTeX table because Pandoc's longtable may not work well with floats
    # http://tex.stackexchange.com/q/276699/9128
    'latex'
  } else 'pandoc'
  if (is.function(format)) format = format()
  ...
  structure(res, format = format, class = 'knitr_kable')
}
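To make the branching in the quoted kable() source easier to follow, here is a minimal sketch that restates it as a plain function. choose_table_format and all of its arguments are hypothetical stand-ins for the knitr internals used above (pandoc_to(), out_format(), opts_knit$get('kable.force.latex'), is_latex_output()). It is not knitr's implementation, and it skips the case where format is a function.

```r
# Simplified, hypothetical restatement of the format decision in knitr::kable().
# Every argument is an assumption standing in for a knitr internal:
#   table_format_option ~ getOption('knitr.table.format')
#   pandoc_to           ~ pandoc_to()
#   out_format          ~ out_format() %n% 'markdown'
#   force_latex         ~ opts_knit$get('kable.force.latex')
#   latex_output        ~ is_latex_output()
choose_table_format <- function(format = NULL,
                                table_format_option = NULL,
                                pandoc_to = NULL,
                                out_format = "markdown",
                                force_latex = FALSE,
                                latex_output = FALSE) {
  # An explicit format, or the global option, wins over everything else.
  if (is.null(format)) format <- table_format_option
  if (!is.null(format)) return(format)

  # Not going through Pandoc: pick the format from the knitr output format.
  if (is.null(pandoc_to)) {
    return(switch(out_format,
                  latex = , listings = , sweave = "latex",
                  html = "html", markdown = "markdown", rst = "rst",
                  stop("table format not implemented yet!")))
  }

  # Going through Pandoc: a raw LaTeX table only if it is forced and the
  # target is LaTeX; otherwise a Pandoc Markdown table.
  if (isTRUE(force_latex) && latex_output) "latex" else "pandoc"
}

# rmarkdown::pdf_document() case: option not set
choose_table_format(pandoc_to = "latex", latex_output = TRUE)
# bookdown::pdf_document2() case: option set to TRUE
choose_table_format(pandoc_to = "latex", force_latex = TRUE, latex_output = TRUE)
```

In this sketch, as in the quoted source, a format supplied explicitly or through the knitr.table.format option is consulted before kable.force.latex is ever looked at.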
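Conclusion
$knitr$opts_knit$kable.force.latex = TRUE causes bookdown::pdf_document2() to insert LaTeX code into the .md file, whereas rmarkdown::pdf_document() keeps Markdown code, which gives Pandoc the chance to produce a nice table.
I don't think this is a bug. Yihui Xie (the author of bookdown) probably has a specific reason for doing this; the comment in the kable source above points to Pandoc's longtable not working well with floats. bookdown::pdf_document2() is under no obligation to behave identically to rmarkdown::pdf_document().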
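If you want the Markdown table without rebuilding the output format object, one possible workaround follows from the kable source quoted above: kable reads getOption('knitr.table.format') before it looks at kable.force.latex, and export2md does not pass a format of its own. The sketch below is an assumption based only on that quoted source. I have not verified it against export2md, and the behavior may differ between knitr versions.

```r
# Place this in a setup chunk, before any table is produced.
# Assumption: kable() consults this option first, as in the source quoted above.
options(knitr.table.format = "pandoc")

# Confirm the option is set for the current session.
getOption("knitr.table.format")

# Optional check, only if knitr is installed: inspect the format kable() used.
if (requireNamespace("knitr", quietly = TRUE)) {
  tab <- knitr::kable(head(mtcars[, 1:3]))
  print(attr(tab, "format"))
}
```

Because the option is global, it affects every kable call in the document, so it is worth checking that other tables still render the way you expect.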
This issue with export2md has been resolved in the latest version (4.0) of the compareGroups package, which is available on GitHub. You can install it by typing:
library(devtools)
devtools::install_github("isubirana/compareGroups")
Hopefully this version will be submitted to CRAN soon.