Loops, knitr and xtable in rmarkdown to create unique tables in multiple reports

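I am completely overhauling my question. I realize it was long and my point got lost.

Here is what I need to do:

Create automated reports for schools, each containing a table that compares the school's data to the district the school is in and to the state as a whole. The state is the entire dataset.

Here is what I understand:

How to create an automated loop that goes through the data and creates a unique PDF report for each school. This post was helpful in setting up the framework for generating the reports.

Here is what I need help with:

I need a table with the following columns: School, District, State. I also need the first column of the table to contain the row labels: sample size, mean, standard deviation.

I am trying to create this in the context of a for loop because I need a unique table in each unique PDF that is created. If there is a better way, I would love to hear it.

Anyway, here is the reproducible example that I have tested. I have not gotten very far in creating the table.

Any help is greatly appreciated.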

driver.r:

# Create dataset
set.seed(500)
School <- rep(1:20, 2)
District <- rep(c(rep("East", 10), rep("West", 10)), 2)
Score <- rnorm(40, 100, 15)
# one ID per row (a length-8 sample would be silently recycled by data.frame)
Student.ID <- sample(1:1000, 40, replace = TRUE)
school.data <- data.frame(School, District, Score, Student.ID)

# prepare for multicore processing
library(parallel)
# generate the rmd files, one for each school in df
library(knitr)
mclapply(unique(school.data$School), function(x) 
  knit("F:/sample-auto/auto.Rmd", 
       output=paste('report_', x, '.Rmd', sep="")))

# generate PDFs from the rmd files, one for each school in df
mclapply(unique(school.data$School), function(x)
  rmarkdown::render(paste0("F:/sample-auto/", paste0('report_', x, '.Rmd'))))

auto.Rmd:

---
title: "Automated Report Generation for Data"
author: "ME"
date: "February 5, 2015"
output: 
  pdf_document:
    toc: true
    number_sections: true
---

```{r, echo=FALSE}
library(xtable)
library(plyr)
df <- data.frame(school.data)
subgroup <- df[school.data$School == x,]
```

# Start of attempt 

```{r results='asis', echo=FALSE}
for (school in unique(subgroup$School)) {
  subgroup2 <- subgroup[subgroup$School == school, ]
  savename <- paste(x, school)
  df2 <- mean(subgroup2$Score, na.rm = TRUE)
  df2 <- data.frame(df2)
  print(xtable(df2))
}
```

I also tried replacing the loop with:

```{r results='asis', echo=FALSE}
df2 <- ddply(school.data, .(School), summarise,
             n = length(School), mean = mean(Score), sd = sd(Score))
print(xtable(df2))
```

This gives me something I don't want, because every school's report gets the data for all schools, not just its own.

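If you use a loop to subset the data before passing it to the .rmd file, you don't really need plyr or ddply to do the split/apply/combine for you. Since you have a lot of observations, the overhead could be noticeable.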
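As a rough sketch of what subsetting in the driver could look like: the function below loops over the schools, builds the subgroup for each one, and hands it to the template through the `envir` argument of `rmarkdown::render`. The names `render_school_reports` and `render_fun` are made up for illustration, and the template is assumed to read `x` and `subgroup` from the environment it is rendered in.

```r
# Hypothetical driver helper: subset first, then render one report per school.
# `render_fun` defaults to rmarkdown::render; it is a parameter only so the
# loop can be tried without pandoc/LaTeX by passing a stand-in function.
render_school_reports <- function(data, template,
                                  render_fun = rmarkdown::render) {
  schools <- unique(data$School)
  vapply(schools, function(x) {
    # environment holding only what the template needs for this school
    env <- new.env()
    env$x <- x
    env$subgroup <- data[data$School == x, ]
    out <- paste0("report_", x, ".pdf")
    render_fun(template, output_file = out, envir = env)
    out
  }, character(1))
}

# Example call (path taken from the question):
# render_school_reports(school.data, "F:/sample-auto/auto.Rmd")
```

With this approach the template would no longer create the dataset or pick `x` itself; it would just summarise the `subgroup` it was given.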

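Also, if you create the subgroups before running the .rmd, you don't need the loop inside the file either. You just need to make a data frame with the statistics you want and use xtable.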
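For a single school that data frame can be built directly in base R, with no plyr involved. A minimal sketch, where the tiny `subgroup` is an invented stand-in for whatever the driver passes in:

```r
# Assumed stand-in for the subgroup passed in by the driver script
subgroup <- data.frame(School = c(1, 1), Score = c(95, 105))

# One-row summary built directly; no split/apply/combine needed
res <- data.frame(
  School = subgroup$School[1],
  n      = nrow(subgroup),
  mean   = mean(subgroup$Score),
  SD     = sd(subgroup$Score)
)
```

Inside a `results='asis'` chunk, `xtable(res)` would then print this as a LaTeX table.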

---
title: "Automated Report Generation for Data"
author: "ME"
date: "February 5, 2015"
output: 
  pdf_document:
    toc: true
    number_sections: true
---

```{r, echo=FALSE}
library(xtable)
library(plyr)
# Create dataset
set.seed(500)
School <- rep(1:20, 2)
District <- rep(c(rep("East", 10), rep("West", 10)), 2)
Score <- rnorm(40, 100, 15)
# one ID per row (a length-8 sample would be silently recycled by data.frame)
Student.ID <- sample(1:1000, 40, replace = TRUE)
school.data <- data.frame(School, District, Score, Student.ID)


x <- unique(school.data$School)[1]
subgroup <- school.data[school.data$School == x, ]
```

# Start of attempt 

```{r results='asis', echo=FALSE}
options(xtable.comment = FALSE)
## for one school, it is redundant to split based on school but works
## likewise, it is redundant to have a loop here to split based on school
## if you have already used a loop to create the subgroup data 
res <- ddply(subgroup, .(School), summarise,
             n = length(School),
             mean = mean(Score),
             SD = sd(Score),
             VAR = var(Score))
xtable(res)

## if you pass in the entire data frame you will get all schools
## then you can subset the one you want
res <- ddply(school.data, .(School), summarise,
             n = length(School),
             mean = mean(Score),
             SD = sd(Score),
             VAR = var(Score))

xtable(res[res$School %in% x, ])
```
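The chunks above produce one row per school rather than the layout the question asks for (columns School, District, State; rows sample size, mean, standard deviation). One possible way to get that layout in base R is sketched below. `summ` and `comparison_table` are hypothetical helper names, and the sketch assumes each school belongs to exactly one district and that the state is the whole data frame.

```r
# Hypothetical helper: n, mean and SD of a numeric vector
summ <- function(v) c(n = length(v), mean = mean(v), sd = sd(v))

# Hypothetical helper: one column each for the school, its district, the state
comparison_table <- function(data, x) {
  school   <- data[data$School == x, ]
  # assumes a school sits in a single district
  district <- data[data$District == school$District[1], ]
  out <- data.frame(
    School   = summ(school$Score),
    District = summ(district$Score),
    State    = summ(data$Score)   # the state is the entire dataset
  )
  rownames(out) <- c("Sample size", "Mean", "Standard deviation")
  out
}

# Same simulated data as in the question (Student.ID left out)
set.seed(500)
school.data <- data.frame(
  School   = rep(1:20, 2),
  District = rep(c(rep("East", 10), rep("West", 10)), 2),
  Score    = rnorm(40, 100, 15)
)
tab <- comparison_table(school.data, 1)

# In a results='asis' chunk of the .Rmd:
# print(xtable::xtable(tab), comment = FALSE)
```

The row names end up as the first column of the printed table, which is where the question wants the statistic labels.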