How to parse a text table in rmarkdown code chunks

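I have an rmarkdown file containing a markdown table that will be updated regularly. Its contents should be parsed in a code chunk so that, for example, ggplot can use them. I do not want to maintain the table inside the code chunk or in a separate file.

How can I read the table from within a code chunk?

Below is starter rmarkdown code with the markdown table.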

---
title: "Parse tables"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
```

# Step 1: Create markdown table as text

That table will be manually updated directly in the markdown file.

Table: Project Timeline

| date       | description |
|------------|-------------|
| 2020-05-11 | Milestone 1 |
| 2020-07-11 | Milestone 2 |
| 2020-07-20 | Milestone 3 |


# Step 2: Parse the table above

The table should be maintained as a markdown table. That seems easier than working directly with
`tibble` or `tribble`. How can I read the table from the code chunk?

```{r}
library(tidyverse)
df <- tibble(date = c("2020-05-11", "2020-07-11", "2020-07-20"), 
             description = c("Milestone 1", "Milestone 2", "Milestone 3"))
df
```

In a code chunk, apply `readLines` to your Rmd file to get its lines as a character vector:

allLines <- readLines("yourFile.Rmd")
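If hard-coding the file name is undesirable, one possible variation (an assumption on my part, not part of the steps described here) is to ask knitr for the file currently being knitted. `knitr::current_input()` returns `NULL` outside of knitting, so the sketch below falls back to the placeholder name `yourFile.Rmd` and guards against a missing file. The variable name `rmdPath` is purely illustrative.

```r
# knitr::current_input() gives the name of the file being knitted,
# or NULL when the code is not run through knitr.
rmdPath <- if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::current_input()
} else {
  NULL
}

# Fall back to a hard-coded placeholder name when run interactively.
if (is.null(rmdPath)) rmdPath <- "yourFile.Rmd"

# Read the lines only if the file is actually there.
allLines <- if (file.exists(rmdPath)) readLines(rmdPath) else character(0)
```

During knitting the working directory is normally the directory of the Rmd file, so the bare file name should resolve; if your setup changes the working directory, that assumption no longer holds.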

Select the lines that start and end with `|`, and drop the second one (the separator row `|-----|-----|`):

tableLines <- allLines[grep("^\\|.*\\|$", allLines)][-2]
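To see what this selection does, here is a small sketch on an in-memory vector instead of a real file. The sample lines are made up for illustration. It also shows an alternative I am assuming may be handy, not something the answer requires: dropping the separator row by pattern rather than by position, which keeps working if the file ever holds more than one table.

```r
# Hypothetical stand-in for the vector returned by readLines()
allLines <- c(
  "Table: Project Timeline",
  "",
  "| date       | description |",
  "|------------|-------------|",
  "| 2020-05-11 | Milestone 1 |",
  "",
  "Some prose with a | pipe in the middle."
)

# Keep only lines that both start and end with a pipe.
# In an R string the pipe must be escaped with a double backslash.
tableLines <- allLines[grepl("^\\|.*\\|$", allLines)]

# Drop separator rows: lines made up solely of pipes, dashes, colons, spaces.
isSeparator <- grepl("^\\|[-:| ]+\\|$", tableLines)
tableLines <- tableLines[!isSeparator]

print(tableLines)
```

Note that a table row with trailing whitespace after the last `|` would not match the `\\|$` anchor, so trimming lines first may be needed for hand-edited files.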

The following code then gives you the table as a matrix whose first row contains the column names:

tableAsMatrix <- t(sapply(strsplit(tableLines, "\\|"), function(pieces){
  stringr::str_trim(pieces[-1])
}))

Finally, convert the matrix, minus its first row, to a data frame, and use that first row to set the column names:

setNames(as.data.frame(tableAsMatrix[-1,,drop = FALSE]), tableAsMatrix[1,])
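The split-and-convert steps can be sketched as one small function. The name `parsePipeTable` is hypothetical, and the sketch uses base R's `trimws` in place of `stringr::str_trim` so that it runs without extra packages. Every column comes back as character, so the `date` column is converted with `as.Date` before it would be useful on a plot axis.

```r
# Hypothetical helper: turn pipe-table lines (header first, separator row
# already removed) into a data frame.
parsePipeTable <- function(tableLines) {
  # Split each row on "|", drop the empty piece before the first pipe,
  # and trim the surrounding whitespace from every cell.
  cells <- lapply(strsplit(tableLines, "\\|"), function(pieces) {
    trimws(pieces[-1])
  })
  tableAsMatrix <- do.call(rbind, cells)

  # The first matrix row holds the column names; the rest is the data.
  df <- as.data.frame(tableAsMatrix[-1, , drop = FALSE],
                      stringsAsFactors = FALSE)
  setNames(df, tableAsMatrix[1, ])
}

tableLines <- c(
  "| date       | description |",
  "| 2020-05-11 | Milestone 1 |",
  "| 2020-07-11 | Milestone 2 |"
)

df <- parsePipeTable(tableLines)
df$date <- as.Date(df$date)  # character -> Date, assuming ISO yyyy-mm-dd
str(df)
```

This assumes every row has the same number of cells; a row with a missing or extra `|` would make `rbind` recycle or misalign values.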

Full code

---
title: "Parse tables"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
```

# Step 1: Create markdown table as text

That table will be manually updated directly in the markdown file.

Table: Project Timeline

| date       | description |
|------------|-------------|
| 2020-05-11 | Milestone 1 |
| 2020-07-11 | Milestone 2 |
| 2020-07-20 | Milestone 3 |


# Step 2: Parse the table above

The table should be maintained as a markdown table. How can I read the table from the code chunk? 

```{r}
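# Read this Rmd file itself; the name must match the name it is saved under
allLines <- readLines("ParseTable.Rmd")

# Keep lines that start and end with "|", then drop the separator row (2nd match)
tableLines <- allLines[grep("^\\|.*\\|$", allLines)][-2]

# Split each row on "|", discard the empty piece before the first pipe, trim cells
tableAsMatrix <- t(sapply(strsplit(tableLines, "\\|"), function(pieces){
  stringr::str_trim(pieces[-1])
}))

# The first matrix row holds the column names; the remaining rows are the data
df <- setNames(as.data.frame(tableAsMatrix[-1,,drop = FALSE]), tableAsMatrix[1,])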

df
```