pandas DataFrame rendered as HTML shows up weirdly (with literal tags) when included in a Jekyll site

Question

To reproduce:

Create a new Jekyll site
Create an IPython notebook, test.ipynb, with the following content, and run it:
```
import pandas as pd    
df = pd.DataFrame([[4,5]], columns=['A', 'B'])
df
```
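Run jupyter nbconvert --to html --template basic test.ipynb
Copy the test.html file created in the previous step to the _includes directory at the root of the Jekyll site
Edit the default welcome-to-jekyll.markdown file under _posts to add the line: {% include test.html %}
Run jekyll serve and navigate to the sample post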

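I expected to see a table representation of my DataFrame, as it appears in my IPython notebook. What I actually get is a mess in which some tags are interpreted properly and others are just rendered as text. It looks like: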

<th id="T_6c043882_e6d3_11e6_badf_889ffafd94e7" class="row_heading level0 row0" rowspan=1> 0
<th class="col_heading level0 col0" colspan=1> A <th class="col_heading level0 col1" colspan=1> B
4   5

Answer 1

The problem lies in attributes such as colspan=1 generated by pandas. Quoting attribute values is optional in HTML5, but not in XHTML.

kramdown, the Markdown parser Jekyll uses by default, only supports valid XHTML.
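
To make the quoting difference concrete, here is a minimal sketch using only the Python standard library. It does not reproduce kramdown's behavior; it only shows that a strict XML parser rejects an unquoted attribute value while a lenient HTML parser accepts it. The helper names `is_well_formed_xml`, `AttrCollector`, and `html_attrs` are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

UNQUOTED = "<th colspan=1>A</th>"    # attribute style shown in the question's output
QUOTED = '<th colspan="1">A</th>'    # same element with the value quoted


def is_well_formed_xml(fragment):
    """Return True if a strict XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


class AttrCollector(HTMLParser):
    """Record the attributes of every start tag seen by a lenient HTML parser."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        self.attrs.extend(attrs)


def html_attrs(fragment):
    """Parse the fragment as HTML and return the (name, value) pairs found."""
    collector = AttrCollector()
    collector.feed(fragment)
    collector.close()
    return collector.attrs


print(is_well_formed_xml(UNQUOTED))  # False: XML requires quoted values
print(is_well_formed_xml(QUOTED))    # True
print(html_attrs(UNQUOTED))          # [('colspan', '1')]: fine as HTML
```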

The best workaround I've found is to normalize the HTML in the notebook. That is, replace the line above that consists of just df with...

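from IPython.display import HTML, display
from bs4 import BeautifulSoup  # BeautifulSoup 4; the older `BeautifulSoup` package is Python 2 only

def sanitize_style(styler):
    # Newer pandas renders a Styler with to_html(); older versions used styler.render()
    soup = BeautifulSoup(styler.to_html(), "html.parser")
    # Re-serializing through BeautifulSoup quotes attribute values and closes open tags
    return soup.prettify()

display(HTML(sanitize_style(df.style)))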

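Hat tip to this answer for the idea of using BeautifulSoup to normalize HTML. Besides quoting unquoted attribute values, it also does things like closing unclosed tags, which isn't a problem in the minimal example above but could cause trouble in more complex ones.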

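If you're rendering a lot of these tables, it may be worth doing the normalization further downstream with a custom nbconvert configuration file, so that you don't have to repeat a lot of boilerplate code in your notebooks.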
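
As a rough illustration of normalizing downstream, the sketch below post-processes HTML text after export instead of inside the notebook. It is not an nbconvert configuration and not BeautifulSoup: it is a standard-library stand-in that only quotes attribute values and self-closes void elements, and it does not close unclosed tags. The names `AttributeQuoter` and `quote_attributes` are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AttributeQuoter(HTMLParser):
    """Re-emit HTML with every attribute value double-quoted."""

    def __init__(self):
        # convert_charrefs=False keeps entities in text (e.g. &amp;) as written.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _emit_start(self, tag, attrs, closer):
        rendered = "".join(
            # A bare attribute (value None) is expanded to name="name".
            ' {}="{}"'.format(name, escape(name if value is None else value, quote=True))
            for name, value in attrs
        )
        self.parts.append("<{}{}{}>".format(tag, rendered, closer))

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs, " /" if tag in VOID_TAGS else "")

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, " /")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:  # void elements were already self-closed
            self.parts.append("</{}>".format(tag))

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&{};".format(name))

    def handle_charref(self, name):
        self.parts.append("&#{};".format(name))

    def handle_comment(self, data):
        self.parts.append("<!--{}-->".format(data))

    def handle_decl(self, decl):
        self.parts.append("<!{}>".format(decl))


def quote_attributes(html_text):
    """Return html_text with all attribute values quoted."""
    parser = AttributeQuoter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


fragment = '<th class="col_heading level0 col0" colspan=1> A</th>'
print(quote_attributes(fragment))
```

One way to use it would be to read the exported test.html, pass its text through `quote_attributes`, and write the result into _includes. For anything beyond simple tables, BeautifulSoup remains the more robust choice because it also repairs unclosed tags.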

You could also try switching from the kramdown parser to redcarpet (change the line markdown: kramdown in _config.yml to markdown: redcarpet, and add gem 'redcarpet' to your Gemfile). This fixed the XHTML problem for me, but introduced some unrelated quirks. Also note that as of May 2016, GitHub Pages only supports kramdown.

xhtml

jekyll

pandas

ipython-notebook