Force Sphinx to interpret Markdown in Python docstrings instead of reStructuredText

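I am using Sphinx to document a Python project, and I would like to use Markdown to format my docstrings. Even with the recommonmark extension, only manually written .md files are covered, not the docstrings.

My extensions are autodoc, napoleon, and recommonmark.

How can I make Sphinx parse Markdown in my docstrings?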

Sphinx's Autodoc extension emits an event named autodoc-process-docstring every time it processes a doc-string. We can hook into that mechanism to convert the syntax from Markdown to reStructuredText.

Unfortunately, Recommonmark does not expose a Markdown-to-reST converter. It maps the parsed Markdown directly to a Docutils object, i.e., the same representation that Sphinx itself creates internally from reStructuredText.

Instead, I use Commonmark for the conversion in my projects because it is fast: much faster than Pandoc, for example. Speed is important as the conversion happens on the fly and handles each doc-string individually. Other than that, any Markdown-to-reST converter would do. M2R2 would be a third example. The downside of any of these is that they do not support Recommonmark's syntax extensions, such as cross-references to other parts of the documentation. They handle basic Markdown only.

To plug in the Commonmark doc-string converter, make sure the package is installed (pip install commonmark) and add the following to Sphinx's configuration file conf.py:

import commonmark

def docstring(app, what, name, obj, options, lines):
    md  = '\n'.join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)
    lines.clear()
    lines += rst.splitlines()

def setup(app):
    app.connect('autodoc-process-docstring', docstring)
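
A note on why the handler above calls lines.clear() and then extends the same list instead of assigning a new one: Sphinx reads the result back from the list it passed in, so the handler has to modify it in place. The sketch below illustrates only that point. fake_convert is a hypothetical stand-in for the Markdown-to-reST step (it just upper-cases the text), and the handlers are called by hand rather than by Sphinx.

```python
def fake_convert(text: str) -> str:
    # Hypothetical stand-in for the real Markdown-to-reST conversion.
    return text.upper()


def rebinding_handler(app, what, name, obj, options, lines):
    # Rebinds the local name only; the caller's list is left untouched.
    lines = fake_convert("\n".join(lines)).splitlines()


def in_place_handler(app, what, name, obj, options, lines):
    converted = fake_convert("\n".join(lines)).splitlines()
    # Slice assignment mutates the caller's list, just like
    # lines.clear() followed by lines += converted.
    lines[:] = converted


doc = ["hello", "world"]
rebinding_handler(None, "function", "f", None, None, doc)
print(doc)  # ['hello', 'world']
in_place_handler(None, "function", "f", None, None, doc)
print(doc)  # ['HELLO', 'WORLD']
```

The first call has no visible effect, which is how a handler written with plain assignment would appear to do nothing in the built documentation.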

Update: Meanwhile, Recommonmark was deprecated in May 2021. The Sphinx extension MyST, a more feature-rich Markdown parser, is the replacement recommended by Sphinx and by Read-the-Docs. MyST does not yet support Markdown in doc-strings either, but the same hook as above can be used to get limited support via Commonmark.

A possible alternative to the approach outlined here is to use MkDocs with the MkDocStrings plug-in, which would eliminate Sphinx and reStructuredText from the process entirely.

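Building on @john-hennig's answer, the following keeps reStructuredText roles such as :py:attr:, :py:class:, etc. intact. This allows you to reference other classes and so on.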

import re
import commonmark

py_attr_re = re.compile(r"\:py\:\w+\:(``[^:`]+``)")

def docstring(app, what, name, obj, options, lines):
    md  = '\n'.join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)
    lines.clear()
    lines += rst.splitlines()

    for i, line in enumerate(lines):
        while True:
            match = py_attr_re.search(line)
            if match is None:
                break

            start, end = match.span(1)
            line_start = line[:start]
            line_end = line[end:]
            line_modify = line[start:end]
            line = line_start + line_modify[1:-1] + line_end
        lines[i] = line

def setup(app):
    app.connect('autodoc-process-docstring', docstring)
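
To see what the regex loop above does in isolation, here is a small sketch using only the standard library. It assumes, as the pattern in the answer implies, that the converter has turned the single backticks of a role target into double backticks; unwrap_roles is a hypothetical helper name and the sample line is made up.

```python
import re

py_attr_re = re.compile(r"\:py\:\w+\:(``[^:`]+``)")


def unwrap_roles(line: str) -> str:
    """Turn :py:role:``target`` back into :py:role:`target`."""
    while True:
        match = py_attr_re.search(line)
        if match is None:
            return line
        start, end = match.span(1)
        # Drop one backtick from each side of the matched target.
        line = line[:start] + line[start + 1:end - 1] + line[end:]


sample = "See :py:class:``Foo`` and :py:attr:``Foo.bar``, not ``literal``."
print(unwrap_roles(sample))
# See :py:class:`Foo` and :py:attr:`Foo.bar`, not ``literal``.
```

Ordinary inline literals that are not preceded by a :py:...: role keep their double backticks, so only cross-references are affected.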

I had to extend the accepted answer by john-hen so that multi-line descriptions of Args: entries are treated as a single parameter:

import commonmark

def docstring(app, what, name, obj, options, lines):
  wrapped = []
  literal = False
  for line in lines:
    if line.strip().startswith(r'```'):
      literal = not literal
    if not literal:
      line = ' '.join(x.rstrip() for x in line.split('\n'))
    indent = len(line) - len(line.lstrip())
    if indent and not literal:
      wrapped.append(' ' + line.lstrip())
    else:
      wrapped.append('\n' + line.strip())
  ast = commonmark.Parser().parse(''.join(wrapped))
  rst = commonmark.ReStructuredTextRenderer().render(ast)
  lines.clear()
  lines += rst.splitlines()

def setup(app):
  app.connect('autodoc-process-docstring', docstring)
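
The line-joining step in the handler above can be tried on its own, without Sphinx or Commonmark. The sketch below pulls that loop out into a hypothetical helper, join_indented, and feeds it made-up input of the kind napoleon might produce; it is only meant to illustrate the pre-processing, not the full conversion.

```python
def join_indented(lines):
    """Fold indented continuation lines into the preceding line."""
    wrapped = []
    literal = False
    for line in lines:
        # A ``` fence toggles literal mode on and off.
        if line.strip().startswith("```"):
            literal = not literal
        if not literal:
            line = line.rstrip()
        indent = len(line) - len(line.lstrip())
        if indent and not literal:
            # Indented line outside a fence: append to the previous line.
            wrapped.append(" " + line.lstrip())
        else:
            # Unindented or fenced line: start a new line.
            wrapped.append("\n" + line.strip())
    return "".join(wrapped)


params = [
    ":param x: My description",
    "    continues here.",
    ":param y: Short.",
]
print(repr(join_indented(params)))
# '\n:param x: My description continues here.\n:param y: Short.'
```

One side effect worth knowing about: lines inside a fenced block are kept on separate lines but are stripped, so any indentation within the fence is lost.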

The current answer by @john-hennig is great, but it seems to fail on Python-style multi-line Args: entries. Here is my fix:


import commonmark


def docstring(app, what, name, obj, options, lines):
    md = "\n".join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)

    lines.clear()
    lines += _normalize_docstring_lines(rst.splitlines())


def _normalize_docstring_lines(lines: list[str]) -> list[str]:
    """Fix an issue with multi-line args which are incorrectly parsed.

    ```
    Args:
        x: My multi-line description which fit on multiple lines
          and continue in this line.
    ```

    Is parsed as (missing indentation):

    ```
    :param x: My multi-line description which fit on multiple lines
    and continue in this line.
    ```

    Instead of:

    ```
    :param x: My multi-line description which fit on multiple lines
        and continue in this line.
    ```

    """
    is_param_field = False

    new_lines = []
    for l in lines:
        if l.lstrip().startswith(":param"):
            is_param_field = True
        elif is_param_field:
            if not l.strip():  # Blank line reset param
                is_param_field = False
            else:  # Restore indentation
                l = "    " + l.lstrip()
        new_lines.append(l)
    return new_lines


def setup(app):
    app.connect("autodoc-process-docstring", docstring)