Write module index to file with pydoc without server

Question

I currently use the following command to generate the documentation for my Python library:

python -m pydoc -w "\myserver.com\my_library"

This works fine: the my_library HTML file contains the documentation from the class / method / function docstrings. It even documents Python files found in subfolders.

I would now like to create and save an index that gives access to all of these files.

The pydoc documentation says this is possible if you start a server:

pydoc -b will start the server and additionally open a web browser to a module index page. Each served page has a navigation bar at the top where you can Get help on an individual item, Search all modules with a keyword in their synopsis line, and go to the Module index, Topics and Keywords pages.

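However, I would like to write the module index page, including relative links to the documentation of the individual files, without the server solution. I could then store the index plus the individual files [one per py file] in a directory that users can access.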

Is this possible, or is there a better way to approach this?

I have looked at Sphinx, but it seems like overkill for my requirements.

Answer 1

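This can basically be achieved by running a small script that:

imports the modules to be documented,
writes the documentation to HTML files, and then
writes the output of the internal function that dynamically generates index.html to a file named index.html.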

This is not a very nice solution, since it depends on the internals of the pydoc module, but it is fairly compact:

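import pydoc
import importlib

# names of the modules to document
module_list = ['sys']
for m in module_list:
    importlib.import_module(m)
    pydoc.writedoc(m)  # writes <m>.html to the current directory

# the monkey patching optionally goes here

# _url_handler is a private pydoc function and may change between versions
with open('index.html', 'w') as out_file:
    out_file.write(pydoc._url_handler('index.html'))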

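Another flaw is that it also creates links to all the built-in modules etc., for which we did not (and, I guess, do not want to) generate documentation.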

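Can we copy the function that creates the index.html file from pydoc.py and modify it so that it only adds links for the desired modules? Unfortunately, this is not straightforward, because the function relies on names from an enclosing (non-local) scope for some of its logic.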
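
If the exact pydoc page layout is not required, one way to sidestep that problem is to write the index by hand instead of reusing pydoc's function. The following is only a minimal sketch, assuming pydoc.writedoc has already produced `<name>.html` for each module in the same directory; `build_index` and its parameters are hypothetical names, not part of pydoc:

```python
from html import escape


def build_index(module_names, title="Index of Modules"):
    """Return a standalone HTML page linking to <name>.html for each module.

    Hypothetical helper (not part of pydoc): it assumes the per-module
    pages sit next to index.html, so the links are relative.
    """
    links = "\n".join(
        '<li><a href="{0}.html">{0}</a></li>'.format(escape(name, quote=True))
        for name in sorted(module_names)
    )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        "<title>{0}</title></head>\n"
        "<body><h1>{0}</h1>\n<ul>\n{1}\n</ul>\n</body></html>\n"
    ).format(escape(title), links)


if __name__ == "__main__":
    # module_list mirrors the list passed to pydoc.writedoc
    module_list = ["sys"]
    with open("index.html", "w", encoding="utf-8") as out_file:
        out_file.write(build_index(module_list))
```

This loses the pydoc styling of the index page, but it does not touch any private pydoc function and lists only the modules that were actually documented.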

The next best solution is to monkey patch the html_index() function that generates this page so that it lists only our modules.

Unfortunately, pydoc._url_handler implements this with a local function rather than a class method, so it gets a little tricky from here.

There is a way to monkey patch it, but it is a bit of a hack:

Before calling _url_handler, we need to:

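define a patched version that only generates links for the elements in our module_list (the detour via __placeholder__ is needed because our module_list is not defined in the scope where the function will run, so we have to do the equivalent of hard-coding it into the function), and
patch the source code of the pydoc module so that it uses this local function instead of the one originally defined.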

This is achieved as follows:

import inspect, ast

__placeholder__ = None

#our patched version, needs to have same name and signature as original
def html_index():
    """Module Index page."""
    names= __placeholder__

    def bltinlink(name):
        return '<a href="%s.html">%s</a>' % (name, name)

    heading = html.heading(
        '<big><big><strong>Index of Modules</strong></big></big>',
        '#ffffff', '#7799ee')
    contents = html.multicolumn(names, bltinlink)
    contents = [heading, '<p>' + html.bigsection(
        'Module List', '#ffffff', '#ee77aa', contents)]

    contents.append(
        '<p align=right><font color="#909090" face="helvetica,'
        'arial"><strong>pydoc</strong> by Ka-Ping Yee'
        '&lt;ping@lfw.org&gt;</font>')
    return 'Index of Modules', ''.join(contents)

#get source and replace __placeholder__ with our module_list
s=inspect.getsource(html_index).replace('__placeholder__', str(module_list))

#create abstract syntax tree, and store the actual function definition in l_index
l_index=ast.parse(s).body[0]
#ast.dump(l_index) #check if you want

#now obtain source from unpatched pydoc, generate ast patch it and recompile:
s= inspect.getsource(pydoc)
m = ast.parse(s)

def find_named_el_ind(body, name):
    '''return the index of the element called `name` in an ast body'''
    for i, e in enumerate(body):
        if getattr(e, 'name', None) == name:
            return i
    raise ValueError('not found!')

#find and replace html_index with our patched html_index
i_url_handler = find_named_el_ind(m.body, '_url_handler')
i_html_index = find_named_el_ind(m.body[i_url_handler].body, 'html_index')
m.body[i_url_handler].body[i_html_index] = l_index

#compile and replace module in memory
co = compile(m, '<string>', 'exec')
exec(co, pydoc.__dict__)

#ast.dump(m.body[i_url_handler]) #check ast if you will
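
The AST replacement used above can be tried out on a toy example before applying it to pydoc itself. The sketch below is only an illustration of the technique: `SOURCE`, `PATCH`, `outer`, `helper` and `find_named` are made-up names standing in for pydoc's source, the patched function, `_url_handler`, `html_index` and `find_named_el_ind` respectively.

```python
import ast

# Toy stand-in for pydoc's source: an outer function with a local helper.
SOURCE = '''
def outer():
    def helper():
        return "original"
    return helper()
'''

# Replacement for the local helper; same name and signature as the original.
PATCH = '''
def helper():
    return "patched"
'''


def find_named(body, name):
    """Return the index of the statement called `name` in an AST body."""
    for i, node in enumerate(body):
        if getattr(node, "name", None) == name:
            return i
    raise ValueError(name + " not found")


tree = ast.parse(SOURCE)
new_helper = ast.parse(PATCH).body[0]

# Locate outer(), then swap the nested helper() definition inside its body.
outer_def = tree.body[find_named(tree.body, "outer")]
outer_def.body[find_named(outer_def.body, "helper")] = new_helper

# Compile the modified tree and execute it into a fresh namespace.
namespace = {}
exec(compile(tree, "<patched>", "exec"), namespace)
print(namespace["outer"]())
```

Here the patched tree is executed into a fresh dictionary, whereas the pydoc version executes it into `pydoc.__dict__` so that the module's existing names are rebound in place.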
