Generate a self-documenting flow chart from a call structure in Python

Question

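I have many small functions in Python, each only a few lines long, that encode physical relationships between quantities. They depend on one another, so a script might look like this: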

a = f1(x,y)
b = f2(x,a)
c = f3(a,b,z)

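Here x, y, z are fixed inputs that I know, and c is the desired model parameter, which is used at a final stage.

I would like to automatically create graphs/flowcharts from such a piece of code, where each node is a function and each edge corresponds to a return value/argument. Of course, both nodes and edges should be annotated with some kind of docstring.

The motivation is basically practical visualization and error checking, since I will have many such mini-networks. I'm not interested in a call graph, because I only care about a specific set of functions, not all functions.

One way I thought of solving this is to write classes that hold all the metadata for each function (and argument?), and to make each function/variable an instance of such a class. What I'm unsure about is how I would then extract the data for the graph. Is there a common way to do this? Is this a good approach?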

Answer 1

Suppose your functions are defined in a separate library:

todoc/library.py

def f1(x, y):
    """
    f1 is an example concatter

    :param x: Foo (string)
    :param y: Bar (string)

    :return:  FooBar (string)
    """

    return x + y


def f2(x, a):
    """
    f2 is an example multiplier

    :param x: Foo (string)
    :param a: Baz (int)
    :return:  Foo * Baz
    """

    return x * a

and one of the scripts you want to analyze/document is

scriptA.py

from todoc.library import f1, f2

x = 'FOO'
y = 'BAR'
z = 3

a = f1(x, y)
b = f2(a, z)

print(b)

Now you can use the following script to analyze scriptA.py:

analyze_for_doc.py

#!/usr/bin/env python3

import argparse
import ast
from importlib import import_module
from pathlib import Path


class PythonAnalyzer(ast.NodeVisitor):  # Parse python source
    def __init__(self, tree, all_=False, watch=None, recurse=False):
        self._tree = tree
        self._all = all_
        self._recurse = recurse
        self._watch = watch
        self._stack = []

    def run(self):
        self.visit(self._tree)
        return self._stack

    def generic_visit(self, node):
        ncn = node.__class__.__name__  # node class name, e.g. 'Call'
        if (
            (isinstance(self._watch, str) and ncn == self._watch) or
            (isinstance(self._watch, (list, tuple)) and ncn in self._watch)
        ):
            self._stack.append(node)
            if self._recurse:
                self._all = True
                super(PythonAnalyzer, self).generic_visit(node)
                self._all = False

        else:
            if self._all:
                self._stack.append(node)

            super(PythonAnalyzer, self).generic_visit(node)

    def show(self, verbose=False):
        print(f'{self.__class__.__name__:<40s} [{len(self._stack):4d}]')
        for i, node in enumerate(self._stack):
            if verbose:
                print(f'{i:4d} {node.__class__.__name__:<30s} '
                      f'{id(node)} {node} {node.__dict__}')
            else:
                print(f'{i:4d} {node.__class__.__name__:<30s} '
                      f'{id(node):<12x} {node}')


def main(opts):
    content = opts.file.read_text()  # reads and closes the file
    tree = ast.parse(content)

    if opts.debug:
        pa = PythonAnalyzer(tree, all_=True)
        pa.run()
        pa.show(verbose=opts.verbose)

    pa = PythonAnalyzer(tree, watch=('Call', 'ImportFrom'))
    stack = pa.run()

    print(f'Filename: {opts.file}', '=' * 70, sep='\n')
    modules = [m
               for m in stack
               if (isinstance(m, ast.ImportFrom)
                   and m.module  # None for relative "from . import x"
                   and m.module.startswith('todoc.'))]

    fun_to_document = []
    for module in modules:
        print(f'    Module: {module.module}')
        funs = module.names

        mod = import_module(module.module)

        for fun in funs:
            print(f'        Fun: {fun.name}')
            fun_obj = getattr(mod, fun.name)

            if doc := getattr(fun_obj, '__doc__'):
                for line in doc.splitlines():
                    print(f'           |{line}')
                fun_to_document.append(fun.name)

    print('')

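    for call_ in stack:
        if isinstance(call_, ast.Call):
            # Skip attribute calls such as obj.method(), which have no .id,
            # and calls to functions that are not being documented.
            if (not isinstance(call_.func, ast.Name)
                    or call_.func.id not in fun_to_document):
                continue
            print(f'Calling {call_.func.id} in line {call_.lineno} '
                  f'with args={call_.args} kwargs={call_.keywords}')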


if __name__ == '__main__':
    parser = argparse.ArgumentParser('analyze python for doc')
    parser.add_argument('file', type=Path)
    parser.add_argument('--debug', action='store_true')
    parser.add_argument('--verbose', action='store_true')
    opts = parser.parse_args()

    main(opts)

Running analyze_for_doc.py scriptA.py prints:

Filename: scriptA.py
======================================================================
    Module: todoc.library
        Fun: f1
           |
           |    f1 is an example concatter
           |
           |    :param x: Foo (string)
           |    :param y: Bar (string)
           |
           |    :return:  FooBar (string)
           |    
        Fun: f2
           |
           |    f2 is an example multiplier
           |
           |    :param x: Foo (string)
           |    :param a: Baz (int)
           |    :return:  Foo * Baz
           |    

Calling f1 in line 7 with args=[<ast.Name object at 0x102a589d0>, <ast.Name object at 0x102ac9460>] kwargs=[]
Calling f2 in line 8 with args=[<ast.Name object at 0x102b28850>, <ast.Name object at 0x102b28820>] kwargs=[]
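The args in the output above are printed as raw ast.Name objects, which is not very readable. A minimal sketch, assuming Python 3.9+ (for ast.unparse), of how the argument expressions could be turned back into source text. The helper name describe_call is hypothetical and not part of the script above:

```python
import ast


def describe_call(call):
    """Return a readable one-line description of an ast.Call node."""
    func = ast.unparse(call.func)                 # e.g. 'f1'
    args = [ast.unparse(a) for a in call.args]    # e.g. ['x', 'y']
    kwargs = {k.arg: ast.unparse(k.value) for k in call.keywords}
    return f'{func}({", ".join(args)}) kwargs={kwargs}'


# Stand-in for the interesting lines of scriptA.py.
source = "a = f1(x, y)\nb = f2(a, z)\n"
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.Call):
        print(f'line {node.lineno}: {describe_call(node)}')
```

The same helper could be called in place of the raw call_.args / call_.keywords in the final loop of main().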

This should give you a starting point for analyzing your Python scripts to extract documentation information.
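To get from this kind of analysis to an actual graph, one possible next step is to record which function produced each variable and emit one edge per argument. The following is only a sketch under several assumptions: it handles only simple top-level statements of the form name = func(arg, ...), ignores keyword arguments, treats unknown names as external inputs, and writes Graphviz DOT text. The names build_edges and to_dot are hypothetical:

```python
import ast


def build_edges(source, functions):
    """Return (producer, consumer, variable) edges for calls to `functions`."""
    producers = {}  # variable name -> function that produced it
    edges = []
    for stmt in ast.parse(source).body:
        # Only handle the simple form `name = func(arg, ...)`.
        if not (isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)
                and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.value.func, ast.Name)
                and stmt.value.func.id in functions):
            continue
        func = stmt.value.func.id
        for arg in stmt.value.args:
            if isinstance(arg, ast.Name):
                # Names without a known producer are external inputs.
                src = producers.get(arg.id, arg.id)
                edges.append((src, func, arg.id))
        producers[stmt.targets[0].id] = func
    return edges


def to_dot(edges):
    """Render the edges as Graphviz DOT text, labelling edges by variable."""
    lines = ['digraph flow {']
    for src, dst, var in edges:
        lines.append(f'    "{src}" -> "{dst}" [label="{var}"];')
    lines.append('}')
    return '\n'.join(lines)


# The example from the question.
source = """
a = f1(x, y)
b = f2(x, a)
c = f3(a, b, z)
"""
edges = build_edges(source, {'f1', 'f2', 'f3'})
print(to_dot(edges))
```

The printed text can be rendered with the Graphviz dot tool. Docstrings looked up via import_module, as in the script above, could be attached to the nodes as additional DOT attributes such as label or tooltip. Note that an input variable and a function sharing the same name would collapse into one node in this sketch.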

Tags: python, graph