How to detect top-level print() and logging calls that execute on Python module import?

Question

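I want to detect calls to print() and logging (e.g. logging.info()) that are reachable from the top level, i.e. that execute when the module is loaded, and fail the build if any are found.

I maintain a service that other teams commit to frequently, so I'd like to run this as a lint check in CI. How can I do that?

I don't care about non-top-level calls (e.g. calls inside functions). I want to keep allowing other teams to make those if they really want to, since they only run when the teams execute their own code.

So far I have tried or come across a few things that didn't work, usually a dynamic import_module of every Python file I care about, combined with one of the following:

pytest's capsys/capfd fixtures, which fail possibly because of a bug? https://github.com/pytest-dev/pytest/issues/5997#issuecomment-1028710193
For example: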

# foo.py
print("hello")

# test module (a separate file)
from importlib import import_module

def test_does_not_print(capfd):
    import_module('foo')
    out, err = capfd.readouterr()

    assert out == ""  # surprise: this will pass

Walking the AST (it is hard to tell whether a call is reachable from the top level)
Mock Python's built in print function
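To illustrate the AST item above, here is a minimal sketch of a static check. The names LOGGING_FUNCS, is_output_call and find_toplevel_output are made up for this example. It only flags calls written directly at module or class level, so a print() inside a function that is itself called at import time goes unnoticed, which is exactly why reachability is hard to decide this way. It would also flag calls under an `if __name__ == "__main__":` guard.

```python
import ast

# Names of logging module functions to look for (an assumption for this sketch).
LOGGING_FUNCS = {"debug", "info", "warning", "error", "critical", "exception", "log"}


def is_output_call(node):
    """Return True for print(...) or logging.<level>(...) call nodes."""
    if not isinstance(node, ast.Call):
        return False
    func = node.func
    if isinstance(func, ast.Name):
        return func.id == "print"
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return func.value.id == "logging" and func.attr in LOGGING_FUNCS
    return False


def find_toplevel_output(source):
    """Return sorted line numbers of print/logging calls outside function bodies."""
    hits = []

    def visit(node):
        # Function and lambda bodies do not run at import time, so skip them.
        # Simplification: this also skips decorators and default arguments,
        # which do run at import time.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            return
        if is_output_call(node):
            hits.append(node.lineno)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return sorted(hits)
```

Class bodies are visited on purpose, because they execute when the module is imported.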

Answer 1

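Note: what follows is a workaround. capsys/capfd should be able to solve this, but for unknown reasons they do not work in my particular project.

I have been able to do this by monkeypatching the print and logging.info functions at runtime in a standalone script that I can run in CI, e.g.: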

import builtins
from contextlib import contextmanager
import functools as ft
from importlib import import_module
import logging
import os
import sys

orig_print = builtins.print
orig_info, orig_warning, orig_error, orig_critical = logging.info, logging.warning, logging.error, logging.critical
NO_ARG = object()
sys.path.insert(0, 'src')


def main():
    orig_print("Checking files for print() & logging on import...")
    for path in files_under_watch():
        orig_print("  " + path)
        output = detect_toplevel_output(path)
        if output:
            raise SyntaxWarning(f"Top-level output (print & logging) detected in {path}: {output}")


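def should_watch_file(file):
    # Placeholder filter (an assumption): replace with your own rule
    return file.endswith('.py')


def files_under_watch():
    for root, _, files in os.walk('src'):
        for file in files:
            if should_watch_file(file):
                yield os.path.join(root, file)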


def detect_toplevel_output(python_file_path):
    with capture_print() as printed, capture_logging() as logged:
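        # Build the dotted module name relative to 'src', which is on sys.path
        relative_path = os.path.relpath(python_file_path, 'src')
        module_name = os.path.splitext(relative_path)[0].replace(os.sep, '.')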
        import_module(module_name)

    output = {'print': printed, 'logging': logged}
    return {k: v for k, v in output.items() if v}


@contextmanager
def capture_print():
    calls = []

    @ft.wraps(orig_print)
    def captured_print(*args, **kwargs):
        calls.append((args, kwargs))
        return orig_print(*args, **kwargs)

    builtins.print = captured_print
    try:
        yield calls
    finally:
        # Restore the real print even if the import raises
        builtins.print = orig_print


@contextmanager
def capture_logging():
    calls = []

    @ft.wraps(orig_info)
    def captured_info(*args, **kwargs):
        calls.append(('info', args, kwargs))
        return orig_info(*args, **kwargs)

    @ft.wraps(orig_warning)
    def captured_warning(*args, **kwargs):
        calls.append(('warning', args, kwargs))
        return orig_warning(*args, **kwargs)

    @ft.wraps(orig_error)
    def captured_error(*args, **kwargs):
        calls.append(('error', args, kwargs))
        return orig_error(*args, **kwargs)

    @ft.wraps(orig_critical)
    def captured_critical(*args, **kwargs):
        calls.append(('critical', args, kwargs))
        return orig_critical(*args, **kwargs)

    logging.info, logging.warning, logging.error, logging.critical = captured_info, captured_warning, captured_error, captured_critical
    try:
        yield calls
    finally:
        # Restore the real logging functions even if the import raises
        logging.info, logging.warning, logging.error, logging.critical = orig_info, orig_warning, orig_error, orig_critical


if __name__ == '__main__':
    main()
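
One limitation of the script above is that it patches only the module-level functions such as logging.info, so an import-time call through a logger object, e.g. logging.getLogger(__name__).info(...), would presumably slip through. A possible complement is to attach a recording handler to the root logger while importing. This is a rough sketch rather than part of the original workaround: RecordingHandler and capture_log_records are hypothetical names, and it assumes the imported modules let their records propagate to the root logger and do not set stricter levels on their own loggers.

```python
import logging
from contextlib import contextmanager


class RecordingHandler(logging.Handler):
    """Collects every log record it receives instead of writing it anywhere."""

    def __init__(self):
        super().__init__(level=logging.NOTSET)
        self.records = []

    def emit(self, record):
        self.records.append(record)


@contextmanager
def capture_log_records():
    """Attach a recording handler to the root logger for the duration of the block."""
    root = logging.getLogger()
    handler = RecordingHandler()
    old_level = root.level
    root.addHandler(handler)
    root.setLevel(logging.NOTSET)  # let records of every level through
    try:
        yield handler.records
    finally:
        # Undo both changes even if the import inside the block raises
        root.removeHandler(handler)
        root.setLevel(old_level)
```

It could be combined with capture_print() in detect_toplevel_output, treating a non-empty records list as top-level logging output.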

python

continuous-integration

pytest