How to detect top-level print() and logging calls that execute on Python module import?

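I want to detect whether calls to print() and logging functions (e.g. logging.info()) are reachable at the top level, i.e. executed when the module is loaded, and fail the build if any are found.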

I maintain a service that other teams commit to frequently, so I'd like this to run as a lint check in CI. How can I do this?

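I don't care about non-top-level calls (e.g. calls inside a function). I want to keep allowing other teams to make those if they really want to, since they only run when the teams execute their own code.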
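
To make the top-level vs. inside-a-function distinction concrete, here is a minimal static sketch using the standard ast module. The names ImportTimeOutputFinder, find_import_time_output and LOGGING_FUNCS are hypothetical, and the check rests on assumptions: it only recognizes calls spelled literally as print(...) or logging.<name>(...), it skips function and lambda bodies entirely (so decorators and default arguments, which do run at import, are missed), and it would also flag calls under an `if __name__ == "__main__":` guard.

```python
import ast

# Assumption: these are the module-level logging functions worth flagging.
LOGGING_FUNCS = {"debug", "info", "warning", "error", "critical", "exception", "log"}


class ImportTimeOutputFinder(ast.NodeVisitor):
    """Collect print()/logging.<name>() calls that are not inside a function body."""

    def __init__(self):
        self.hits = []  # list of (line number, call name)

    # Function and lambda bodies only run when called, so do not descend into them.
    def visit_FunctionDef(self, node):
        pass

    visit_AsyncFunctionDef = visit_FunctionDef
    visit_Lambda = visit_FunctionDef

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Name) and func.id == "print":
            self.hits.append((node.lineno, "print"))
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "logging"
            and func.attr in LOGGING_FUNCS
        ):
            self.hits.append((node.lineno, f"logging.{func.attr}"))
        self.generic_visit(node)  # keep looking inside the call's arguments


def find_import_time_output(source):
    """Return (lineno, name) pairs for calls that would run at import time."""
    finder = ImportTimeOutputFinder()
    finder.visit(ast.parse(source))
    return finder.hits


sample = '''
import logging
print("loading")

def handler():
    print("fine, only runs when called")

class Config:
    logging.info("class bodies run at import too")
'''
hits = find_import_time_output(sample)
```

Class bodies are visited on purpose, because they execute when the module is imported. A static check like this never runs the code, so it cannot see output produced indirectly (for example by a helper function called at module level).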

So far I've tried a few things without success, generally a dynamic import_module of every Python file I care about, followed by something like this:

# foo.py
print("hello")

# the test, in a separate file
from importlib import import_module

def test_does_not_print(capfd):
    import_module('foo')
    out, err = capfd.readouterr()

    assert out == ""  # surprise: this will pass

Note: what follows is a workaround. capsys/capfd should solve this problem, but for unknown reasons they don't work in my particular project.
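
One possible cause, which is only a guess for this project: if the module was already imported earlier in the same process (by a conftest, another test, or the package's __init__), import_module returns the cached entry from sys.modules and the module body does not run again, so there is nothing to capture. The sketch below demonstrates that behavior with a throwaway module; noisy_mod_example and import_and_capture are hypothetical names, and contextlib.redirect_stdout stands in for capfd.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Create a throwaway module (hypothetical name) that prints at import time.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "noisy_mod_example.py"), "w") as f:
    f.write('print("hello")\n')
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()


def import_and_capture(name):
    """Import `name` and return whatever it wrote to sys.stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        importlib.import_module(name)
    return buf.getvalue()


first = import_and_capture("noisy_mod_example")   # module body runs
second = import_and_capture("noisy_mod_example")  # cached in sys.modules: body does not run
sys.modules.pop("noisy_mod_example", None)         # drop the cached module
third = import_and_capture("noisy_mod_example")   # module body runs again
```

If this is what is happening, removing the module from sys.modules before calling import_module inside the test would make the output visible again.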

I've been able to do this by monkeypatching the print and logging.info functions at runtime in a standalone script that I can run in CI, for example:

import builtins
from contextlib import contextmanager
import functools as ft
from importlib import import_module
import logging
import os
import sys

orig_print = builtins.print
orig_info, orig_warning, orig_error, orig_critical = logging.info, logging.warning, logging.error, logging.critical
NO_ARG = object()
sys.path.insert(0, 'src')


def main():
    orig_print("Checking files for print() & logging on import...")
    for path in files_under_watch():
        orig_print("  " + path)
        output = detect_toplevel_output(path)
        if output:
            raise SyntaxWarning(f"Top-level output (print & logging) detected in {path}: {output}")


def files_under_watch():
    for root, _, files in os.walk('src'):
        for file in files:
            if should_watch_file(file):  # your impl here
                yield os.path.join(root, file)


def detect_toplevel_output(python_file_path):
    with capture_print() as printed, capture_logging() as logged:
        # Turn the file path into a dotted module name (os.sep keeps this portable).
        module_name = os.path.splitext(python_file_path)[0].replace(os.sep, '.')
        import_module(module_name)

    output = {'print': printed, 'logging': logged}
    return {k: v for k, v in output.items() if v}


@contextmanager
def capture_print():
    calls = []

    @ft.wraps(orig_print)
    def captured_print(*args, **kwargs):
        calls.append((args, kwargs))
        return orig_print(*args, **kwargs)

    builtins.print = captured_print
    try:
        yield calls
    finally:
        # Restore the real print() even if the import raises.
        builtins.print = orig_print


@contextmanager
def capture_logging():
    calls = []

    @ft.wraps(orig_info)
    def captured_info(*args, **kwargs):
        calls.append(('info', args, kwargs))
        return orig_info(*args, **kwargs)

    @ft.wraps(orig_warning)
    def captured_warning(*args, **kwargs):
        calls.append(('warning', args, kwargs))
        return orig_warning(*args, **kwargs)

    @ft.wraps(orig_error)
    def captured_error(*args, **kwargs):
        calls.append(('error', args, kwargs))
        return orig_error(*args, **kwargs)

    @ft.wraps(orig_critical)
    def captured_critical(*args, **kwargs):
        calls.append(('critical', args, kwargs))
        return orig_critical(*args, **kwargs)

    logging.info, logging.warning, logging.error, logging.critical = captured_info, captured_warning, captured_error, captured_critical
    try:
        yield calls
    finally:
        # Restore the real logging functions even if the import raises.
        logging.info, logging.warning, logging.error, logging.critical = orig_info, orig_warning, orig_error, orig_critical


if __name__ == '__main__':
    main()
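
One gap in the script above is that it only patches the module-level functions logging.info and friends, so a call such as logging.getLogger(__name__).info(...) at import time goes unnoticed. A possible complement, sketched here under assumptions, is to attach a collecting handler to the root logger while the import runs. RecordCollector and capture_log_records are hypothetical names; the sketch assumes the loggers in question propagate to the root logger and do not set a stricter level of their own.

```python
import logging
from contextlib import contextmanager


class RecordCollector(logging.Handler):
    """Handler that stores every record it receives instead of emitting it."""

    def __init__(self):
        super().__init__(level=logging.NOTSET)
        self.records = []

    def emit(self, record):
        self.records.append(record)


@contextmanager
def capture_log_records():
    """Collect records that reach the root logger while the block runs."""
    root = logging.getLogger()
    collector = RecordCollector()
    old_level = root.level
    root.addHandler(collector)
    # The root logger defaults to WARNING; lower it so info/debug are seen too.
    root.setLevel(logging.DEBUG)
    try:
        yield collector.records
    finally:
        # Undo both changes even if the import raises.
        root.setLevel(old_level)
        root.removeHandler(collector)


# Usage: a named logger's record propagates up to the root handler.
with capture_log_records() as records:
    logging.getLogger("some.module").info("emitted via a named logger")
messages = [r.getMessage() for r in records]
```

In the script above, this context manager could wrap the import_module call alongside capture_print(). Loggers with propagate set to False would still need separate handling.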