Can structured logging be done with Python's standard library?

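I recently read about structured logging (here). The idea seems to be to log JSON objects rather than appending plain strings as lines to a log file. This makes it possible to analyze log files with automated tools.

Can Python's logging library do structured logging? If not, is there a "mainstream" solution for it (in the way that numpy/scipy is the mainstream solution for scientific computing)? I found structlog, but I am not sure how widespread it is.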

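Have you looked at the section of the Python docs (in the Logging Cookbook) titled "Implementing structured logging"? It explains how Python's built-in logger can be used for structured logging.

Below is a simple example taken from that page.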

import json
import logging

class StructuredMessage(object):
    def __init__(self, message, **kwargs):
        self.message = message
        self.kwargs = kwargs

    def __str__(self):
        return '%s >>> %s' % (self.message, json.dumps(self.kwargs))

m = StructuredMessage   # optional, to improve readability

logging.basicConfig(level=logging.INFO, format='%(message)s')
logging.info(m('message 1', foo='bar', bar='baz', num=123, fnum=123.456))

This produces the following log line. The key order may differ depending on the Python version.

message 1 >>> {"fnum": 123.456, "num": 123, "bar": "baz", "foo": "bar"}
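Note that the line above is a text prefix followed by JSON, so a tool has to split on `>>>` before parsing. A minimal sketch of a variant that renders the whole line as one JSON object is shown below. The class name `JsonMessage` and the logger name are my own, not from the docs, and an in-memory stream stands in for a log file so the result can be parsed back.

```python
import io
import json
import logging


class JsonMessage:
    """Hypothetical variant of StructuredMessage that renders as pure JSON."""

    def __init__(self, message, **kwargs):
        self.message = message
        self.kwargs = kwargs

    def __str__(self):
        # Merge the message and the extra fields into a single object so the
        # whole log line is valid JSON. A field named "message" would
        # overwrite the message text.
        return json.dumps({"message": self.message, **self.kwargs})


# In-memory stream used here only so the output can be read back.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))

logger = logging.getLogger("json_message_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output away from the root logger
logger.addHandler(handler)

logger.info(JsonMessage("message 1", foo="bar", num=123))

# Each line can now be handed to json.loads directly.
parsed = [json.loads(line) for line in stream.getvalue().splitlines()]
print(parsed[0]["num"])
```

Because `logging` only calls `str()` on the message object when a handler actually emits the record, the JSON encoding cost is skipped for records that are filtered out by level.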

Hope this helps.

If you install python-json-logger (288 stars, 70 forks) and use a logging configuration (YAML) like the one below, you will get a structured log file.

version: 1
formatters:
    detailed:
        class: logging.Formatter
        format: '[%(asctime)s]:[%(levelname)s]: %(message)s'
    json:
        class: pythonjsonlogger.jsonlogger.JsonFormatter
        format: '%(asctime)s %(levelname)s %(message)s'
handlers:
    console:
        class: logging.StreamHandler
        level: INFO
        formatter: detailed
    file:
        class: logging.FileHandler
        filename: logfile.log
        level: DEBUG
        formatter: json
root:
    level: DEBUG
    handlers:
        - console
        - file
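The answer does not show how the configuration is applied, so here is a rough sketch using `logging.config.dictConfig`. With PyYAML installed, the dict would normally come from `yaml.safe_load(open(...))`. To keep the sketch runnable with the standard library only, the dict is written inline and the `json` formatter points at `StandInJsonFormatter`, a hypothetical stand-in for python-json-logger's `JsonFormatter`. The temporary log path is also an assumption for the demo.

```python
import json
import logging
import logging.config
import os
import tempfile


class StandInJsonFormatter(logging.Formatter):
    """Hypothetical stdlib-only stand-in for pythonjsonlogger's JsonFormatter."""

    def format(self, record):
        return json.dumps({
            "asctime": self.formatTime(record),
            "levelname": record.levelname,
            "message": record.getMessage(),
        })


def build_config(log_path):
    # Same shape as the YAML above (what yaml.safe_load would return),
    # except that the "json" formatter uses the stand-in class via "()".
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "detailed": {"format": "[%(asctime)s]:[%(levelname)s]: %(message)s"},
            "json": {"()": StandInJsonFormatter},
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "level": "INFO",
                "formatter": "detailed",
            },
            "file": {
                "class": "logging.FileHandler",
                "filename": log_path,
                "level": "DEBUG",
                "formatter": "json",
            },
        },
        "root": {"level": "DEBUG", "handlers": ["console", "file"]},
    }


# Write to a temporary directory instead of logfile.log in the working directory.
log_path = os.path.join(tempfile.mkdtemp(), "logfile.log")
logging.config.dictConfig(build_config(log_path))

demo_logger = logging.getLogger("dictconfig_demo")
demo_logger.debug("written to the file only")    # below the console's INFO level
demo_logger.info("written to both handlers")

with open(log_path) as fh:
    file_entries = [json.loads(line) for line in fh]
print(len(file_entries))
```

The console handler keeps the human-readable `detailed` format while the file receives one JSON object per line, which is the split the YAML configuration sets up.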

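Exceptions

You may also want exceptions and tracebacks to use the structured format.

Since Python 3.2, this can be done with the standard library alone, without external dependencies: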

from datetime import datetime
import json
import logging
import sys
import traceback


APP_NAME = 'hello world json logging'
APP_VERSION = 'git rev-parse HEAD'
LOG_LEVEL = logging.INFO  # public constant instead of the private logging._nameToLevel mapping


class JsonEncoderStrFallback(json.JSONEncoder):
  def default(self, obj):
    try:
      return super().default(obj)
    except TypeError as exc:
      if 'not JSON serializable' in str(exc):
        return str(obj)
      raise


class JsonEncoderDatetime(JsonEncoderStrFallback):
  def default(self, obj):
    if isinstance(obj, datetime):
      return obj.strftime('%Y-%m-%dT%H:%M:%S%z')
    else:
      return super().default(obj)


logging.basicConfig(
  format='%(json_formatted)s',
  level=LOG_LEVEL,
  handlers=[
    # if you wish to also log to a file -- logging.FileHandler(log_file_path, 'a'),
    logging.StreamHandler(sys.stdout),
  ],
)


_record_factory_bak = logging.getLogRecordFactory()
def record_factory(*args, **kwargs) -> logging.LogRecord:
  record = _record_factory_bak(*args, **kwargs)
  
  record.json_formatted = json.dumps(
    {
      'level': record.levelname,
      'unixtime': record.created,
      'thread': record.thread,
      'location': '{}:{}:{}'.format(
        record.pathname or record.filename,
        record.funcName,
        record.lineno,
      ),
      'exception': record.exc_info,
      'traceback': traceback.format_exception(*record.exc_info) if record.exc_info else None,
      'app': {
        'name': APP_NAME,
        'releaseId': APP_VERSION,
        'message': record.getMessage(),
      },
    },
    cls=JsonEncoderDatetime,
  )

  # clear exc data since it is included in the json format
  # without clearing this, logging.exception will print the
  # traceback across multiple lines, which is not json formatted
  record.exc_info = None
  record.exc_text = None

  return record
logging.setLogRecordFactory(record_factory)

Calling logging.info('HELLO %s', 'WORLD') ...

... results in {"level": "INFO", "unixtime": 1623532882.421775, "thread": 4660305408, "location": "<ipython-input-3-abe3276ceab4>:<module>:1", "exception": null, "traceback": null, "app": {"name": "hello world json logging", "releaseId": "git rev-parse HEAD", "message": "HELLO WORLD"}}
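The sample call above does not involve an exception. As an illustration of the exception case, here is a small alternative sketch that puts the JSON encoding in a `logging.Formatter` subclass rather than in the global record factory, so only the handlers that use this formatter are affected. The names `JsonTracebackFormatter` and `json_traceback_demo` are hypothetical, and this is a different technique from the record factory shown in this answer, not a drop-in replacement for it.

```python
import io
import json
import logging


class JsonTracebackFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per record, traceback included."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "unixtime": record.created,
            "message": record.getMessage(),
            # formatException returns the whole traceback as one string;
            # json.dumps escapes its newlines, so the entry stays on one line.
            "traceback": (
                self.formatException(record.exc_info) if record.exc_info else None
            ),
        }
        return json.dumps(payload)


# In-memory stream standing in for sys.stdout so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonTracebackFormatter())

logger = logging.getLogger("json_traceback_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

try:
    1 / 0
except ZeroDivisionError:
    # logger.exception logs at ERROR level and attaches the active exception.
    logger.exception("division failed")

entry = json.loads(stream.getvalue())
print(entry["traceback"].splitlines()[-1])
```

Since `format` is overridden completely, the default behavior of appending the multi-line traceback after the message never runs, which is why there is no need to clear `record.exc_info` here.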