Parsing dbus monitor output messages

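I am trying to parse dbus-monitor output messages. Most of its messages are multi-line entries (including the parameters). I need to parse each log message and join it into a single-line entry.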

The dbus-monitor output looks like this:

method call time=462.117843 sender=:1.62 -> destination=org.freedesktop.filehandler serial=122 path=/org/freedesktop/filehandler/routing; interface=org.freedesktop.filehandler.routing; member=start
int16 29877
uint16 0
method return time=462.117844 sender=org.freedesktop.filehandler -> destination=:1.62 serial=2210 reply_serial=122
int16 29877
uint16 0
method call time=462.117845 sender=:1.62 -> destination=org.freedesktop.filehandler serial=123 path=/org/freedesktop/filehandler/routing; interface=org.freedesktop.filehandler.routing; member=comment
string "starting .."
string "routing"
method return time=462.117846 sender=:1.19 -> destination=:1.62 serial=2212 reply_serial=123
int12 -23145
signal time=463.11223 sender=:1.64 -> destination=(null destination) serial=124 path=/org/freedesktop/fileserver; interface=org.freedesktop.DBus.Properties; member=PropertiesChanged
  string "com.freedesktop.Systemserver"
  array[
    dict entry(
      string "SystemTime"
      variant       struct{
            byte 12
            byte 9
            byte 0
        }
    )
  ]
  array [
  ]

This is the regular expression I tried for grouping the dbus messages (the parameters are not grouped):

\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?

I would like the output in the following format:

C [sender,serial] path interface+member (parameter1, parameter2, ...)
R [destination,reply_serial] interface+member (parameter1, parameter2, ...)
S [sender, serial] path interface+member (parameter1, parameter2, ...)

The expected output for the dbus-monitor messages above would be:

C [:1.62,122] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.start (29877,0)
R [:1.62,122] org.freedesktop.filehandler.routing.start (29877,0)
C [:1.62,123] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.comment ("starting", "routing")
R [:1.62,123] org.freedesktop.filehandler.routing.comment (-23145)
S [:1.64, 124] /org/freedesktop/fileserver org.freedesktop.DBus.Properties.PropertiesChanged ("com.freedesktop.Systemserver"[("SystemTime",{12,9,0})][])

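How can I get the expected result above, given that entries are usually multi-line? In addition, signals have several levels of nesting, which makes the parameters hard to get at. Can anyone help parse these dbus messages into the expected format?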

If you absolutely must use dbus-monitor, you would be better off using its PCAP output mode by passing the --pcap option to it. That outputs in a well-documented structured format which can be read by libpcap.
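As a rough illustration of the --pcap suggestion, here is a minimal Python 3 sketch that runs dbus-monitor for a fixed time and saves its PCAP output to a file. The helper names build_command and capture, the output file name, and the choice of a time limit are all assumptions made for this example. Only the --session, --system and --pcap options come from dbus-monitor itself.

```python
import subprocess

def build_command(bus="session"):
    """Return the dbus-monitor argument list for PCAP output (hypothetical helper)."""
    if bus not in ("session", "system"):
        raise ValueError("bus must be 'session' or 'system'")
    return ["dbus-monitor", "--" + bus, "--pcap"]

def capture(outfile, seconds, bus="session"):
    """Run dbus-monitor for roughly `seconds` seconds, writing PCAP data to outfile."""
    with open(outfile, "wb") as fh:
        proc = subprocess.Popen(build_command(bus), stdout=fh)
        try:
            proc.wait(timeout=seconds)   # dbus-monitor normally runs until stopped
        except subprocess.TimeoutExpired:
            proc.terminate()             # stop the capture after the time limit
            proc.wait()

# Example (not run here): capture("session.pcap", 10)
```

The resulting file can then be handed to any tool built on libpcap instead of parsing the text output.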

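Since you already have a working regular expression, you can build on it by using it with re.split to get the message parts you need. Note that this yields a separate string for each capturing group, plus one string holding the parameters for each message entry. This example assumes all messages are in the string messages: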

# Python 3
import re
regex = r'\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?'
m = re.split(regex, messages)
m = m[1:]                       # discard the (usually empty) text before the first match
remember = dict()
while m:    # each message is 9 capturing groups + 1 parameter string
    if m[0] == 'method call':
        print("C [{2},{4}] {5} {6}.{7}".format(*m), end=' ')
        remember[m[4]] = m[6:8] # store interface+member for return
    if m[0] == 'method return':
        m[6:8] = remember[m[8]] # recall stored interface+member
        print("R [{3},{8}] {6}.{7}".format(*m), end=' ')
    if m[0] == 'signal':
        print("S [{2}, {4}] {5} {6}.{7}".format(*m), end=' ')
    # now handle parameters
    sep = "("
    for p in m[9].split('\n')[1:-1]:    # except empty string at start and end
        if not p.strip():               # skip blank lines
            continue
        if p[-1] in "[](){}":           # with "encapsulations":
            p = p[-1]                   #   delete spaces, "array", "dict ..."
        p = re.sub(r'^\s*\w*\s*', '', p)    # delete spaces and data type
        if p[-1] in "])}":
            sep = ''                    # no separator before closing
        print(sep + p, end='')
        if p[-1] in "[](){}":   sep = ''
        else:                   sep = ', '  # separator after data item
    print("()" if sep == "(" else ")")  # "()" when the message had no parameters
    m = m[10:]                  # delete the processed match group of 10

The output for the sample data is:

C [:1.62,122] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.start (29877, 0)
R [:1.62,122] org.freedesktop.filehandler.routing.start (29877, 0)
C [:1.62,123] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.comment ("starting ..", "routing")
R [:1.62,123] org.freedesktop.filehandler.routing.comment (-23145)
S [:1.64, 124] /org/freedesktop/fileserver org.freedesktop.DBus.Properties.PropertiesChanged ("com.freedesktop.Systemserver", [("SystemTime", {12, 9, 0})][])
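To make the index arithmetic in the loop above easier to follow, here is a small Python 3 sketch of what re.split returns for this regular expression. The two-message input is just the first call/return pair from the question, and the names leading, chunks, call and ret are only used for this illustration.

```python
import re

# The question's regular expression, copied unchanged.
regex = r'\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?'

messages = (
    'method call time=462.117843 sender=:1.62 -> destination=org.freedesktop.filehandler '
    'serial=122 path=/org/freedesktop/filehandler/routing; '
    'interface=org.freedesktop.filehandler.routing; member=start\n'
    'int16 29877\n'
    'uint16 0\n'
    'method return time=462.117844 sender=org.freedesktop.filehandler -> destination=:1.62 '
    'serial=2210 reply_serial=122\n'
    'int16 29877\n'
    'uint16 0\n'
)

parts = re.split(regex, messages)
leading, parts = parts[0], parts[1:]    # text before the first match, then the rest

# Ten items per message:
#   0 kind, 1 time, 2 sender, 3 destination, 4 serial,
#   5 path, 6 interface, 7 member, 8 reply_serial, 9 parameter text
chunks = [parts[i:i + 10] for i in range(0, len(parts), 10)]
call, ret = chunks

print(call[2], call[4], call[7])    # sender, serial and member of the call
print(ret[3], ret[8])               # destination and reply_serial of the return
print(repr(call[9]))                # raw parameter text, still multi-line
```

Groups that did not take part in a match come back as None, which is why the method return has no path, interface or member of its own and has to borrow them from the stored call.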

Can you suggest how the code can be rewritten to process line by line?

Here it is, rearranged to work line by line:

# Python 3
import re
regex = r'\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?'
remember = dict()
sep = None
with open('dbusl.in') as infile:
    for line in infile:
        m = re.match(regex, line)
        if m:
            if sep is not None:             # end the previous parameter group
                print("()" if sep == "(" else ")")
            m = list(m.groups())        # each match is 9 capturing groups
            if m[0] == 'method call':
                print("C [{2},{4}] {5} {6}.{7}".format(*m), end=' ')
                remember[m[4]] = m[6:8]     # store interface+member for return
            if m[0] == 'method return':
                m[6:8] = remember.pop(m[8]) # recall stored interface+member
                print("R [{3},{8}] {6}.{7}".format(*m), end=' ')
            if m[0] == 'signal':
                print("S [{2}, {4}] {5} {6}.{7}".format(*m), end=' ')
            sep = "("
        else:
            p = line.rstrip()               # now handle parameters
            if not p or sep is None:        # skip blank lines and text before the first message
                continue
            if p[-1] in "[](){}":           # with "encapsulations":
                p = p[-1]                   #   delete spaces, "array", "dict ..."
            p = re.sub(r'^\s*\w*\s*', '', p)    # delete spaces and data type
            if p[-1] in "])}":
                sep = ''                    # no separator before closing
            print(sep + p, end='')
            if p[-1] in "[](){}":   sep = ''
            else:                   sep = ', '  # separator after data item
if sep is not None:                     # end the last parameter group
    print("()" if sep == "(" else ")")

Note that I also changed m[6:8] = remember[m[8]] to m[6:8] = remember.pop(m[8]) in order to free the memory held by interface+member data that is no longer needed.
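One thing to watch with remember.pop(m[8]): if monitoring started after a call was sent, its return has no stored entry and pop raises KeyError. The Python 3 sketch below shows one way to tolerate that by passing a default to pop. The helper names store_call and recall_call and the '?' placeholder are assumptions made for this example. It also keys the dictionary on (sender, serial) rather than the serial alone, on the assumption that serial numbers are only unique per connection, so two senders could otherwise overwrite each other's entries.

```python
UNKNOWN = ('?', '?')    # hypothetical placeholder for a return whose call was not seen

remember = {}

def store_call(sender, serial, interface, member):
    """Remember interface+member of a method call until its return arrives."""
    remember[(sender, serial)] = [interface, member]

def recall_call(destination, reply_serial):
    """Return and forget the stored interface+member, or a placeholder."""
    # pop() removes the entry, so the dictionary does not grow without bound;
    # the second argument is returned instead of raising KeyError.
    return remember.pop((destination, reply_serial), list(UNKNOWN))

store_call(':1.62', '122', 'org.freedesktop.filehandler.routing', 'start')
first = recall_call(':1.62', '122')     # the stored interface+member
second = recall_call(':1.62', '122')    # already popped, so the placeholder
print(first, second, len(remember))
```

In the scripts above this would correspond to storing under (m[2], m[4]) for a method call and popping (m[3], m[8]) for a method return.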