BGP MRT format parsing

I am trying to parse the BGP traces downloaded from here. It is said that the BGP packet traces are stored in the files with the prefix "updates", and that these MRT format files can be read by PyBGPdump.

I downloaded one file and followed the instructions (or this better formatted version):

import pybgpdump

cnt = 0
dump = pybgpdump.BGPDump('sample.dump.gz')
# Each iteration yields the MRT header, the BGP header and the BGP message
for mrt_h, bgp_h, bgp_m in dump:
    cnt += 1
print cnt, 'BGP messages in the MRT dump'

However, I got this error:

Traceback (most recent call last):
  File "bgp-stats.py", line 8, in <module>
    for mrt_h, bgp_h, bgp_m in dump:
  File "/usr/local/lib/python2.7/dist-packages/pybgpdump.py", line 61, in next
    bgp_m = dpkt.bgp.BGP(bgp_h.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 152, in unpack
    self.data = self.update = self.Update(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 247, in unpack
    attr = self.Attribute(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 326, in unpack
    self.data = self.as_path = self.ASPath(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 376, in unpack
    seg = self.ASPathSegment(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 94, in __init__
    (self.__class__.__name__, args[0]))
dpkt.dpkt.UnpackError: invalid ASPathSegment: '\x1d\xf6\x00\x00\x1d\xf6\x00\x00\x1d\xf6\x00\x00F\xe0'

It looks like a format problem. I searched for "sample.dump.gz" and found it here. With that file the result is fine:

(999, 'BGP messages in the MRT dump')

Any insights? None of the trace files I downloaded are readable, and I don't know how to parse the files from the repo I found.

Thanks a lot!

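This is currently a bug in the dpkt library. There is an open issue in the official repository, but it dates from 2015. The problem is that the BGP update parser treats the AS numbers in the AS path as 2-octet AS numbers, even when they are encoded as 4-octet AS numbers. So when it reaches the start of a 4-byte encoded AS path of length two:
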
\x00\x00\xab\xcd   \x00\x00\x12\x34

it tries to read two 2-byte AS numbers and then stops. So instead of 43981 4660 it reads 0 43981 and misinterprets the remaining bytes.
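
To make the misreading concrete, here is a small sketch that uses only Python's struct module. The helper read_asns and the variable names are my own, not part of dpkt; it simply decodes the eight bytes above once as 2-octet and once as 4-octet AS numbers.

```python
import struct

# Body of an AS path segment holding two AS numbers, each encoded in 4 octets.
raw = b'\x00\x00\xab\xcd\x00\x00\x12\x34'
count = 2  # number of AS numbers announced by the segment header


def read_asns(data, count, width):
    """Read `count` big-endian AS numbers of `width` octets (2 or 4).

    Returns the decoded numbers and whatever bytes were left unread.
    """
    fmt = '>%d%s' % (count, 'H' if width == 2 else 'I')
    size = struct.calcsize(fmt)
    return list(struct.unpack(fmt, data[:size])), data[size:]


# What a parser that assumes 2-octet AS numbers sees.
wrong, leftover = read_asns(raw, count, 2)
# What was actually encoded.
right, rest = read_asns(raw, count, 4)

print(wrong)
print(right)
```

The 2-octet read returns 0 and 43981 and leaves four bytes unconsumed, which the parser then tries to interpret as the next path segment. The 4-octet read returns 43981 and 4660 and consumes everything.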

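There is currently no quick fix, because the problem is quite tricky. To know how the AS path is encoded, one has to look at the capabilities negotiated in the BGP OPEN messages. I am not sure how other parsers handle this.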
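
One workaround that does not need the OPEN messages is a length-consistency check: walk the AS_PATH value once assuming 2-octet and once assuming 4-octet AS numbers, and see which one consumes the attribute exactly. To be clear, this is a heuristic I am sketching here, not the capability-based approach described above and not what dpkt does; the function names are made up, and it can be ambiguous for some inputs, in which case it returns None.

```python
import struct

# Segment types: AS_SET, AS_SEQUENCE, AS_CONFED_SEQUENCE, AS_CONFED_SET
VALID_SEGMENT_TYPES = (1, 2, 3, 4)


def walks_cleanly(attr, width):
    """True if the AS_PATH value parses exactly with `width`-octet AS numbers."""
    pos = 0
    while pos < len(attr):
        if pos + 2 > len(attr):
            return False  # truncated segment header
        seg_type, seg_len = struct.unpack('>BB', attr[pos:pos + 2])
        if seg_type not in VALID_SEGMENT_TYPES:
            return False
        # Skip the 2-byte segment header plus seg_len AS numbers.
        pos += 2 + seg_len * width
    return pos == len(attr)


def guess_asn_width(attr):
    """Return 2 or 4 if exactly one width fits, otherwise None."""
    fits = [w for w in (2, 4) if walks_cleanly(attr, w)]
    return fits[0] if len(fits) == 1 else None


# AS_SEQUENCE (type 2) with two AS numbers, 4-octet and 2-octet encoded.
four_octet_path = b'\x02\x02\x00\x00\xab\xcd\x00\x00\x12\x34'
two_octet_path = b'\x02\x02\xab\xcd\x12\x34'

print(guess_asn_width(four_octet_path))
print(guess_asn_width(two_octet_path))
```

For the example path from above, only the 4-octet walk ends exactly at the end of the attribute, so the guess is 4. The 2-octet walk stops on a segment type of 0, which is not valid.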

You can either fix the issue in the repo yourself or try an alternative library such as mrtparse.