Get header of an HTTP payload

Question

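I have a piece of code that is supposed to decode a pcap file and write the images it contains to a directory. I captured the traffic with Wireshark while browsing an HTTP website to load some images, and saved the capture in a directory.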


import re
import sys


def get_header(payload):
    try:
        header_raw = payload[:payload.index(b'\r\n\r\n')+2]
    except ValueError:
        sys.stdout.write('-')
        sys.stdout.flush()
        return None

    header = dict(re.findall(r'(?P<name>.*?): (?P<value>.*?)\r\n', header_raw.decode()))
    # This line of code is supposed to split out the headers

    if 'Content-Type' not in header:
        return None
    return header

When I try to run it, I get this:

Traceback (most recent call last):
  File "/home/kali/Documents/Programs/Python/recapper.py", line 79, in <module>
    recapper.get_responses()
  File "/home/kali/Documents/Programs/Python/recapper.py", line 62, in get_responses
    header = get_header(payload)
  File "/home/kali/Documents/Programs/Python/recapper.py", line 24, in get_header
    header = dict(re.findall(r"(?P<name>.*?): (?P<value>.*?)\r\n", header_raw.decode()))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 1: invalid continuation byte

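I have tried several different approaches, but none of them worked. Could someone more experienced tell me what the problem is, or, if I am going about this the wrong way, how I should split out the headers?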

Answer 1

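Update: I found that the encoding I had to use was not utf-8 but ISO-8859-1.

Like this: header = dict(re.findall(r'(?P<name>.*?): (?P<value>.*?)\r\n', header_raw.decode('ISO-8859-1'))), and it works! Note that the encoding name must be passed as a quoted string.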
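For reference, here is a minimal sketch of why the change helps. ISO-8859-1 assigns a character to every byte value from 0x00 to 0xFF, so decode() cannot raise, whereas utf-8 rejects byte sequences such as 0xc6 followed by a non-continuation byte. A byte like that near the start of a payload also suggests that the payload may not begin with an HTTP header at all (this is a guess, the capture itself is not shown), in which case the Content-Type check simply filters it out. The two sample payloads below are made up for illustration and do not come from the original capture.

```python
import re

# Hypothetical sample payloads, invented for illustration only.
HTTP_PAYLOAD = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Length: 4\r\n"
    b"\r\n"
    b"\x89PNG"
)
# Binary data that happens to contain a blank-line marker; not valid utf-8.
BINARY_PAYLOAD = b"\x16\xc6\x01\x02\r\n\r\nrest"


def get_header(payload):
    """Return the headers as a dict, or None if no usable header is found."""
    try:
        # Keep one trailing CRLF so the last header line still matches.
        header_raw = payload[:payload.index(b"\r\n\r\n") + 2]
    except ValueError:
        return None

    # ISO-8859-1 maps every byte to a character, so this cannot raise.
    text = header_raw.decode("ISO-8859-1")
    header = dict(re.findall(r"(?P<name>.*?): (?P<value>.*?)\r\n", text))

    if "Content-Type" not in header:
        return None
    return header


if __name__ == "__main__":
    print(get_header(HTTP_PAYLOAD))
    print(get_header(BINARY_PAYLOAD))
```

With this approach a non-HTTP payload yields None instead of an exception, so the calling loop can skip it and continue with the next packet.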

python

http

wireshark