Works with curl, fails with requests; how can I fix my requests code?

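I am trying to write a Python module that talks to a fixed HTTP server on a hardware device so that I can send data to it. I can send the data correctly with curl, but for some reason it does not work when I use the requests module in Python.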

I have confirmed (using httpbin.org/post) that the two requests are identical, but for some reason only the one sent via curl works.

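When I look at a tcpdump of the two requests I do see differences: the initial handshake is essentially the same, and then the data is sent as three separate packets (in both cases).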

The post-handshake communication from curl looks like:

17:58:31.691251 IP CLIENT.56184 > SERVER.http: Flags [P.], seq 1:232, ack 1, win 29200, length 231: HTTP: POST /index.html HTTP/1.1
E.....@.@.....n:..n..x.P.......(P.r.5h..POST /index.html HTTP/1.1
User-Agent: curl/7.29.0
Host: SERVER
Accept: */*
Content-Length: 1258
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------61700007fd77

.........7.?`)+.
17:58:31.766389 IP SERVER.http > CLIENT.56184: Flags [.], ack 232, win 1817, length 0
E..(;.....Ks..n...n:.P.x...(....P.... ........................
17:58:32.692418 IP CLIENT.56184 > SERVER.http: Flags [P.], seq 232:486, ack 1, win 29200, length 254: HTTP
E..&..@.@.....n:..n..x.P.......(P.r.5...------------------------------61700007fd77
< Data for packet 2 >

..........8.?`..
17:58:32.856104 IP SERVER.http > CLIENT.56184: Flags [.], ack 486, win 1563, length 0
E..(;.....Km..n...n:.P.x...(....P.... ..............x...8.?`R.
17:58:32.856139 IP CLIENT.56184 > SERVER.http: Flags [P.], seq 486:1490, ack 1, win 29200, length 1004: HTTP
E.....@.@.....n:..n..x.P.......(P.r.8m..[ID]
< Data for packet 3 >

....8.?`...6....
17:58:32.919921 IP SERVER.http > CLIENT.56184: Flags [.], ack 1490, win 2048, length 0
E..(;.....Kl..n...n:.P.x...(....P....O..................8.?`O.
17:58:32.924255 IP SERVER.http > CLIENT.56184: Flags [P.], seq 1:121, ack 1490, win 2048, length 120: HTTP: HTTP/1.0 200 OK
E...;.....J...n...n:.P.x...(....P....o..HTTP/1.0 200 OK
Content-Type: text/javascript
Access-Control-Allow-Origin: *
Content-length: 0
Connection: close

........8.?`._.7

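Pretty clean: as I read it, we send the first packet, it is acknowledged, we send the second, and so on, and finally the connection is closed after we receive a happy response.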

The communication from requests, however, does not work as well. Sample code that produces it is:

import requests

headers = {"User-Agent": "test client"}
files = {"binary": ("filename", "file contents", "application/octet-stream")}
data = {"type": "upload"}

# requests needs an explicit scheme; the capture shows plain HTTP on port 80
requests.post("http://remote.host.url/index.html", data=data, files=files, headers=headers)

This produces much messier output:

18:24:46.311756 IP CLIENT.56212 > SERVER.http: Flags [P.], seq 1:289, ack 1, win 29200, length 288: HTTP: POST /index.html HTTP/1.1
E..H..@.@.....n:..n....P.9.N..v.P.r.5...POST /index.html HTTP/1.1
Host: SERVER
User-Agent: test client
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 1247
Content-Type: multipart/form-data; boundary=d8a887dda41b5a35f61ccf79b26d7b4e

........^.?`.C..
18:24:46.311772 IP CLIENT.56212 > SERVER.http: Flags [.], seq 289:1313, ack 1, win 29200, length 1024: HTTP
E..(..@.@.....n:..n....P.9.n..v.P.r.8...--d8a887dda41b5a35f61ccf79b26d7b4e
< Data from packet 2 >

........^.?`+Z..
18:24:46.311777 IP CLIENT.56212 > SERVER.http: Flags [P.], seq 1313:1536, ack 1, win 29200, length 223: HTTP
E.....@.@.....n:..n....P.9.n..v.P.r.5`..
< Data from packet 3 >

................
18:24:46.525743 IP SERVER.http > CLIENT.56212: Flags [.], ack 289, win 1760, length 0
E..([D....,%..n...n:.P....v..9.nP....0..................^.?`..
18:24:46.800583 IP CLIENT.56212 > SERVER.http: Flags [.], seq 289:1313, ack 1, win 29200, length 1024: HTTP
E..(..@.@.....n:..n....P.9.n..v.P.r.8...--d8a887dda41b5a35f61ccf79b26d7b4e
< Data from packet 2, again >

........^.?`.../
18:24:46.803014 IP SERVER.http > CLIENT.56212: Flags [.], ack 1313, win 2048, length 0
E..([E....,$..n...n:.P....v..9.nP...................p...^.?`.R
18:24:46.803033 IP CLIENT.56212 > SERVER.http: Flags [P.], seq 1313:1536, ack 1, win 29200, length 223: HTTP
E.....@.@.....n:..n....P.9.n..v.P.r.5`..
< Data from packet 3, again >

.........^.?`k?.
18:24:46.813645 IP SERVER.http > CLIENT.56212: Flags [F.], seq 1, ack 1536, win 1825, length 0
E..([F....,#..n...n:.P....v..9.MP..!....................^.?`h.
18:24:46.813813 IP CLIENT.56212 > SERVER.http: Flags [F.], seq 1536, ack 2, win 29200, length 0
E..(..@.@.....n:..n....P.9.M..v.P.r.4...........^.?`...0
18:24:46.814339 IP SERVER.http > CLIENT.56212: Flags [.], ack 1537, win 1824, length 0
E..([G....,"..n...n:.P....v..9.NP.. ....................^.?`..
18:24:46.816550 IP CLIENT.56214 > SERVER.http: Flags [S], seq 1228421461, win 29200, options [mss 1460,sackOK,TS val 3666736130 ecr 0,nop,wscale 7], length 0
E..<.W@.@.8...n:..n....PI89U......r.4..........
................^.?`0..0....
18:24:46.817006 IP SERVER.http > CLIENT.56214: Flags [S.], seq 416609351, ack 1228421462, win 2048, options [mss 1460], length 0
E..,[H....,...n...n:.P.....GI89V`.......................^.?`..
18:24:46.817021 IP CLIENT.56214 > SERVER.http: Flags [.], ack 1, win 29200, length 0
E..(.X@.@.9...n:..n....PI89V...HP.r.4...........^.?`.0.0
18:24:46.817049 IP CLIENT.56214 > SERVER.http: Flags [P.], seq 1:289, ack 1, win 29200, length 288: HTTP: POST /index.html HTTP/1.1
E..H.Y@.@.7...n:..n....PI89V...HP.r.5...POST /index.html HTTP/1.1
Host: SERVER
User-Agent: test (EPICS base 7.0.4-E3-7.0.4-patch IOC)
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 1247
Content-Type: multipart/form-data; boundary=04a493e5def4d0baf76026663f63ae61

........^.?`.g.0
18:24:46.817063 IP CLIENT.56214 > SERVER.http: Flags [.], seq 289:1313, ack 1, win 29200, length 1024: HTTP
E..(.Z@.@.5...n:..n....PI8:v...HP.r.8...--04a493e5def4d0baf76026663f63ae61
< Data from packet 2, again! >

....p...^.?`.z.0
18:24:46.817068 IP CLIENT.56214 > SERVER.http: Flags [P.], seq 1313:1536, ack 1, win 29200, length 223: HTTP
E....[@.@.8/..n:..n....PI8>v...HP.r.5`..
< Data from packet 3, again! >

etc.

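The first thing I notice is that in this case all three packets are sent before the first one is acknowledged; after that the second packet is sent again and acknowledged, and then the third packet is sent again.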

After that, however, the whole thing is sent again for some reason, and we never receive the HTTP/1.0 200 OK message or a good response.
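To make the repeated segments easier to spot than by eye, the summary lines of a capture can be scanned for data segments that share the same source, destination, and sequence range. The following is only a rough sketch: `SEGMENT` and `find_retransmissions` are names made up for illustration, and the pattern assumes the tcpdump summary format shown in this post (relative sequence numbers, one summary line per packet).

```python
import re

# Matches tcpdump summary lines that carry data, for example:
#   18:24:46.311772 IP CLIENT.56212 > SERVER.http: Flags [.], seq 289:1313, ack 1, ...
# Pure ACK, SYN and FIN lines have no "seq start:end" range and are skipped.
SEGMENT = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d\.\d+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]*)\], seq (?P<start>\d+):(?P<end>\d+),"
)


def find_retransmissions(lines):
    """Return (time, src, 'start:end') for each data segment seen more than once."""
    seen = set()
    repeats = []
    for line in lines:
        m = SEGMENT.match(line)
        if not m:
            continue  # payload dump lines, bare ACKs, SYN/FIN lines
        key = (m["src"], m["dst"], m["start"], m["end"])
        if key in seen:
            repeats.append((m["time"], m["src"], f'{m["start"]}:{m["end"]}'))
        seen.add(key)
    return repeats
```

Fed the summary lines of the first requests connection above (CLIENT.56212), this reports the segments 289:1313 and 1313:1536 as repeated; fed the curl capture, it reports nothing.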

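I know the HTTP headers sent by the two differ slightly, but even making them match does not fix the communication. I also noticed that the packet sizes differ, but I can't imagine that being the problem.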

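I also noticed that the packets sent via curl all have the PUSH flag set, whereas on the Python side this is inconsistent. Other than that, I really can't see any difference.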

So my question is: why do the two behave differently, and how can I make the Python requests module behave more like curl in this case?

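Python's Requests does not support "Expect: 100-continue" ([1], [2]), and if you are talking to a server that actually requires 100-continue for large POSTs (and it looks like you are), your best option is to find an HTTP library that does support it (for example libcurl/pycurl).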

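Manually adding an Expect: 100-continue header to the Requests HTTP request probably won't work either, because the client is supposed to send that header, then wait for a 100 Continue response, and only then send the body. Just adding the header does not magically tell Requests that it has to wait for the 100 Continue response before sending the body; Requests will send the body immediately without waiting. So yes, find an HTTP library that actually supports it natively (like libcurl/pycurl).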
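To illustrate the send-headers, wait, then send-body sequence described above, here is a minimal standard-library sketch over a raw socket. It is not how Requests or curl is implemented; `post_with_expect_continue` and its `wait` parameter are hypothetical names, and the body (for a multipart upload, the fully encoded form) has to be built by the caller. In the curl capture in the question, the first body packet leaves about one second after the headers and the excerpt shows only a bare ACK from the server in between, so the sketch also sends the body once the wait times out rather than blocking forever.

```python
import socket


def post_with_expect_continue(host, port, path, body, content_type, wait=1.0):
    """POST `body`, sending the headers first and pausing for '100 Continue'."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Expect: 100-continue\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(head)  # headers only; the body is held back

        # Give the server up to `wait` seconds to send an interim response.
        sock.settimeout(wait)
        interim = b""
        try:
            while b"\r\n\r\n" not in interim:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                interim += chunk
        except socket.timeout:
            pass  # nothing arrived in time: fall through and send the body

        parts = interim.split(b"\r\n", 1)[0].split()
        got_final = (
            len(parts) >= 2 and parts[0].startswith(b"HTTP/") and parts[1] != b"100"
        )

        sock.settimeout(10)
        if got_final:
            # The server answered without asking for the body (e.g. an error).
            response = interim
        else:
            # "100 Continue" arrived, or the wait timed out: now send the body.
            response = interim.partition(b"\r\n\r\n")[2]
            sock.sendall(body)

        # Read the final response until the server closes the connection.
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
    return response
```

A call might look like `post_with_expect_continue("remote.host.url", 80, "/index.html", encoded_form, "multipart/form-data; boundary=...")`, where `encoded_form` stands for the already-encoded multipart body and the boundary must match it. For real use, a library with native support such as pycurl remains the more robust choice.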

.. and if you can be bothered, it would be nice if you went to the relevant Requests feature request and voiced your support.