Bad chunk-size in HTTP response: Net/HTTP/Methods.pm line 542


I am using the Perl module WWW::Mechanize to scrape a website. As I understand it, WWW::Mechanize uses the Net::HTTP module to implement the HTTP protocol.

Here is the problem:

my $url = 'https://somewebsite.com/a/b/c?skey=svalue';
my $browser = WWW::Mechanize->new();
$browser->get($url);

When I run the snippet above (assuming all imports are in place), I get empty response content, and the response headers object of WWW::Mechanize contains the following error:

'x-died' = "Bad chunk-size in HTTP response: { at path/to/perl/vendor/lib/Net/HTTP/Methods.pm line 542."

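Note the "{" in the exception message. I then tried debugging the Methods.pm module to see what was going on, and it looks like the exception is raised in the read_entity_body subroutine.

I also ran curl against the URL and got the following response headers: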

< HTTP/1.1 200 OK
< Set-Cookie: JSESSIONID=C61B57BA5DD0A05912C98CE1CFBAD435; Path=/; HttpOnly
< X-Frame-Options: DENY
< Transfer-Encoding: chunked
< Strict-Transport-Security: max-age=31536000 ; includeSubDomains
< Server: Apache-Coyote/1.1
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< X-Content-Type-Options: nosniff
< Content-Disposition: attachment;filename=f.txt
< Pragma: no-cache
< Expires: 0
< X-XSS-Protection: 1; mode=block
< Date: Thu, 21 Sep 2017 18:31:27 GMT
< Content-Type: application/json;charset=UTF-8
< Transfer-Encoding: chunked

and the following content:

{
  "total" : 1,
  "page" : 1,
  "records" : 1,
  "rows" : [ {
    "infoPostRptId" : 2,
    "mngPplId" : 1,
    "infoPostRptXsdId" : 1,
    "rptFmtCode" : "XML",
    "createUserId" : 5183202,
    "updateUserId" : 1,
    "statusId" : 309403,
    "seqNbr" : 0,
    "urlAnchor" : null,
  } ],
  "errors" : null
}
* Connection #0 to host xxxxxxx left intact

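If I am not mistaken, although the headers declare Transfer-Encoding: chunked, the content coming from the website does not appear to actually be chunk-encoded.

More information about the Methods.pm module:

As I understand it, the read_entity_body subroutine tries to decode the chunks and combine them to form the response content.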

I think the problem is that the response headers contain Transfer-Encoding: chunked but the content is not actually encoded as chunks.
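To make that hypothesis concrete, here is a minimal sketch of how a chunked body is decoded. The helper name decode_chunked is hypothetical and this is not the code in Net::HTTP::Methods; it only assumes the standard framing (a hexadecimal size line, that many bytes of data, a line break, and a final zero-sized chunk) and ignores trailers. It shows why an unchunked JSON body would be reported as a bad chunk-size of "{".

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not the Net::HTTP implementation.
# Decodes an HTTP/1.1 chunked body that is already held in a string.
sub decode_chunked {
    my ($raw) = @_;
    my $body = '';
    while (1) {
        # Each chunk starts with a line holding its size in hexadecimal,
        # optionally followed by ";extension" data.
        $raw =~ s/\A([^\r\n]*)\r?\n//
            or die "Missing chunk-size line\n";
        my $size_line = $1;
        $size_line =~ /\A([0-9A-Fa-f]+)[ \t]*(?:;.*)?\z/
            or die "Bad chunk-size in HTTP response: $size_line\n";
        my $size = hex $1;
        last if $size == 0;                    # a zero-sized chunk ends the body
        die "Truncated chunk\n" if length($raw) < $size;
        $body .= substr($raw, 0, $size, '');   # take exactly $size bytes of payload
        $raw =~ s/\A\r?\n//;                   # drop the line break after the chunk data
    }
    return $body;
}

# A correctly chunked body: "{}" is 2 bytes, so the size line is "2".
print decode_chunked("2\r\n{}\r\n0\r\n\r\n"), "\n";

# An unchunked JSON body: its first line is "{", which is not a hex number.
eval { decode_chunked("{\n  \"total\" : 1\n}\n") };
print "error: $@";
```

The second call fails on the very first line, which matches the "{" that appears in the X-Died message above.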

Any help is greatly appreciated. Thanks.

Edit 1:

Versions:

WWW::Mechanize 1.83, LWP::UserAgent 6.15, and Net::HTTP 6.12

Edit 2:

Output of curl -s --raw -D - "https://....":

HTTP/1.1 200 OK
Set-Cookie: JSESSIONID=A29B1E0F561F1E4FBAF12583C0C2DE08; Path=/; HttpOnly
X-Frame-Options: DENY
Transfer-Encoding: chunked
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
Server: Apache-Coyote/1.1
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
X-Content-Type-Options: nosniff
Content-Disposition: attachment;filename=f.txt
Pragma: no-cache
Expires: 0
X-XSS-Protection: 1; mode=block
Date: Fri, 22 Sep 2017 02:36:51 GMT
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked

45c
{
  "total" : 1,
  "page" : 1,
  "records" : 1,
  "rows" : [ {
        "infoPostRptId" : 2,
        "mngPplId" : 1,
        "infoPostRptXsdId" : 1,
        "rptFmtCode" : "XML",
        "createUserId" : 5183202,
        "updateUserId" : 1,
        "statusId" : 309403,
        "seqNbr" : 0,
        "urlAnchor" : null,
  } ],
  "errors" : null
}
0

As with the earlier JSON content, I have removed or altered some values just to anonymize the data.

Edit 3: This is what I get when I run the following command:

 perl -MLWP::UserAgent -e'print LWP::UserAgent->new->get($ARGV[0])->as_string' 'https://......'

  HTTP/1.1 200 OK
  Cache-Control: no-cache, no-store, max-age=0, must-revalidate
  Connection: close
  Date: Fri, 22 Sep 2017 04:15:06 GMT
  Pragma: no-cache
  Server: Apache-Coyote/1.1
  Content-Type: application/json;charset=UTF-8
  Expires: 0
  Client-Aborted: die
  Client-Date: Fri, 22 Sep 2017 04:15:06 GMT
  Client-Peer: 67.221.172.5:443
  Client-Response-Num: 1
  Client-SSL-Cert-Issuer: /C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
  Client-SSL-Cert-Subject: /OU=Domain Control Validated/CN=*.trellisenergy.com
  Client-SSL-Cipher: ECDHE-RSA-AES128-SHA256
  Client-SSL-Socket-Class: IO::Socket::SSL
  Client-Transfer-Encoding: chunked
  Content-Disposition: attachment;filename=f.txt
  Set-Cookie: JSESSIONID=5CAC35648DBBE25E3229DE9BF21C3794; Path=/; HttpOnly
  Strict-Transport-Security: max-age=31536000 ; includeSubDomains
  X-Content-Type-Options: nosniff
  X-Died: Bad chunk-size in HTTP response: { at /usr/local/share/perl5/Net/HTTP/Methods.pm line 544.
  X-Frame-Options: DENY
  X-XSS-Protection: 1; mode=block

Edit 4: TCP dump:

I ran the following command in one terminal window:

perl -MLWP::UserAgent -e'print LWP::UserAgent->new->get($ARGV[0])->as_string' 'https://vgs.trellisenergy.com/ptms/public/infopost/getInfoPostRpts.do?tspId=1&proxyTspId=1&rptId=2&downloadInd=0&searchInd=0&showLatestInd=0&cycleId=10303&startDate=09/20/2017&endDate=09/20/2017&_search=false&nd=1505846852955&rows=10&page=1&sidx=&sord=asc&_=1505846826289'

And this one in another:

tcpdump -w tcpdump.pcap -A -s0 -e -n -vvv -i eth0 host vgs.trellisenergy.com

I pretty-printed the tcpdump capture using:

tcpick -C -yP -r tcpdump.pcap

TCP dump:

Starting tcpick 0.2.1 at 2017-09-22 10:24 MDT
Timeout for connections is 600
tcpick: reading from tcpdump.pcap
1      SYN-SENT       10.1.1.10:24876 > 67.221.172.5:https
1      SYN-RECEIVED   10.1.1.10:24876 > 67.221.172.5:https
1      ESTABLISHED    10.1.1.10:24876 > 67.221.172.5:https
...........Y.8..*m.i.'ZZP*....1...d
.._.$.^....0.,.(.$...
.....k.j.9.8.....2...*.&.......=.5.../.+.'.#... .....g.@.3.2.....E.D.1.-.).%.......<./...A.........
..................._.........vgs.trellisenergy.com.........
. .....................................
.....0..1.0.......U....US1.0...U....Arizona1.0...U...............>.s].s.a^.
Scottsdale1.0...U.
..........0..0A1!0...U....Domain Control Validated1.0...U....*.trellisenergy.com0.."0 Secure Certificate Authority - G20..
h@s0.*$.H.4./..E8.m.V......'!..f...!tY'.(..`......... ...E.)Tz..z2.%..KEi....Dd.....s....JW_.Y  ..8..6..Y ........i.r............"...a.
LI1V    6t....C.....20uB'..#:...n..(-...(..P..M..O...p.3L.].@A.........0...0...U.......0.0...U.%..0...+.........+.......0...U...........07..U...00.0,.*.(.&http://crl.godaddy.com/gdig2s1-337.crl0]..U. .V0T0H..`.H...m....0907..+........+http://certificates.godaddy.com/repository/0...g.....0v..+........j0h0$..+.....0...http://ocsp.godaddy.com/0@..+.....0..4http://certificates.godaddy.com/repository/gdig2.crt0...U.#..0...@..'..4.0.3..l...,..01..U...*0(..*.trellisenergy.com............z...;^..'.@.l..,Cj...N.LY.S.......~p...k.. ...Y..S}.\}o.......(.
.....H..SG.D.vy}...qM(.0LT.C.....R.......y...   Y.....wz.s4..Q.t...u...].8.|..q..+.>5...?..`z.X2. .{.%..[ 7.. r...y.yjY..h]...0I.$..x,O....h......n.b.....c.<.....X.Gi.P.vTM.d.B.
.....0..1.0...a...U....US1.0...U....Arizona1.0...U...
Scottsdale1.0...U.
310503070000Z0..1.0110/...U....US1.0...U....Arizona1.0...U...rity - G20..
Scottsdale1.0...U.
..........0.., Inc.1-0+..U...$http://certs.godaddy.com/repository/1301..U...*Go Daddy Secure Certificate Authority - G20.."0
...........v...b.0d...l...b../.>e...b.<R...EKU.xkc.b...il.....L.E3......+..a.yW....?0<]G.....7.AQ..KT.(.....08...&.fGcm.q&G.8GS.F......E...q..o....0:yO_LG...[...`;..C...3N...'O.%........t.dW..DU.-*:>....2
..d..:P.J..y3.. .....9.i.lcR.w...t.....PT5KiN.;.I.....R..........0...0...U.......0....0...U...........0...U......@..'..4.0.3..l...,..0...U.#..0...:....g(.....An .....04..+........(0&0$..+.....0...http://ocsp.godaddy.com/05..U....0,0*.(.&.......`..r.s$..."....bXD...%......b.Q...Q*...s.v.6....,....*...Mu..?.A.#}[K...X.F..``..}PA......../..T.D..}.C.D..p
...3..-v6&.....a....o.F.(..&}
.....0..1.0.......U....US1.0...U....Arizona1.0...U...
Scottsdale1.0...U.
09GoDaddy.com, Inc.110/..U...(Go Daddy Root Certificate Authority - G20..
371231235959Z0..1.0     ..U....US1.0...U....Arizona1.0...U...
Scottsdale1.0...U.
..........0.., Inc.110/..U...(Go Daddy Root Certificate Authority - G20.."0
..f"..im6.......`.8......F.. C.;....I.'....N...p..2...>.N...O/Y0"...Vk......u.9Q{..5.tN......?........j..............;F|2
>.]|.|..+S..biQ%.a.D..,.C.#..:...)....]....0
............]y...Yg.a.~;.1u-. .Oe......../..Z..t.s.8B..{..u...........S.~.F.....+....'....Z.7....l....=.$Oy.5._.......-.......s@.r%......h..W...:       ..D...7...2..8..d.,~........h..".8-z..T.i._3.z={
.8.. 'e...]p-..N.(F...6.....(....k.Q......8k...v...v...(...=!.:...;.L.....K./.....D....xH .Zi.<!.}i. t.c.!yWY..c.I......?.._.e......"...v.'8Qq.d].......O(8._M....%........]:LU....]l.  .....
............iA...~....C5...k.43... .F6. .\!....X......bJ.e..@.....[.uO.&..-....7.O. .......g2..R.b....H7.........G.....%u1.....8$.u..O....za..T..........P...V2.;.......j.L.Px;..-....&.......H...yQ,n.s..<KFx#...2..K.G..n4OG{N.5.6../...
......
....PU.T....A.d...*.iw..        c.Wjm.V\. ..vP.Z%......v...k......l...b7.|.u..c.=:....$.3K..
........v.{u...`..+.qU. .'.t.g....V......1..P.g..aO....nY..C..F...4x.d...Y....|3..Pz;.K.~]...H..;...PIR..hRv...)].=?.:..[...h...A.. /4..d.......C`....]LZK.Y..q......Q.L.R..D&...l..t..I.j2....8...y.L..).y.n..).u|..'.....z ..,Yg..md."i.......M.74x...3..N.b.6..tm.).u...|-.xK.9R..M,......!....}..[=B.J......     ...~Gx.8p.5.UQ........sJ
...w..Xf.#^..,..G.w.f4.V..'..Bb_..*e.i......P1.
U6!.l..%...ts. u!c5.0>.!.2J.G)p.W.........dF*5.....5..M.        .....G+.....I..vG&..>.}(....E.  ...9...N.i..Jm&b...G...3Wo#k.........e:..p........:w....V.L'9.-..)......d.P_....#..iide@.2..E>.?|..:....B.,mr...N.JAS1]:...O.......i..c..T.pZZ)..E."\b.r2HA..r!....L........K....~1.....x!.Gp.K..G..D*s.u....WN.?..(+..rU..g?d.....eG.L.^...*..a...]/...N0.gX..;...T...%...;.P?.O4{.i.....%.T.|..
...U..Ug......d...a3:$...p...v..t."...
.......%..J`E....5....n..M....>...ge.r.,...s..,..       k..R.N._>3}...=.0...........T.d..       ...u 7?T...3b.?.lr...8o.Gk.}xkBY[...l..^.-.Wt}..G/..l.f..z..^F.A.G.i8l4.....#.a.....BS.c.Q7..=y...{ELUP.R..c.{...a9.u3..-@F.H..M..2.o.j@.pI..S....R  ..vx.u.<-x..".T.d-...:...>......n..Z|..?Dz@N..?...#.../.....2.Z..y..Ej..........Q.....'8.....nC..7.....)e..7r..[..H...R.....h...x7G.+.......eBErwo.r....,..e*.8O..oQ. `O.@.J#...5).9.....!d.u....,...pV..oS...%.o..F..G.7....I...N...s .G..G@.".w6d......R..j
..........G.D..l....0..EH.Y..4.e.\#~s.i.-WKoyK...w.'.o.X-.,x.......4......T.*.>#..
..G(wP.V.i...F.U...t...-.\.!...Y4,...._............7..|<DM3.&u.%.0..G.......9....
.....Y......55ZW..X......Tz..D...r.6$..B...Wv..R..8.."../dL..-...i^o..>:..O...s.W.).i....gOH...@.....8k.......Q........#.....#.R..^.....f.......x^X....^S.R..u.7.._..T]A'/4>k\..Lg....H...J....o>.2 ......$.......PP..#..=.E..;2..>k...`...9..>*.....N...4........(...a....n....)w.I.@O+.(.cV..g.....%G..^.Z#.'EG...]..$_...!e...%.;VG.7.5.&...C........s4..1....t[
1      FIN-WAIT-1     10.1.1.10:24876 > 67.221.172.5:https
1      TIME-WAIT      10.1.1.10:24876 > 67.221.172.5:https
1      CLOSED         10.1.1.10:24876 > 67.221.172.5:https
tcpick: done reading from tcpdump.pcap

22 packets captured
1 tcp sessions detected

This is a bug in the server or (more likely) in the application running on the server. If one sends the following request:

GET /some-path HTTP/1.1
Host: some-host

the server responds with a properly chunked response. Interestingly, the Transfer-Encoding: chunked header is sent twice: once near the start of the HTTP header and once at the end:

HTTP/1.1 200 OK
Set-Cookie: ...
X-Frame-Options: DENY
Transfer-Encoding: chunked
...
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked

45c
{

Now, when a slightly changed request is sent with a Connection: close header added, the response looks different:

GET /some-path HTTP/1.1
Host: some-host
Connection: close

----

HTTP/1.1 200 OK
Set-Cookie: ...
X-Frame-Options: DENY
Transfer-Encoding: chunked
...
Content-Type: application/json;charset=UTF-8

{

The leading Transfer-Encoding: chunked is still there, but the last one is gone. And the response body is no longer chunked, even though Transfer-Encoding: chunked is still in the response header!
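The mismatch described above can be spotted mechanically in a raw response such as the output of curl -s --raw -D -. The following is a rough sketch under stated assumptions: check_chunking is a hypothetical helper, the sample responses are shortened stand-ins for the real ones, and the body test is only a heuristic, since an unchunked body whose first line happens to consist of hex digits would fool it.

```perl
use strict;
use warnings;

# Hypothetical diagnostic: given a raw HTTP response as text, report how many
# "Transfer-Encoding: chunked" headers it declares and whether the body
# actually starts with a chunk-size line.
sub check_chunking {
    my ($raw) = @_;
    # Headers and body are separated by the first empty line.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $body = '' unless defined $body;

    # Count the matching header lines (header names are case-insensitive).
    my $declared = () = $head =~ /^Transfer-Encoding:\s*chunked\s*$/mig;

    # A chunked body must begin with a hexadecimal chunk-size line.
    my $framed = $body =~ /\A[0-9A-Fa-f]+[ \t]*(?:;[^\r\n]*)?\r?\n/ ? 1 : 0;

    return { declared => $declared, framed => $framed };
}

# Shortened stand-ins for the two responses shown above.
my $without_close = "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n"
                  . "Content-Type: application/json;charset=UTF-8\r\n"
                  . "Transfer-Encoding: chunked\r\n\r\n45c\r\n{\r\n";
my $with_close    = "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n"
                  . "Content-Type: application/json;charset=UTF-8\r\n\r\n{\r\n";

for my $response ($without_close, $with_close) {
    my $c = check_chunking($response);
    printf "declared=%d framed=%d%s\n", $c->{declared}, $c->{framed},
        ($c->{declared} && !$c->{framed}) ? " <- mismatch" : "";
}
```

Only the second response is flagged: it declares chunked encoding, but its body starts with "{" instead of a chunk size.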

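This is where LWP differs from curl: LWP sends a Connection: TE, close header, while curl sends no Connection header at all. This means LWP gets the broken response and rightly complains, while curl does not get the broken response and therefore has no reason to complain. But if you explicitly add a Connection: close header to the curl request, it runs into the same problem: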

 $ curl -H 'Connection:close' https://...
 curl: (56) Illegal or missing hexadecimal sequence in chunked-encoding

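Further testing shows that the leading Transfer-Encoding: chunked header is also sent when the client makes an HTTP/1.0 request! This should not happen at all, since chunking is only defined for HTTP/1.1.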

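This suggests that some part of the web application running on the server, rather than the web server itself, is emitting the first Transfer-Encoding: chunked header. So if you have access to the application or to its developers, you should get it fixed there.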