Never seen and can't find such a weird character encoding in a URL
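Has anyone seen a character encoded like this in a URL: &%23x3F;
It looks like an encoded ', but the original seems more like the apostrophe that Word produces.
An apostrophe would normally show up as %27 (the straight one), or as %E2%80%98 or %E2%80%99 (the curly quotes in UTF-8).
You can see it in the URL where the link ends up: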
http://www.hotelreservierung.de/angebot/St-James&%23x3F;s-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957
The question is: what character is this supposed to be? I can't find it in any Unicode table! My first thought was that it might be a combination.
%23 is the URL-encoded form of the # character. So, once percent-decoded, the URL contains &#x3F;
An HTML entity can be written in one of three formats:
&<name>;
&#<decimal>;
&#x<hex>;
In this case, the URL contains a hex-encoded HTML entity, where 0x3F is the hexadecimal value of the ? character.
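To make the two decoding layers concrete, here is a minimal Python sketch using only the standard library. The helper name `decode_segment` is made up for illustration; it is not something the site or the browser uses. It first undoes the percent-encoding, then resolves the HTML character reference.

```python
from html import unescape
from urllib.parse import unquote


def decode_segment(text: str) -> str:
    """Percent-decode first, then resolve any HTML character references."""
    return unescape(unquote(text))


# The three entity notations, shown here for an apostrophe.
for entity in ("&apos;", "&#39;", "&#x27;"):
    print(entity, "->", unescape(entity))

# The fragment from the URL in question.
fragment = "&%23x3F;"
print(unquote(fragment))         # step 1: %23 becomes #
print(decode_segment(fragment))  # step 2: the entity becomes a single character
```

The order matters: the entity only becomes recognizable after the %23 has been turned back into #.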
Both the URL you provided and this direct URL:
http://www.hotelreservierung.de/angebot/St-James's-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957
respond with an HTTP redirect to this URL:
http://www.hotelreservierung.de/angebot/St-James&%23x3F;s-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957
GET /LhPyt HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: ow.ly
DNT: 1
Connection: Keep-Alive

HTTP/1.1 301 Moved Permanently
Location: http://goo.gl/8vb7n8
Connection: close
Content-Length: 0

GET /8vb7n8 HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
DNT: 1
Host: goo.gl
Connection: Keep-Alive

HTTP/1.1 301 Moved Permanently
Content-Type: text/html; charset=UTF-8
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Date: Fri, 10 Apr 2015 16:59:34 GMT
Location: http://www.hotelreservierung.de/angebot/St-James&%23x3F;s-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957
Content-Encoding: gzip
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Content-Length: 240
Server: GSE
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Age: 83
Alternate-Protocol: 80:quic,p=0.5

GET /angebot/St-James's-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957 HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: www.hotelreservierung.de
DNT: 1
Connection: Keep-Alive

HTTP/1.1 301 Moved Permanently
Date: Fri, 10 Apr 2015 17:01:07 GMT
Server: Apache/2
Provided-Host: hrslave03
Set-Cookie: _hrlnkflghtl2=a%3A1%3A%7Bi%3A0%3Bs%3A12%3A%22Hrlnkflghtl1%22%3B%7D; expires=Sun, 10-May-2015 17:01:07 GMT; path=/
Set-Cookie: _hrhtldtlnwdsgn2=a%3A1%3A%7Bi%3A0%3Bs%3A16%3A%22Hrhtldtlnwdsgn2b%22%3B%7D; expires=Sun, 10-May-2015 17:01:07 GMT; path=/
Set-Cookie: _hrstrtpgnwfrm=a%3A1%3A%7Bi%3A0%3Bs%3A14%3A%22Hrstrtpgnwfrm4%22%3B%7D; expires=Sun, 10-May-2015 17:01:07 GMT; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: mDhBeFyD=00; Expires=Sat, 11-Apr-2015 17:01:07 GMT; Path=/
Location: /angebot/St-James&%23x3F;s-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Connection: close
Content-Type: text/html

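Note the Location header in both responses.
In the first case, the browser simply navigates to the new URL that goo.gl tells it to go to.
In the second case, the browser sends the ' character as-is in its GET request and is then redirected to a new URL that contains &%23x3F; in its place.
So it is the hotelreservierung.de server itself that decides to encode the ' character as &%23x3F; in its URLs. The browser is not doing this.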
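
If you want to check where a redirect points without a browser in the way, the following Python sketch may help. It is only an illustration: `redirect_target` is a hypothetical helper, and the response text is a shortened copy of the second trace above, not a live request. It reads the Location header from a raw 3xx response and resolves it against the requested URL, as a browser would for a relative Location. For comparison, the last line shows what plain percent-encoding of the apostrophe would look like.

```python
from email.parser import Parser
from typing import Optional
from urllib.parse import quote, urljoin


def redirect_target(request_url: str, raw_response: str) -> Optional[str]:
    """Return the absolute URL a 3xx response redirects to, or None."""
    status_line, _, header_text = raw_response.partition("\n")
    parts = status_line.split()
    if len(parts) < 2 or not parts[1].startswith("3"):
        return None  # not a redirect
    headers = Parser().parsestr(header_text, headersonly=True)
    location = headers.get("Location")
    if location is None:
        return None
    # A relative Location is resolved against the URL that was requested.
    return urljoin(request_url, location.strip())


requested = ("http://www.hotelreservierung.de/angebot/"
             "St-James's-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957")
response = (
    "HTTP/1.1 301 Moved Permanently\n"
    "Server: Apache/2\n"
    "Location: /angebot/St-James&%23x3F;s-Club-Morgan-Bay-Saint-Lucia/Hotel-4432957\n"
    "Content-Length: 20\n"
    "\n"
)

print(redirect_target(requested, response))
print(quote("St-James's-Club"))  # ordinary percent-encoding of the apostrophe
```

The first line printed is the server-chosen URL with &%23x3F; in it; the second shows %27, which is what ordinary URL encoding of ' produces.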