Why does urllib.parse.urlencode not change '_' into %5F?

I'm writing a script that sends POST requests to a game. For the POST I use the usual req = urllib.request.Request(url=url, data=params, headers=headers). I start with a dictionary of the data the request needs, which I have to encode with params = urllib.parse.urlencode(OrderedDict[])

However, this gives me a string, but not the right one! It gives me:

&x=_1&y_=2&_z_=3

But the way the game encodes it, it should be:

&x=%5F1&y%5F=2&%5Fz%5F=3

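So the underscores are not being encoded as "%5F". How can I fix this? Alternatively, I have the parameters the game uses (taken from the URL, already encoded); can I use those in the request's data field?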

Underscores don't need to be encoded, because they are valid characters in a URL.

According to RFC 1738:

Unsafe:

Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs. The characters < and > are unsafe because they are used as the delimiters around URLs in free text; the quote mark (") is used to delimit URLs in some systems. The character # is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. The character % is unsafe because it is used for encodings of other characters. Other characters are unsafe because gateways and other transport agents are known to sometimes modify such characters. These characters are {, }, |, \, ^, ~, [, ], and `.

All unsafe characters must always be encoded within a URL.

So _ isn't replaced with %5F for the same reason a isn't replaced with %61: it simply isn't necessary. Web servers don't (or shouldn't) care either way.
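To illustrate the point, here is a small sketch that decodes both strings from the question (without the leading &) with the standard library's parse_qsl and compares the results. A well-behaved server should see the same key/value pairs either way; whether the game's server is well-behaved is of course something only a test against it can tell.

```python
from urllib.parse import parse_qsl, unquote

# The two spellings from the question, minus the leading '&'.
plain = "x=_1&y_=2&_z_=3"
escaped = "x=%5F1&y%5F=2&%5Fz%5F=3"

# Decode each query string into a list of (key, value) pairs.
decoded_plain = parse_qsl(plain)
decoded_escaped = parse_qsl(escaped)

print(decoded_plain)                     # [('x', '_1'), ('y_', '2'), ('_z_', '3')]
print(decoded_plain == decoded_escaped)  # True

# %5F is just another spelling of '_', like %61 is of 'a'.
print(unquote("%5F"), unquote("%61"))    # _ a
```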

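If the web server you are talking to really does care (but check first that this is actually the case), you will have to do some manual work, because urllib's quoting does not support quoting _: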

urllib.parse.quote(string, safe='/', encoding=None, errors=None)

Replace special characters in string using the %xx escape. Letters, digits, and the characters _.- are never quoted.
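A quick sketch of what that means in practice: the safe parameter can only add characters to the never-quoted set, so there is no argument that makes quote() escape an underscore. The sample string here is made up for illustration.

```python
from urllib.parse import quote

sample = "a_b.c-d e/f"  # hypothetical input, chosen to show each case

# Default safe='/' leaves the slash alone; the space becomes %20.
default_quoted = quote(sample)
print(default_quoted)            # a_b.c-d%20e/f

# safe='' additionally escapes the slash, but '_', '.' and '-' still pass through.
strict_quoted = quote(sample, safe="")
print(strict_quoted)             # a_b.c-d%20e%2Ff

# Even with nothing marked safe, the underscore is never quoted.
print(quote("_", safe=""))       # _
```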

You can probably wrap quote() in your own function and pass that to urlencode(). Something like this (completely untested):

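import urllib.parse

def extra_quote(*args, **kwargs):
    # Quote as usual, then escape the underscores that quote() always leaves alone.
    quoted = urllib.parse.quote(*args, **kwargs)
    return quoted.replace('_', '%5F')

# query is the dict (or sequence of pairs) to encode; quote_via needs Python 3.5+.
urllib.parse.urlencode(query, quote_via=extra_quote)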
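For completeness, here is a self-contained variant run against the sample data from the question. It is a sketch under a few assumptions: the dictionary contents are reconstructed from the strings in the question, the wrapper is built on quote_plus (urlencode's default) so that spaces would still become +, and the result is encoded to bytes because urllib.request.Request expects bytes for data in Python 3. Note that urlencode does not emit the leading & shown in the question.

```python
import urllib.parse


def extra_quote(string, safe="", encoding=None, errors=None):
    """Like quote_plus(), but also escapes underscores as %5F."""
    quoted = urllib.parse.quote_plus(string, safe, encoding, errors)
    return quoted.replace("_", "%5F")


# Sample data reconstructed from the question (an assumption).
query = {"x": "_1", "y_": "2", "_z_": "3"}

# urlencode calls extra_quote for every key and every value.
body = urllib.parse.urlencode(query, quote_via=extra_quote)
print(body)  # x=%5F1&y%5F=2&%5Fz%5F=3

# Request(data=...) wants bytes, so encode the ASCII-only result.
data = body.encode("ascii")
```

A string that is already encoded the way the game expects could presumably be sent the same way, by calling .encode("ascii") on it and passing the result as data, which skips urlencode entirely.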