Python requests module: URL encoded from params differs from the intended URL
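I ran into a URL encoding problem with the requests module in a Python project.
Here are two differently encoded versions of the same parameter, captured from Wireshark packets:
1. 0900+%28%EB%8C%80%ED%95%9C%EB%AF%BC%EA%B5%AD+%ED%91%9C%EC%A4%80%EC%8B%9C%29
2. 0900%20(%EB%8C%80%ED%95%9C%EB%AF%BC%EA%B5%AD%20%ED%91%9C%EC%A4%80%EC%8B%9C)
'1' is the URL as encoded by the Python requests module, and '2' is the URL from a packet sent by a web browser.
When I decode them, both show the same UTF-8 text.
It seems that spaces and parentheses are handled differently between the two.
Is there any way to change '1' into '2'?
Here is the code I use to send the request: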
import requests

_url = "http://something"
_headers = {
    'Accept': 'text/javascript',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'ko-KR',
    'Connection': 'keep-alive',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36',
    'X-Requested-With': 'XMLHttpRequest'
}
_params = {
    'action': 'log',
    'datetime': '0900 (대한민국 표준시)'
}
# This is the request part
session = requests.Session()
res = session.get(_url, headers=_headers, params=_params)
You can encode your _params manually to build the query string, and then concatenate it to your _url.
You can use urllib.parse.urlencode [Python-Docs] to convert your _params dictionary to a percent-encoded ASCII text string. The resulting string is a series of key=value pairs separated by & characters, where both key and value are quoted using the quote_via function. By default, quote_plus() is used to quote the values, which means spaces are quoted as a + character and / characters are encoded as %2F, which follows the standard for GET requests (application/x-www-form-urlencoded). An alternate function that can be passed as quote_via is quote(), which will encode spaces as %20 and not encode / characters. For maximum control of what is quoted, use quote and specify a value for safe.
from urllib.parse import quote, urlencode
import requests

url_template = "http://something/?{}"
_headers = { ... }
_params = {"action": "log", "datetime": "0900 (대한민국 표준시)"}
# quote() encodes spaces as %20; safe="()" leaves the parentheses unencoded
_url = url_template.format(urlencode(_params, safe="()", quote_via=quote))
response = requests.get(_url, headers=_headers)
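To see the difference without sending a request, the two encodings can be compared side by side using only the standard library. This is a minimal sketch: the variable names form_style and browser_style are made up for illustration, and the parameters are the ones from the question.

```python
from urllib.parse import quote, unquote, unquote_plus, urlencode

params = {"action": "log", "datetime": "0900 (대한민국 표준시)"}

# Default behaviour: quote_plus() turns spaces into "+" and
# percent-encodes "(" and ")" (form '1' in the question).
form_style = urlencode(params)

# quote() turns spaces into "%20"; safe="()" leaves the
# parentheses as literal characters (form '2' in the question).
browser_style = urlencode(params, safe="()", quote_via=quote)

print(form_style)
print(browser_style)

# Both forms decode back to the same text. "+" only means a space
# in the form-encoded variant, so that one needs unquote_plus().
form_value = form_style.split("datetime=")[1]
browser_value = browser_style.split("datetime=")[1]
print(unquote_plus(form_value) == unquote(browser_value))
```

Since both strings decode to the same value, a server that parses query strings in the usual way should treat them identically; the manual encoding only matters if the receiving side compares the raw URL.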