Log in to a website using urllib instead of http.client

Question

I am trying to log in to a website from Python using urllib, with the following code:

import urllib.parse
import urllib.request
headers = {"Content-type": "application/x-www-form-urlencoded"}
payload = urllib.parse.urlencode({"username": "USERNAME-HERE",
                                  "password": "PASSWORD-HERE",
                                  "redirect": "index.php",
                                  "sid": "",
                                  "login": "Login"}).encode("utf-8")
request = urllib.request.Request("https://osu.ppy.sh/forum/ucp.php?mode=login", payload, headers)
response = urllib.request.urlopen(request)
data = response.read()

# print the HTML after the request
print(bytes(str(data), "utf-8").decode("unicode_escape"))

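I know the usual advice is to just use the Requests library, and I have tried that, but I specifically want to know how to do this without Requests.

The behaviour I am after can be reproduced with the following code, which logs in to the site successfully using http.client: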

import urllib.parse
import http.client
headers = {"Content-type": "application/x-www-form-urlencoded"}
payload = urllib.parse.urlencode({"username": "USERNAME-HERE",
                                  "password": "PASSWORD-HERE",
                                  "redirect": "index.php",
                                  "sid": "",
                                  "login": "Login"})
conn = http.client.HTTPSConnection("osu.ppy.sh")
conn.request("POST", "/forum/ucp.php?mode=login", payload, headers)
response = conn.getresponse()
data = response.read()

# print the HTML after the request
print(bytes(str(data), "utf-8").decode("unicode_escape"))

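It looks to me as if the urllib code is not "delivering" the payload, while the http.client code is.

The payload does seem to get through in some form, because supplying a wrong username and password draws an error response from the server, but supplying the correct username and password seems to have no effect.

Any insights? Am I overlooking something?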

Answer 1

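Add a cookie jar and drop the explicit headers; urllib does not need them, because it adds Content-Type: application/x-www-form-urlencoded on its own when you pass data. The likely culprit is that a plain urlopen() call does not store cookies, so the session cookie set by the login response is lost when urllib follows the redirect: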

import http.cookiejar
import urllib.parse
import urllib.request

jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

payload = urllib.parse.urlencode({"username": "USERNAME-HERE",
                                  "password": "PASSWORD-HERE",
                                  "redirect": "index.php",
                                  "sid": "",
                                  "login": "Login"}).encode("utf-8")
response = opener.open("https://osu.ppy.sh/forum/ucp.php?mode=login", payload)
data = response.read()

# print the HTML after the request (assumes the page is UTF-8 encoded)
print(data.decode("utf-8", errors="replace"))
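
If you want to see the effect without hitting the real site, here is a minimal offline sketch. It assumes the login endpoint behaves the way many forum logins do: it sets a session cookie and answers with a redirect. FakeLoginHandler, the /login and /index paths, and the sid=abc123 cookie are all made up for the demo and are not how osu.ppy.sh actually works. The empty ProxyHandler is only there so the local requests bypass any configured proxy.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeLoginHandler(BaseHTTPRequestHandler):
    """Hypothetical login endpoint: sets a cookie on POST, then redirects."""

    seen_content_types = []  # Content-Type of every POST the server received

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the form body
        self.seen_content_types.append(self.headers.get("Content-Type"))
        self.send_response(302)
        self.send_header("Set-Cookie", "sid=abc123; Path=/")
        self.send_header("Location", "/index")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # The redirected page only counts as "logged in" if the cookie comes back.
        logged_in = "sid=abc123" in self.headers.get("Cookie", "")
        body = b"logged in" if logged_in else b"not logged in"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the OS pick a free port for the throwaway server.
server = HTTPServer(("127.0.0.1", 0), FakeLoginHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/login"
payload = urllib.parse.urlencode({"username": "u", "password": "p"}).encode("utf-8")

no_proxy = urllib.request.ProxyHandler({})

# Without a cookie processor the redirect is followed, but the cookie is dropped.
plain_opener = urllib.request.build_opener(no_proxy)
plain_result = plain_opener.open(url, payload).read().decode("utf-8")

# With a cookie jar the Set-Cookie value is stored and sent on the redirect.
jar = http.cookiejar.CookieJar()
jar_opener = urllib.request.build_opener(
    no_proxy, urllib.request.HTTPCookieProcessor(jar)
)
jar_result = jar_opener.open(url, payload).read().decode("utf-8")

server.shutdown()
server.server_close()

print("without cookie jar:", plain_result)
print("with cookie jar:   ", jar_result)
print("Content-Type sent: ", FakeLoginHandler.seen_content_types)
```

Keep using the same opener for any later requests that need the logged-in session, since the cookies live in its jar. http.client behaves differently here because it does not follow redirects, so it hands you the server's first response directly.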
