Python requests over HTTPS: 403 without Burp Suite, but 200 when going through it

Question

I am currently trying to scrape retailmenot.com. So far my code looks like this:

import requests
from collections import OrderedDict

s = requests.session()

s.headers = OrderedDict()
s.headers["Connection"] = "close"
s.headers["Upgrade-Insecure-Requests"] = "1"
s.headers["User-Agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"
s.headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
s.headers["Sec-Fetch-Site"] = "none"
s.headers["Sec-Fetch-Mode"] = "navigate"
s.headers["Sec-Fetch-Dest"] = "document"
s.headers["Accept-Encoding"] = "gzip, deflate"
s.headers["Accept-Language"] = "en-GB,en-US;q=0.9,en;q=0.8"

s.get("https://www.retailmenot.com/sitemap/A")

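When I use this code, I am immediately redirected to a Cloudflare page. However, whenever I route my traffic through Burp Suite by replacing the last line of my code with this one: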

s.get("https://www.retailmenot.com/sitemap/A", proxies = {"https":"https://127.0.0.1:8080"}, verify ="/Users/Downloads/cacert (1).pem")

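I get straight through to the website. I find this a bit strange and would like to know whether someone can explain why this happens, and whether there is a way to get a similar result using a different certificate (to use the Burp Suite certificate I have to keep the application open). Many thanks!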

Answer 1

The problem appears to lie in the TLS behavior of the underlying client.

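I have an older Python build that uses OpenSSL 1.1.1b and a newer one that uses OpenSSL 1.1.1f. The request fails with the first and works with the second. This would also explain why it works through Burp: Burp makes the TLS connection to the server itself, so the server sees Burp's slightly different TLS behavior rather than Python's.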
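To see which OpenSSL build a given Python interpreter is linked against, the standard `ssl` module exposes the version. A minimal sketch (the helper name `describe_openssl` is made up for illustration):

```python
import ssl


def describe_openssl():
    """Return the OpenSSL version string and version tuple of this interpreter."""
    return ssl.OPENSSL_VERSION, ssl.OPENSSL_VERSION_INFO


version_text, version_info = describe_openssl()
print(version_text)   # human-readable version string of the linked library
print(version_info)   # the same information as a tuple of integers
```

Running this in both the working and the non-working environment shows whether they really use different OpenSSL builds.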

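I tried to pin down the difference: making the non-working version use the ciphers of the working version did not help. The other major difference is the set of supported signature algorithms. In fact, with the mentioned OpenSSL 1.1.1b (and also with the newer version that ships with Anaconda Python), the difference can be reduced to the sigalgs: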

 $ openssl s_client -connect www.retailmenot.com:443 -crlf
 ...[various output]...
 <paste the expected HTTP request>
 ...
 HTTP/1.1 403 Forbidden

 $ openssl s_client -connect www.retailmenot.com:443 -crlf -sigalgs 'ECDSA+SHA256'
 ...[various output]...
 <paste the expected HTTP request>
 ...
 HTTP/1.1 200 OK

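Unfortunately, I know of no way to set the signature algorithms of the TLS stack directly from Python requests. The API is not exposed and the defaults are simply used, so whether the request fails or succeeds depends on how OpenSSL was built.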
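As a rough way to see which knobs the standard `ssl` module offers in a given interpreter, one can list the public `SSLContext` attributes by name. This is only a probe, and what it prints depends on the Python version in use:

```python
import ssl

# Public attributes and methods of ssl.SSLContext in this interpreter.
context_attrs = [name for name in dir(ssl.SSLContext) if not name.startswith("_")]

# Anything that deals with cipher suites (set_ciphers, get_ciphers, ...).
cipher_knobs = [name for name in context_attrs if "cipher" in name]

# Anything that deals with signature algorithms; may be empty.
sigalg_knobs = [name for name in context_attrs if "sigalg" in name]

print("cipher-related:", cipher_knobs)
print("sigalg-related:", sigalg_knobs)
```

If the second list is empty, the signature algorithms cannot be set directly on the context in that interpreter.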

But it looks like the value can be set indirectly by specifying a different security level:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.ssl_ import create_urllib3_context

# OpenSSL cipher string; @SECLEVEL=2 selects OpenSSL security level 2.
CIPHERS = 'DEFAULT:@SECLEVEL=2'


class CipherAdapter(HTTPAdapter):
    """Transport adapter whose TLS context is built from CIPHERS."""

    def init_poolmanager(self, *args, **kwargs):
        # Used for direct connections.
        kwargs['ssl_context'] = create_urllib3_context(ciphers=CIPHERS)
        return super().init_poolmanager(*args, **kwargs)

    def proxy_manager_for(self, *args, **kwargs):
        # Used when the request goes through a proxy.
        kwargs['ssl_context'] = create_urllib3_context(ciphers=CIPHERS)
        return super().proxy_manager_for(*args, **kwargs)


s = requests.session()
# Apply the adapter only to URLs under this prefix.
s.mount('https://www.retailmenot.com/', CipherAdapter())
...
print(s.get("https://www.retailmenot.com/sitemap/A"))

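Together with the specific header settings, this resulted in <Response [200]> in my tests, whereas the same Python version without the changed security level resulted in <Response [403]>.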
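The same idea can be tried without requests, using only the standard library. The following is a sketch under assumptions: the helper names `make_context` and `fetch_status` are invented here, and I have not verified that this variant gets a 200 from the site. It only shows how a context with the security level from the adapter above can be built and passed to `urllib`:

```python
import ssl
import urllib.error
import urllib.request


def make_context(ciphers="DEFAULT:@SECLEVEL=2"):
    """Build a verifying TLS context that uses the given OpenSSL cipher string."""
    context = ssl.create_default_context()
    context.set_ciphers(ciphers)
    return context


def fetch_status(url, context, headers=None):
    """Return the HTTP status code for url; needs network access when called."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, context=context, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code instead.
        return err.code


context = make_context()
cipher_names = [cipher["name"] for cipher in context.get_ciphers()]
print(len(cipher_names), "cipher suites enabled")

# Example call (not executed here because it needs network access):
# fetch_status("https://www.retailmenot.com/sitemap/A", context)
```

Note that the cipher list printed here may look the same with and without `@SECLEVEL=2`; according to the answer above, the relevant change is in the signature algorithms, which this listing does not show.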

python

ssl

certificate

python-requests

burp