Question About Loop Optimization and Speed

Question

Here is the gist of my code:

import re
import requests

while int(price) > targetPrice:
    try:
        details = requests.get(url, headers=headers).text
        # Each pattern captures the digits that follow its marker string.
        var1 = int(re.search(r'desired-string(\d+)', details).group(1))
        var2 = int(re.search(r'desired-string(\d+)', details).group(1))
        var3 = int(re.search(r'desired-string(\d+)', details).group(1))
    except (AttributeError, ValueError):
        # AttributeError is raised when re.search finds no match and returns None.
        print('Error')

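Essentially, I have a loop that keeps fetching a web page and scraping the data I need. My problem is that I need this loop to run as fast as possible. One iteration takes 0.33 seconds on average, and I want that number as low as possible. The information I am fetching changes frequently, and I need to pick it up the moment it changes.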

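I found that the reason it takes so long is the request I am making. A lot of HTML comes back when I only need about 5 lines, all from the same place in the HTML. Is there a way to make the request fetch only specific lines of the HTML and ignore everything I do not need?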

The HTML being fetched comes from this page: https://www.roblox.com/catalog/6803405665/Gucci-Dionysus-Bag

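Multithreading is not really what I am after, because the goal is to make the loop iterate as fast as possible. As far as I know and have tested, multithreading only lets the loops run concurrently, but each iteration still takes 0.33 seconds.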

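I think this is an optimization problem, if anything. Any help would be greatly appreciated. If any further information is needed, let me know and I will provide it.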

Answer 1

The first thing I would try is requests.Session. According to the documentation (https://2.python-requests.org/en/master/user/advanced/#session-objects):

The Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the Session instance, and will use urllib3’s connection pooling. So if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase (see HTTP persistent connection).

Instantiate the session outside the while loop:

import re
import requests

# Create the session once so the underlying TCP connection is reused.
s = requests.Session()
while int(price) > targetPrice:
    try:
        details = s.get(url, headers=headers).text
        var1 = int(re.search(r'desired-string(\d+)', details).group(1))
        var2 = int(re.search(r'desired-string(\d+)', details).group(1))
        var3 = int(re.search(r'desired-string(\d+)', details).group(1))
    except (AttributeError, ValueError):
        # AttributeError is raised when re.search finds no match and returns None.
        print('Error')
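
To check whether the session actually helps in your case, you could time both variants. The following is only a rough sketch: average_latency and compare are hypothetical helper names I made up, the number of rounds is arbitrary, and the results will depend on your network and on the server.

```python
import time
from statistics import mean


def average_latency(fetch, rounds=20):
    """Call fetch() `rounds` times and return the mean seconds per call."""
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        fetch()
        samples.append(time.perf_counter() - start)
    return mean(samples)


def compare(url, headers=None, rounds=20):
    """Print the mean latency of plain requests.get versus a reused Session."""
    # Imported lazily so average_latency itself needs only the standard library.
    import requests

    plain = average_latency(lambda: requests.get(url, headers=headers).text, rounds)
    with requests.Session() as s:
        # Warm-up request: open the connection before the timed calls start.
        s.get(url, headers=headers)
        pooled = average_latency(lambda: s.get(url, headers=headers).text, rounds)
    print(f"requests.get: {plain:.3f}s  Session.get: {pooled:.3f}s")
```

Calling compare(url, headers) with the url and headers from your loop should show how much of the 0.33 seconds is connection setup. If the two numbers are close, the time is probably spent on the server or on transferring the page, and the session alone will not help much.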

If that is not enough, maybe move to asynchronous requests: https://pypi.org/project/aiohttp/
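
A minimal sketch of what the asynchronous version might look like, under some assumptions: poll_until, parse_price, the worker count and the stagger interval are all names and values I invented for illustration, and the regex is the placeholder from the question. Overlapping requests does not make any single request faster; the idea is only that with several requests in flight, a fresh response arrives more often than once every 0.33 seconds. Error handling and rate limiting are left out, so a worker whose fetch raises simply stops.

```python
import asyncio
import re

PRICE_RE = re.compile(r"desired-string(\d+)")  # placeholder pattern from the question


def parse_price(html):
    """Return the first matched number as an int, or None if the pattern is absent."""
    match = PRICE_RE.search(html)
    return int(match.group(1)) if match else None


async def poll_until(fetch, target_price, workers=4, stagger=0.08):
    """Run `workers` staggered polling loops; return the first price <= target_price."""
    found = asyncio.get_running_loop().create_future()

    async def worker(delay):
        # Offset the start times so the responses arrive interleaved.
        await asyncio.sleep(delay)
        while not found.done():
            price = parse_price(await fetch())
            if price is not None and price <= target_price and not found.done():
                found.set_result(price)

    tasks = [asyncio.create_task(worker(i * stagger)) for i in range(workers)]
    try:
        return await found
    finally:
        # Stop the remaining workers once a result is in.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def main(url, headers, target_price):
    # Third-party dependency, imported lazily so the helpers above run without it.
    import aiohttp

    async with aiohttp.ClientSession(headers=headers) as session:

        async def fetch():
            async with session.get(url) as resp:
                return await resp.text()

        print(await poll_until(fetch, target_price))
```

You would start it with asyncio.run(main(url, headers, targetPrice)). Keep in mind that more requests in flight means more load on the site, so the worker count should stay small.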

python

optimization

python-requests