How to break out of loop when URL read is taking too long

Question

Hello, I have the following code to skip a particular URL if reading it takes too long.

import time
from urllib.request import urlopen

timeout = 30
# urlz is the list of URLs to read
for i in range(len(urlz)):
    timeout_start = time.time()
    webpage = urlopen(urlz[i]).read()
    if time.time() > timeout_start + timeout:
        continue

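My question is: won't the program execute the line "webpage = urlopen(urlz[i]).read()" before moving down to check the if condition? In that case, I don't think it will detect that the page is taking too long to read (more than 30 seconds). Basically, if the program gets stuck for 30 seconds (i.e. we have run into a problem reading this particular URL), I want to skip that URL and move on to the next one.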

Answer 1

The urlopen() function has a built-in timeout parameter:

urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)

So in your code:

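import socket
import urllib.error
from urllib.request import urlopen

timeout = 30
for i in range(len(urlz)):
    try:
        webpage = urlopen(urlz[i], timeout=timeout).read()
    except (urllib.error.URLError, socket.timeout):
        # The request failed or timed out; skip this URL
        continue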
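
One caveat worth noting: the timeout argument applies to each blocking socket operation (connecting, each read), not to the download as a whole, so a server that trickles data slowly could still keep a single URL going for more than 30 seconds in total. If an overall limit matters, a rough sketch along these lines might work. It reads the response in chunks and checks the elapsed time between chunks. The helper name read_with_deadline and its parameters deadline_seconds and chunk_size are made up for illustration, and the data: URL in the usage part is only a placeholder that needs no network.

```python
import socket
import time
import urllib.error
from urllib.request import urlopen


def read_with_deadline(url, deadline_seconds=30, chunk_size=8192):
    """Return the body of url as bytes, or None if it fails or runs too long.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    start = time.monotonic()
    chunks = []
    try:
        # The socket timeout still guards each individual connect/read call.
        with urlopen(url, timeout=deadline_seconds) as response:
            while True:
                # Give up once the total elapsed time passes the deadline.
                if time.monotonic() - start > deadline_seconds:
                    return None
                chunk = response.read(chunk_size)
                if not chunk:
                    break  # end of the response body
                chunks.append(chunk)
    except (urllib.error.URLError, socket.timeout):
        # Connection error, unsupported URL, or a per-operation timeout.
        return None
    return b"".join(chunks)


if __name__ == "__main__":
    urlz = ["data:text/plain,hello"]  # placeholder list of URLs
    for url in urlz:
        webpage = read_with_deadline(url)
        if webpage is None:
            continue  # skip this URL and move on to the next one
        print(len(webpage), "bytes read from", url)
```

Because the clock is only checked between chunks, one stalled read can still block for up to deadline_seconds, so the total time is bounded only approximately.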


timedelay

python-3.x