How to handle urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>?

I have the following small Python script:

from urllib.request import urlopen

def download_file(url):
    fp = open(url.split("=")[-1] + ".pdf", 'wb')
    req = urlopen(url)
    CHUNK = 20480
    chunk = req.read(CHUNK)
    fp.write(chunk)
    fp.close()

for i in range(1, 10000, 1):
    download_file("__some__url__" + str(i))

print("Done.")

I kept this script running, but after a while (say, after downloading about 100 files) it fails for some reason with the error: urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>

How can I modify my code to handle this error, i.e. so that it does not stop the script, but instead waits for the connection to recover and then continues downloading from where it left off?

PS: I know it only downloads 20 KB from each URL.

For possible causes of this error, see python: [Errno 10054] An existing connection was forcibly closed by the remote host. After reading the top-voted answer there, my conclusion is that this error can happen at any time, and your code should be prepared for it.

I would wrap the download in a try: ... except ... block inside a loop, increasing the delay before each retry of a failed connection:

import time
import urllib.error
from urllib.request import urlopen

def download_file(url):
    # prefer a with-block when exceptions are expected
    with open(url.split("=")[-1] + ".pdf", 'wb') as fp:
        delay = 5
        max_retries = 3
        for _ in range(max_retries):
            try:
                req = urlopen(url)
                CHUNK = 20480
                chunk = req.read(CHUNK)
                fp.write(chunk)
                break                # do not loop after a successful download
            except urllib.error.URLError:
                time.sleep(delay)
                delay *= 2           # exponential backoff before the next retry
        else:                        # only reached when every attempt failed
            print(f"Failed for {url} after {max_retries} attempts")
            # or, if you want to abort the script:
            # raise Exception(f"Failed for {url} after {max_retries} attempts")
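To see the for/else retry pattern in isolation, here is a small self-contained sketch; the flaky fetch function, the URL, and the helper name `fetch_with_retries` are made up for illustration (a short delay is used so the demo runs quickly):

```python
import time
import urllib.error

def fetch_with_retries(fetch, url, max_retries=3, delay=0.01):
    """Call fetch(url), retrying with exponential backoff on URLError."""
    for _ in range(max_retries):
        try:
            return fetch(url)          # success exits the loop immediately
        except urllib.error.URLError:
            time.sleep(delay)
            delay *= 2                 # double the wait before the next attempt
    # the for-loop finished without returning: every attempt failed
    raise RuntimeError(f"Failed for {url} after {max_retries} attempts")

# hypothetical flaky server: fails twice, then succeeds
calls = {"n": 0}
def flaky(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise urllib.error.URLError("connection reset")
    return b"%PDF-1.4 fake content"

data = fetch_with_retries(flaky, "http://example.com/doc=1")
print(calls["n"], len(data))
```

The key point is that a successful fetch returns (or, in the answer above, breaks) out of the loop, while exhausting all retries falls through to the failure path, so one transient WinError 10054 no longer kills the whole script.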