urllib.request.urlretrieve gets stuck when retrieving a single image from a site

Python code (Python 3):

import time
import urllib.request

from config.dev import CONTENT_IMAGE_UPLOAD

directory = CONTENT_IMAGE_UPLOAD + "en_" + time.strftime('%Y%m%d')
filename = "sample.jpg"
try:
    urllib.request.urlretrieve("https://www.miamiherald.com/latest-news/wfeh98/picture238148999/alternates/LANDSCAPE_1140/Screenshot%20(150).png", directory + "/" + filename)
    print("image is saved")
except Exception as e:
    print(e)

I expected to get the image in less than a minute, but the call takes far too long and then prints the following message:

[Errno 60] Operation timed out

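I am sure the image exists, because I get it when I copy and paste the URL into a browser. However, the URL seems to contain some special characters in this part: Screenshot%20(150).png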

How can I solve this error?

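You should add a User-Agent header to get around this problem. I never use urllib directly; I usually use requests because it is easier for me. If you want, you can implement the same idea with urllib, but you would need to look that up. Here is some sample code: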

import time
import requests

# from config.dev import CONTENT_IMAGE_UPLOAD
headers = {"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"}
# directory = CONTENT_IMAGE_UPLOAD + "en_" + time.strftime('%Y%m%d')
filename = "sample.jpg"
try:
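    # Send a browser-like User-Agent; the timeout keeps the call from hanging indefinitely
    resp = requests.get("https://www.miamiherald.com/latest-news/wfeh98/picture238148999/alternates/LANDSCAPE_1140/Screenshot%20(150).png", headers=headers, timeout=30)
    resp.raise_for_status()  # fail on HTTP error responses instead of saving an error page
    with open(filename, "wb") as f:
        f.write(resp.content)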
    print("image is saved")
except Exception as e:
    print(e)
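
For anyone who prefers to stay with urllib, the same idea can be sketched roughly as follows. This is only a minimal sketch, not a tested fix for this particular site: the helper names build_request and download_image and the 30-second timeout are my own assumptions, and the User-Agent string is simply the one used in the requests example.

```python
import shutil
import urllib.request

# Assumption: any common browser User-Agent string should do; this one is
# copied from the requests example in this answer.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
)


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download_image(url, dest_path, timeout=30):
    """Stream the response body to dest_path (hypothetical helper).

    urlopen raises an exception on HTTP error statuses and when the
    timeout (in seconds) expires, so the caller can catch and report it.
    """
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest_path
```

Calling download_image(url, "sample.png") inside the same try/except as in the question should then either save the file or fail quickly with an error, instead of hanging until the operating system gives up.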

This may also help you :)

Changing User Agent in Python 3 for urrlib.request.urlopen
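
As for the worry about special characters: the file name part of the URL looks already percent-encoded, so it is probably not what causes the timeout. A small sketch using only urllib.parse (the variable names are mine) shows that %20 is just an encoded space and that the parentheses can stay literal:

```python
from urllib.parse import quote, unquote

# The last path segment of the URL from the question.
encoded_name = "Screenshot%20(150).png"

# Decoding reveals the real file name: %20 stands for a space.
decoded_name = unquote(encoded_name)
print(decoded_name)  # Screenshot (150).png

# Re-encoding while keeping the parentheses literal gives back the same text,
# so the segment in the question is already in encoded form.
reencoded_name = quote(decoded_name, safe="()")
print(reencoded_name)  # Screenshot%20(150).png
```

Since decoding and re-encoding round-trips cleanly, the URL can be passed to urllib or requests as it is, without quoting it a second time.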