Best way to download files simultaneously with Python?

Question

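I'm trying to send GET requests concurrently using Python's requests module.

While looking for a solution I came across many different approaches, including grequests, gevent.monkey, requests-futures, threading, multiprocessing ...

When it comes to speed and code readability, I'm a bit overwhelmed and not sure which one to choose.

The task is to download fewer than 400 files from the same server as quickly as possible. Ideally it should print the download status to the terminal, e.g. an error or success message for each request.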

Answer 1

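I would use threading, because downloading is I/O-bound and there is no need to spread it over multiple cores the way multiprocessing does.
So write a function that calls requests.get() and start it as a thread.

But keep in mind that your internet connection has to be fast enough, otherwise it won't be worth it.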
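
For illustration, here is a minimal sketch of that idea, not a definitive implementation. The names fetch, start_download and results are made up for this example, and so are the get parameter (only there so you can swap in a stand-in for requests.get) and the 10-second timeout.

```python
import threading


def fetch(url, results, get=None):
    """Download one URL, print a status line, and record success in `results`."""
    if get is None:
        import requests  # third-party package; imported lazily (assumed installed)
        get = requests.get
    try:
        response = get(url, timeout=10)  # the timeout value is an arbitrary choice
        response.raise_for_status()      # treat HTTP 4xx/5xx responses as errors
        print(f"OK    {url} ({len(response.content)} bytes)")
        results[url] = True
    except Exception as exc:
        print(f"ERROR {url}: {exc}")
        results[url] = False


def start_download(url, results, get=None):
    """Start fetch() in its own thread and return the thread so it can be joined."""
    thread = threading.Thread(target=fetch, args=(url, results, get))
    thread.start()
    return thread
```

You would call start_download(url, results) once per URL, keep the returned threads in a list, and join() each of them before reading results.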

Answer 2

import threading

import requests


def download(webpage):
    # Fetch the resource. Whatever else you need to do to download it goes here.
    requests.get(webpage)


# Populate with the resources you wish to download.
urls = ['https://www.example.com', 'https://www.google.com', 'https://yahoo.com']
threads = {}

if __name__ == '__main__':
    # Create one thread per URL, keyed by that URL.
    for url in urls:
        print(url)
        threads[url] = threading.Thread(target=download, args=(url,))
    # Start every thread, then wait for each of them to finish.
    for thread in threads.values():
        thread.start()
    for thread in threads.values():
        thread.join()
    print('successfully done.')

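The code above contains a function called download, which stands for whatever code you need to run to download your resource. Next comes a list of the URLs you want to download - change these values as needed. From that list, a dictionary of threads is built, keyed by URL. This way you can have as many URLs as you like in the urls list, and a separate thread is created for each of them. The threads are all started and then joined.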
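
With a few hundred URLs, one thread per URL may be more than you want to throw at a single server. As a possible variation (a different technique from the code above, not part of it), the standard-library concurrent.futures.ThreadPoolExecutor caps the number of worker threads, and as_completed lets you print a success or error line as each request finishes. This is only a sketch: fetch_bytes, download_all, the get parameter, max_workers=8 and the 10-second timeout are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_bytes(url, get=None):
    """Return the body of `url` as bytes; raise on network or HTTP errors."""
    if get is None:
        import requests  # third-party package; imported lazily (assumed installed)
        get = requests.get
    response = get(url, timeout=10)  # the timeout value is an arbitrary choice
    response.raise_for_status()
    return response.content


def download_all(urls, max_workers=8, get=None):
    """Download all URLs with a bounded thread pool, printing one status line each.

    Returns a dict mapping each URL to its bytes, or to the exception it raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so the status line can name it.
        futures = {pool.submit(fetch_bytes, url, get): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
                print(f"OK    {url} ({len(results[url])} bytes)")
            except Exception as exc:
                results[url] = exc
                print(f"ERROR {url}: {exc}")
    return results
```

Calling download_all(urls) with the urls list from above would print one line per URL in the order the requests finish, which is not necessarily the order of the list.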

python

networking

multithreading

download

python-requests