Speed up multiple downloads with urllib2
I'm downloading multiple SMI files from a database called ZINC with some fairly simple code I wrote. However, given the file sizes (a few kB each) and my internet connection, it doesn't seem very fast.
Is there any way to speed it up?
import urllib2

def job(url):
    '''Open the URL and download an SMI file from ZINC15.'''
    u = urllib2.urlopen(url)  # open the URL
    print 'downloading ' + url  # report which file is being downloaded
    with open('output.smi', 'a') as local_file:
        local_file.write(u.read())

with open('data.csv') as flist:
    urls = ['http://zinc15.docking.org/substances/{}.smi'.format(line.rstrip()) for line in flist]

map(job, urls)
import threading
import Queue  # in Python 2 the module is named Queue (it is queue in Python 3)

MAX_THREADS = 10
url_queue = Queue.Queue()  # holds the URLs shared across threads

def downloadFile():
    while True:
        try:
            u = url_queue.get_nowait()  # grab the next URL without blocking
        except Queue.Empty:
            return  # queue drained, this thread is done
        job(u)

for url in urls:  # the list built from data.csv above
    url_queue.put(url)

for i in range(MAX_THREADS):
    t = threading.Thread(target=downloadFile)
    t.start()
Basically, it imports the threading and Queue modules; the Queue object holds the data shared across the threads, and each thread runs the downloadFile() function.
It should be easy to follow, but let me know if anything is unclear.
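One caveat the answer leaves out: with several threads calling job() at once, the appends to the shared output.smi can interleave mid-write. Below is a minimal sketch of one way to handle this, guarding the file with a threading.Lock and waiting for all workers with join(); the write_lock name and the restructured job() are illustrative additions, not part of the original answer.

import threading
import urllib2

write_lock = threading.Lock()  # hypothetical lock; serializes appends to the shared file

def job(url):
    '''Download one SMI file and append it to output.smi under the lock.'''
    u = urllib2.urlopen(url)
    print 'downloading ' + url
    data = u.read()  # fetch outside the lock so downloads still overlap
    with write_lock:  # only one thread writes at a time
        with open('output.smi', 'a') as local_file:
            local_file.write(data)

threads = []
for i in range(MAX_THREADS):
    t = threading.Thread(target=downloadFile)
    t.start()
    threads.append(t)
for t in threads:
    t.join()  # block until every queued URL has been processed

Keeping the network read outside the lock means only the fast local file append is serialized, so the downloads themselves still run concurrently.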