Is there any way to stop pycurl perform() execution?
I'm currently working on a simple multithreaded downloader with pycurl and Python. Right now I can pause a download on demand, but that just holds everything in memory, and if the process gets killed I can't simply resume. So I came up with a workaround (not implemented yet): stop the download, save the byte position (which I can get from the progress callback) to an XML file, and then resume the download from there by requesting a range starting at that byte.
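Roughly what I have in mind for the save/resume part is something like this (just a sketch, nothing is implemented yet; the state-file name and the helper functions are made up for illustration):

```
# Sketch of the planned stop/resume bookkeeping (not implemented yet).
# The file name "state.xml" and both helpers are illustrative only.
import xml.etree.ElementTree as ET

import pycurl


def save_offset(part_no, byte_pos, path="state.xml"):
    # Persist how far one part got, so it survives the process being killed.
    root = ET.Element("download")
    ET.SubElement(root, "part", number=str(part_no), offset=str(byte_pos))
    ET.ElementTree(root).write(path)


def resume_part(url, part_file, start, end, path="state.xml"):
    # Read back the saved offset and request only the remaining bytes,
    # appending them to the existing .part file.
    offset = int(ET.parse(path).getroot().find("part").get("offset"))
    with open(part_file, "ab") as f:
        curl = pycurl.Curl()
        curl.setopt(curl.URL, url)
        curl.setopt(curl.RANGE, f"{start + offset}-{end}")
        curl.setopt(curl.WRITEDATA, f)
        curl.perform()
        curl.close()
```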
Where I'm stuck is that I need to close the connection while perform() is running. Since perform() blocks, I thought I could call close() from another thread, but that just raises an exception. I can't seem to find anything in the pycurl or libcurl documentation that would help me do what I want.
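For reference, this is roughly the failing attempt, simplified down to the essentials (the URL is just a placeholder):

```
# Simplified version of the attempt: close() the handle from another
# thread while perform() is blocking in a worker thread.
import threading

import pycurl

curl = pycurl.Curl()
curl.setopt(curl.URL, "http://example.com/file.bin")  # placeholder URL
curl.setopt(curl.WRITEDATA, open("file.bin", "wb"))

t = threading.Thread(target=curl.perform)
t.start()

curl.close()  # in my tests this just raises an exception mid-transfer
t.join()
```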
As for why I'm using pycurl over something simpler like requests: I've been using pycurl for basic automation for a while now and I'm used to it. If there's no way to do what I want with pycurl, then I guess requests would be the last resort.
```
import os
import re
import threading

import pycurl


class Downloader:
    def __init__(self, url, parts):
        self.url = url
        self.fileName = re.search(r"(?:[^/][\d\w.]+)+$", self.url, flags=re.IGNORECASE).group(0)
        self.parts = parts
        self.pause = False
        self.fileSize = round(self._getSize())
        self.partSize = round(self.fileSize / self.parts)
        self.threads = list()
        self.curlObjs = list()

    # Get the file size by downloading only the header and then calling getinfo for the length.
    def _getSize(self):
        curl = pycurl.Curl()
        curl.setopt(curl.URL, self.url)
        curl.setopt(curl.FOLLOWLOCATION, True)
        curl.setopt(curl.NOBODY, True)
        curl.perform()
        fileSize = curl.getinfo(curl.CONTENT_LENGTH_DOWNLOAD)
        curl.close()
        return fileSize

    # Track individual file part download progress.
    def _trackProgress(self, totalDown, currentDown, totalUp, currentUp):
        pass  # TODO
        # if currentDown != 0 and currentDown == totalDown:
        #     print(f"Download Completed!\n{currentDown}/{totalDown}")

    # Calculate the part ranges, run _downloadRange in separate threads,
    # and merge the file parts once every download has finished.
    def download(self):
        partStart = 0
        partEnd = self.partSize
        for part in range(1, self.parts + 1):
            t = threading.Thread(target=self._downloadRange, args=(partStart, partEnd, part))
            self.threads.append(t)
            t.start()
            partStart += self.partSize + 1 if part == 1 else self.partSize
            partEnd += self.partSize
        for t in self.threads:
            t.join()
        self._mergeFiles(self.fileName)

    # Download the specified range and write it to a file part.
    def _downloadRange(self, startRange, endRange, fileNo):
        with open(f"{self.fileName}{fileNo}.part", "wb") as f:
            curl = pycurl.Curl()
            self.curlObjs.append(curl)
            curl.setopt(curl.URL, self.url)
            curl.setopt(curl.FOLLOWLOCATION, True)
            curl.setopt(curl.RANGE, f"{startRange}-{endRange}")
            curl.setopt(curl.WRITEDATA, f)
            curl.setopt(curl.NOPROGRESS, False)
            curl.setopt(curl.XFERINFOFUNCTION, self._trackProgress)
            curl.perform()
            curl.close()

    # Merge the file parts into one file and delete the parts.
    def _mergeFiles(self, fileName):
        with open(fileName, "wb") as o:
            for part in range(1, self.parts + 1):
                with open(f"{self.fileName}{part}.part", "rb") as p:
                    o.write(p.read())
                os.remove(f"{self.fileName}{part}.part")
```