google-api-python-client bug with media downloading

Using google-api-python-client==1.6.2

import io

from googleapiclient.http import MediaIoBaseDownload

fh = io.BytesIO()
request = self.drive_service.files().export_media(
    fileId='1fwshPVKCACXgNxJtmGN94X-9RRrukiDs9q4s-n0nGlM',
    mimeType='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
downloader = MediaIoBaseDownload(fh, request, chunksize=1024)
done = False
while not done:
    status, done = downloader.next_chunk()
    print("Download ", status.progress(), downloader._progress, downloader._total_size, done)

Output:

Download  0.0 973060 None False
Download  0.0 1946120 None False
Download  0.0 2919180 None False
Download  0.0 3892240 None False
Download  0.0 4865300 None False
Download  0.0 5838360 None False
Download  0.0 6811420 None False
Download  0.0 7784480 None False
Download  0.0 8757540 None False
...

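The downloaded file is 973060 bytes. So the library ignored the chunksize parameter, fetched the whole file on every call, and never stopped. The loop is endless.

So, can anyone tell me whether I am asking too much or the library is just that bad?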

How about the following sample?

Sample:

# execute() returns the exported file as bytes
content = self.drive_service.files().export_media(
    fileId='1fwshPVKCACXgNxJtmGN94X-9RRrukiDs9q4s-n0nGlM',
    mimeType='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
).execute()
with open('sample.docx', 'wb') as f:
    f.write(content)

I'm sorry if this doesn't work.

The google-api-python-client library has a bug where downloads will never be considered done if the Content-length or Content-range header is missing.

And because drive.files.export does not support chunked downloads, it does not return a Content-length or Content-range header.
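To make the failure mode concrete, here is a rough model of the completion check. It is a simplified sketch and not the library's actual source: FakeDownloader, run, and the header dictionaries are hypothetical names made up for illustration, and the fake "server" simply returns the whole 973060-byte payload on every call, as the output in the question suggests.

```python
# Simplified model of the completion check, NOT the library's actual source.
class FakeDownloader:
    """Hypothetical stand-in: the fake server sends the whole file on every request."""

    def __init__(self, payload, headers):
        self.payload = payload      # bytes returned for each request
        self.headers = headers      # response headers returned for each request
        self.progress = 0
        self.total_size = None
        self.done = False

    def next_chunk(self):
        self.progress += len(self.payload)
        if "content-range" in self.headers:
            # e.g. "bytes 0-1023/973060": the total follows the slash
            self.total_size = int(self.headers["content-range"].rsplit("/", 1)[1])
        elif "content-length" in self.headers:
            self.total_size = int(self.headers["content-length"])
        # Without either header total_size stays None, so this is never true.
        if self.progress == self.total_size:
            self.done = True
        return self.progress, self.total_size, self.done


def run(headers, max_calls=3):
    """Call next_chunk() up to max_calls times and collect the results."""
    downloader = FakeDownloader(b"x" * 973060, headers)
    results = []
    for _ in range(max_calls):
        results.append(downloader.next_chunk())
        if results[-1][2]:
            break
    return results


no_headers = run({})                             # progress grows, done stays False
with_length = run({"content-length": "973060"})  # done after the first call
```

In this model the header-less run accumulates 973060, 1946120, 2919180 with a total size of None, which matches the pattern in the question's output.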

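You can simply call execute on the HttpRequest to download the file, because drive.files.export always exports the whole file in a single request.

If you still want to use MediaIoBaseDownload as a more general workaround, you can check whether MediaDownloadProgress.total_size is None: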

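import io

from googleapiclient.http import MediaIoBaseDownload

# service, file_id and mime_type are assumed to be defined by the caller
fh = io.BytesIO()
request = service.files().export_media(fileId=file_id, mimeType=mime_type)
downloader = MediaIoBaseDownload(fh, request)

done = False
while not done:
    status, done = downloader.next_chunk()
    if status.total_size is None:
        # Size unknown, so the first response already held the whole file.
        # https://github.com/google/google-api-python-client/issues/15
        done = True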
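If you want to reuse that check, it could be wrapped in a small helper along these lines. This is only a sketch: download_all and max_chunks are hypothetical names, not part of the library, and the stub classes exist only so the helper can be tried without network access. It relies on the assumption stated above that a response without a known total size already contains the whole file.

```python
def download_all(downloader, max_chunks=10000):
    """Call next_chunk() until done, or until the total size is unknown.

    max_chunks is a hypothetical safety cap so the loop can never run forever.
    """
    for _ in range(max_chunks):
        status, done = downloader.next_chunk()
        if done or status.total_size is None:
            return status
    raise RuntimeError("download did not finish within max_chunks calls")


class StubStatus:
    """Stand-in for MediaDownloadProgress with only the attribute used here."""

    def __init__(self, total_size):
        self.total_size = total_size


class StubDownloader:
    """Stand-in that behaves like the buggy case: never reports done."""

    def __init__(self):
        self.calls = 0

    def next_chunk(self):
        self.calls += 1
        return StubStatus(None), False


stub = StubDownloader()
final_status = download_all(stub)  # returns after a single next_chunk() call
```

With a real MediaIoBaseDownload object you would pass the downloader in the same way and read the data from the BytesIO buffer afterwards.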