Python: extract uncompressed data from a 7z file
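I have several csv files, some compressed and some not, all inside a single 7z archive. I want to read the csv files and store their contents in a database. However, whenever py7zlib tries to read data from a csv file that is actually stored uncompressed, I get the error "data error during decompression".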
import os
import py7zlib

scr = r'Y:\PathtoArchive'
z7file = 'ArchiveName.7z'
with open(os.path.join(scr, z7file), 'rb') as f:
    archive = py7zlib.Archive7z(f)
    names = archive.filenames
    for mem in names:
        obj = archive.getmember(mem)
        print(obj.compressed)  # prints None for uncompressed data
        try:
            data = obj.read()
        except Exception as er:
            # prints "data error during decompression"
            # whenever obj.compressed is None
            print(er)
The error is raised here:
  File "C:\Anaconda\lib\site-packages\py7zlib.py", line 608, in read
    data = getattr(self, decoder)(coder, data, level)
  File "C:\Anaconda\lib\site-packages\py7zlib.py", line 671, in _read_lzma
    return self._read_from_decompressor(coder, dec, input, level, checkremaining=True, with_cache=True)
  File "C:\Anaconda\lib\site-packages\py7zlib.py", line 646, in _read_from_decompressor
    tmp = decompressor.decompress(data)
ValueError: data error during decompression
So, how can I extract uncompressed data from a 7z archive?
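Although I could not work out what the actual problem is, I found a workaround for my end goal, which is getting the data out of the csv files in the 7z archive.
7-Zip ships with a command-line tool. By driving that tool through the subprocess module, I can automatically extract the files I want without any problems: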
import subprocess
import py7zlib

archiveman = r'c:\Program Files\7-zip\7z'  # 7z.exe comes with 7-zip
archivepath = r'C:\Path\to\archive.7z'
with open(archivepath, 'rb') as f:
    archive = py7zlib.Archive7z(f)
    names = archive.filenames
    for name in names:
        # 'e' extracts the named member into the directory given by -o
        _ = subprocess.check_output([archiveman, 'e', archivepath, '-o{}'.format(r'C:\Destination\of\copy'), name])
The other commands that can be used with 7z are listed in the 7-Zip command-line documentation.
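If py7zlib is only being used to list the member names, the same workaround can probably be done with a single 7z call and a wildcard. This is a minimal sketch, not something tested against the archive in the question: the function names build_extract_command and extract_csv are made up for illustration, and the paths you pass in are your own. It relies on the 7z switches e (extract without directory names), -o (output directory), -r (recurse) and -y (answer yes to prompts).

```python
import subprocess


def build_extract_command(seven_zip, archive_path, dest_dir, pattern='*.csv'):
    """Build the argument list for '7z e' (hypothetical helper, not part of any library)."""
    return [
        seven_zip,
        'e',                      # extract, ignoring directory names stored in the archive
        archive_path,
        '-o{}'.format(dest_dir),  # output directory; no space between -o and the path
        pattern,                  # wildcard is matched by 7z itself, not by the shell
        '-r',                     # also match files in subdirectories inside the archive
        '-y',                     # assume "yes" on prompts such as overwrite
    ]


def extract_csv(seven_zip, archive_path, dest_dir):
    """Run 7z and return its console output; raises CalledProcessError on a non-zero exit code."""
    return subprocess.check_output(
        build_extract_command(seven_zip, archive_path, dest_dir))
```

Passing the arguments as a list, as in the answer above, avoids quoting problems with the space in "Program Files".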
You can also try another library, py7zr, which supports compressing, decompressing, encrypting, and decrypting 7zip archives:
https://pypi.org/project/py7zr
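For what it is worth, a rough sketch of how that could look with py7zr, assuming it is installed (pip install py7zr). I have not run this against the archive from the question, so whether it copes with the uncompressed members there is an assumption. select_csv and extract_csv_members are illustrative names, not part of py7zr; the only py7zr calls used are SevenZipFile, getnames and extract.

```python
import os


def select_csv(names):
    """Return only the member names that look like csv files (hypothetical helper)."""
    return [n for n in names if n.lower().endswith('.csv')]


def extract_csv_members(archive_path, dest_dir):
    """Extract every csv member of a 7z archive into dest_dir and return the extracted paths."""
    # Third-party dependency; imported here so select_csv stays usable without it.
    import py7zr

    with py7zr.SevenZipFile(archive_path, mode='r') as archive:
        targets = select_csv(archive.getnames())
        if targets:
            # Members keep their stored relative paths under dest_dir.
            archive.extract(path=dest_dir, targets=targets)
    return [os.path.join(dest_dir, name) for name in targets]


# Example call with the placeholder paths from the answer above:
# extract_csv_members(r'C:\Path\to\archive.7z', r'C:\Destination\of\copy')
```

The extracted files can then be read with the csv module and written to the database as usual.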