Extracting a .ppm.bz2 from a custom path to a custom path

Question

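As the title says, I have several folders containing several .ppm.bz2 files, and I want to extract them in exactly the location where they are, using Python.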

Directory structure image

This is how I traverse the folders:

 import tarfile
 import os
 path = '/Users/ankitkumar/Downloads/colorferet/dvd1/data/images/'
 folders = os.listdir(path)
 for folder in folders:  #the folders starting like 00001
     if not folder.startswith("0"):
         pass
     path2 = path + folder
     zips = os.listdir(path2)
     for zip in zips:
         if not zip.startswith("0"):
             pass
         path3 = path2+"/"+zip

         fh = tarfile.open(path3, 'r:bz2')
         outpath = path2+"/"
         fh.extractall(outpath)
         fh.close

Then I get this error:

Traceback (most recent call last):
  File "ZIP.py", line 16, in <module>
    fh = tarfile.open(path3, 'r:bz2')
  File "/anaconda2/lib/python2.7/tarfile.py", line 1693, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/anaconda2/lib/python2.7/tarfile.py", line 1778, in bz2open
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/anaconda2/lib/python2.7/tarfile.py", line 1723, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/anaconda2/lib/python2.7/tarfile.py", line 1587, in __init__
    self.firstmember = self.next()
  File "/anaconda2/lib/python2.7/tarfile.py", line 2370, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header


Answer 1

The tarfile module is for tar archives, including tar.bz2. If your file is not a tar archive, you should use the bz2 module directly.
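To see which case applies to a given file, here is a minimal sketch. The helper name `decompress_bz2` and the sample file contents are made up for illustration; `tarfile.is_tarfile` and `bz2.BZ2File` are the standard library calls. A bare bzip2 file fails the tar check, which is the same condition that produces the `ReadError` in the question.

```python
import bz2
import os
import shutil
import tarfile
import tempfile


def decompress_bz2(src, dst):
    """Decompress a plain (non-tar) bzip2 file src into dst."""
    with bz2.BZ2File(src) as fh, open(dst, 'wb') as fw:
        shutil.copyfileobj(fh, fw)


# Demo on a throwaway file; the name and contents are hypothetical.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'sample.ppm.bz2')
dst = os.path.join(workdir, 'sample.ppm')
payload = b'P6\n1 1\n255\n\x00\x00\x00'

with bz2.BZ2File(src, 'wb') as fh:
    fh.write(payload)

# A bare .bz2 file is not a tar archive, so tarfile cannot open it.
is_tar = tarfile.is_tarfile(src)

# Decompress it with bz2 instead and read the result back.
decompress_bz2(src, dst)
with open(dst, 'rb') as f:
    restored = f.read()
```

If `is_tar` comes back true for your files, they really are tar.bz2 archives and `tarfile.open(..., 'r:bz2')` is the right tool; otherwise decompress them with bz2 as above.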

Also, try using os.walk instead of multiple listdir calls, since it walks the whole tree for you:

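import os
import bz2
import shutil

# Root of the tree to scan (the path from the question).
root = '/Users/ankitkumar/Downloads/colorferet/dvd1/data/images/'

for path, dirs, files in os.walk(root):
    for filename in files:
        basename, ext = os.path.splitext(filename)
        if ext.lower() != '.bz2':
            continue  # leave every non-.bz2 file alone
        fullname = os.path.join(path, filename)
        newname = os.path.join(path, basename)  # x.ppm.bz2 -> x.ppm
        # BZ2File works on Python 2.7 and 3; bz2.open needs Python 3.3+.
        with bz2.BZ2File(fullname) as fh, open(newname, 'wb') as fw:
            shutil.copyfileobj(fh, fw)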

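This decompresses every .bz2 file in all subfolders, writing each result next to its source file. All other files are left untouched. If the uncompressed file already exists, it will be overwritten.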
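If overwriting is a concern, one possible variation is sketched below. The function name `decompress_tree` and its `overwrite` parameter are hypothetical and not part of the original answer; the sketch skips any output file that already exists unless told otherwise, and returns the list of files it wrote.

```python
import bz2
import os
import shutil


def decompress_tree(root, overwrite=False):
    """Decompress every .bz2 file under root next to its source.

    Returns the list of files written. Existing outputs are skipped
    unless overwrite is True.
    """
    written = []
    for path, dirs, files in os.walk(root):
        for filename in files:
            basename, ext = os.path.splitext(filename)
            if ext.lower() != '.bz2':
                continue  # not a bzip2 file, leave it alone
            newname = os.path.join(path, basename)
            if os.path.exists(newname) and not overwrite:
                continue  # keep the existing file untouched
            with bz2.BZ2File(os.path.join(path, filename)) as fh, \
                    open(newname, 'wb') as fw:
                shutil.copyfileobj(fh, fw)
            written.append(newname)
    return written
```

Calling `decompress_tree(root)` with the images folder from the question as `root` would then only create files that are missing, and `decompress_tree(root, overwrite=True)` restores the overwrite behavior described above.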

Please back up your data before running destructive code.

python

extract

tarfile

bzip2

bz2