zipfile.BadZipFile: Bad CRC-32 when extracting a password-protected .zip, and the extracted file ends up corrupt
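I am trying to extract a password-protected .zip that contains a .txt document (call it Congrats.txt for this case). Congrats.txt has text in it, so it is not 0 KB in size. It is placed in a .zip (call it zipv1.zip for the sake of this thread) with the password dominique. That password is stored among other words and names in another .txt (which we'll call file.txt for the sake of this question).
Now if I run the code below with python Program.py -z zipv1.zip -f file.txt (assuming all these files are in the same folder as Program.py), my program displays dominique as the correct password for zipv1.zip among the words/passwords in file.txt and extracts zipv1.zip, but Congrats.txt is empty and 0 KB in size.
My code is as follows: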
import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of file.txt.")
args = parser.parse_args()

def extract_zip(zip_filename, password):
    try:
        zip_file = zipfile.ZipFile(zip_filename)
        zip_file.extractall(pwd=password)
        print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass

def main(zip, file):
    if (zip == None) | (file == None):
        # If the args are not used, it displays how to use them to the user.
        print(parser.usage)
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Allows 8 instances of Python to be ran simultaneously.
    with multiprocessing.Pool(8) as pool:
        # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])

if __name__ == '__main__':
    main(args.zip, args.file)
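However, if I create another zip (zipv2.zip) using the same method as zipv1.zip, the only difference being that Congrats.txt sits in a folder that is zipped together with it, I get the same results as with zipv1.zip, but this time Congrats.txt is extracted along with the folder it was in and is intact; both its text and its size are preserved.
To solve this, I read zipfile's documentation, where I found that a RuntimeError is raised if the password doesn't match the .zip. So I changed except: in the code to except RuntimeError: and got this error when trying to unzip zipv1.zip: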
(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv1.zip -f file.txt
[+] Password for the .zip: dominique
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
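The result was the same, though: the password was found in file.txt and zipv1.zip was extracted, but Congrats.txt was empty and 0 KB in size. So I ran the program again, this time with zipv2.zip, and got this: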
(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv2.zip -f file.txt
[+] Password for the .zip: dominique
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
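Again the same result as before: the folder was extracted successfully, and Congrats.txt was extracted with its text and its size intact.
I did look at this similar thread, as well as this thread, but they didn't help. I also checked zipfile's documentation, and it was of no help with this issue.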
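EDIT
Now, after implementing with zipfile.ZipFile(zip_filename, 'r') as zip_file:, for some unknown and weird reason the program can read/process a small word list/password list/dictionary but can't if it is large(?).
What I mean is: zipv1.zip contains a .txt document named Congrats.txt with the text You have cracked the .zip!. The same .txt is present in zipv2.zip, but this time placed in a folder named ZIP Contents, which was then zipped/password protected. The password is dominique for both zips.
Note that each .zip was generated with the Deflate compression method and ZipCrypto encryption in 7zip.
That password is on Line 35 of John The Ripper Jr.txt (35 of 52 lines) and on Line 1968 of John The Ripper.txt (1968 of 3106 lines).
Now if you run python Program.py -z zipv1 -f "John The Ripper Jr.txt" in your CMD (or IDE of your choice), it creates a folder named Extracted and places Congrats.txt in it, containing the sentence we set earlier. The same goes for zipv2, except that Congrats.txt ends up in the ZIP Contents folder inside the Extracted folder. There is no trouble extracting the .zips in this instance.
But if you try the same thing with John The Ripper.txt, i.e. python Program.py -z zipv1 -f "John The Ripper.txt" in your CMD (or IDE of your choice), it creates the Extracted folder for both zips, just like with John The Ripper Jr.txt, but this time Congrats.txt is empty for both of them for some unknown reason.
My code and all the necessary files are as follows: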
import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack.", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()

def extract_zip(zip_filename, password):
    try:
        with zipfile.ZipFile(zip_filename, 'r') as zip_file:
            zip_file.extractall('Extracted', pwd=password)
            print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass

def main(zip, file):
    if (zip == None) | (file == None):
        # If the args are not used, it displays how to use them to the user.
        print(parser.usage)
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Allows 8 instances of Python to be ran simultaneously.
    with multiprocessing.Pool(8) as pool:
        # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])

if __name__ == '__main__':
    # Program.py -z zipname.zip -f filename.txt
    main(args.zip, args.file)
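I am not sure why this happens and cannot find an answer to this issue anywhere. As far as I can tell it is completely unknown, and I can't find a way to debug or solve it.
This keeps happening regardless of the word/password list. I tried generating more .zips with the same Congrats.txt but with different passwords from different word lists/password lists/dictionaries. Same approach: I used larger and smaller versions of the .txt and got the same results as above.
BUT I did find that if I cut out the first 2k words in John The Ripper.txt and make a new .txt, say John The Ripper v2.txt, the .zip is extracted successfully, the Extracted folder appears and Congrats.txt is present with its text in it. So I believe it has to do with the lines after the password, in this case Line 1968: perhaps the script doesn't stop after Line 1968? I'm not sure why this works, though. It isn't a solution, but a step towards one, I guess...
EDIT 2
So I tried using "pool terminating" code: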
import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack using", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()

def extract_zip(zip_filename, password, queue):
    try:
        with zipfile.ZipFile(zip_filename, "r") as zip_file:
            zip_file.extractall('Extracted', pwd=password)
            print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
            queue.put("Done")  # Signal success
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass

def main(zip, file):
    if (zip == None) | (file == None):
        print(parser.usage)  # If the args are not used, it displays how to use them to the user.
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Create a Queue
    manager = multiprocessing.Manager()
    queue = manager.Queue()
    with multiprocessing.Pool(8) as pool:  # Allows 8 instances of Python to be ran simultaneously.
        pool.starmap_async(extract_zip, [(zip, line.strip(), queue) for line in txt_file])  # "starmap" expands the tuples as 3 separate arguments to fit "extract_zip"
        pool.close()
        queue.get(True)  # Wait for a process to signal success
        pool.terminate()  # Terminate the pool
        pool.join()

if __name__ == '__main__':
    main(args.zip, args.file)  # Program.py -z zip.zip -f file.txt.
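Now if I use this, both zips are extracted successfully, just like in the previous instances. BUT this time zipv1.zip's Congrats.txt is intact and has the message inside it. The same can't be said for zipv2.zip, whose Congrats.txt is still empty.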
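Sorry for the long wait... It seems you ran into a rather confusing situation.
Recap:
- A password-protected .zip file is being processed
- It is attacked by brute force (ciobaneste), using passwords from a file
- The correct password is in that file, but in spite of that some files are still not extracted correctly
1. Investigation
The scenario is complex (pretty far from an MCVE, I'd say), and there are many things that could be blamed for the behavior.
Start with the zipv1.zip / zipv2.zip mismatch. On a closer look, it seems zipv2 is messed up as well. It is easy to spot for zipv1 (Congrats.txt is the only file), but in zipv2, "ZIP Contents/Black-Large.png" ends up with size 0.
It is reproducible with any file, and even more: it applies to the 1st entry (that is not a directory) returned by zf.namelist.
So, things start to get a bit clearer:
- The file contents are being unpacked, because dominique is present in the password file (no idea what happens up to that point)
- At some later point, the 1st entry of the .zip is truncated to 0 bytes
Looking at the exceptions thrown when attempting to extract a file with a wrong password, there are 3 types (of which the last 2 can be grouped together):
- RuntimeError: Bad password for file ...
- Others:
    - zlib.error: Error -3 while decompressing data ...
    - zipfile.BadZipFile: Bad CRC-32 for file ...
I created an archive file of my own. For consistency, I'll use it from now on, but everything applies to any other file as well.
- Contents:
    - DummyFile0.txt (10 bytes) - contains: 0123456789
    - DummyFile1.txt (10 bytes) - contains: 0000000000
    - DummyFile2.txt (10 bytes) - contains: AAAAAAAAAA
- The 3 files were archived with the Total Commander (9.21a) internal zip packer and protected with the password dominique (zip2.0 encryption). The resulting archive (named arc0.zip, though the name is not relevant) is 392 bytes long
code.py: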
#!/usr/bin/env python3

import sys
import os
import zipfile

def main():
    arc_name = sys.argv[1] if len(sys.argv) > 1 else "./arc0.zip"
    pwds = [
        #b"dominique",
        #b"dickhead",
        b"coco",
    ]
    pwds = [item.strip() for item in open("orig/John The Ripper.txt.orig", "rb").readlines()]
    print("Unpacking (password protected: dominique) {:s},"
          " using a list of predefined passwords ...".format(arc_name))
    if not os.path.isfile(arc_name):
        raise SystemExit("Archive file must exist!\nExiting.")
    faulty_pwds = list()
    good_pwds = list()
    with zipfile.ZipFile(arc_name, "r") as zip_file:
        print("Zip names: {:}\n".format(zip_file.namelist()))
        for idx, pwd in enumerate(pwds):
            try:
                zip_file.extractall("Extracted", pwd=pwd)
            except:
                exc_cls, exc_inst, exc_tb = sys.exc_info()
                if exc_cls != RuntimeError:
                    print("Exception caught when using password ({:d}): [{:}] ".format(idx, pwd))
                    print("    {:}: {:}".format(exc_cls, exc_inst))
                    faulty_pwds.append(pwd)
            else:
                print("Success using password ({:d}): [{:}] ".format(idx, pwd))
                good_pwds.append(pwd)
    print("\nFaulty passwords: {:}\nGood passwords: {:}".format(faulty_pwds, good_pwds))

if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
Output:
[cfati@CFATI-5510-0:e:\Work\Dev\Whosebug\q054532010]> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py arc0.zip
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32
Unpacking (password protected: dominique) arc0.zip, using a list of predefined passwords ...
Zip names: ['DummyFile0.txt', 'DummyFile1.txt', 'DummyFile2.txt']
Exception caught when using password (1189): [b'mariah']
<class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (1446): [b'zebra']
<class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (1477): [b'1977']
<class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Success using password (1967): [b'dominique']
Exception caught when using password (2122): [b'hank']
<class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2694): [b'solomon']
<class 'zlib.error'>: Error -3 while decompressing data: invalid distance code
Exception caught when using password (2768): [b'target']
<class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (2816): [b'trish']
<class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2989): [b'coco']
<class 'zlib.error'>: Error -3 while decompressing data: invalid stored block lengths
Faulty passwords: [b'mariah', b'zebra', b'1977', b'hank', b'solomon', b'target', b'trish', b'coco']
Good passwords: [b'dominique']
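Looking at the ZipFile.extractall code, it tries to extract all the members. The 1st one raises an exception, so it starts to become clearer why it behaves the way it does. But why is there a difference in behavior when attempting to extract the items using 2 different wrong passwords?
As the tracebacks of the 2 different exception types show, the answer lies somewhere at the end of ZipFile.open.
After more investigation, it turns out that it is because of a
2. Collision caused by the zip encryption weakness
According to [UT.CS]: dmitri-report-f15-16.pdf - Password-based encryption in ZIP files (the (last) emphasis is mine):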
3.1 Traditional PKWARE encryption
The original encryption scheme, commonly referred to as the PKZIP cipher, was designed by
Roger Schaffely [1]. In [5] Biham and Kocher showed that the cipher is weak and demonstrated
an attack requiring 13 bytes of plaintext. Further attacks have been developed, some of which
require no user provided plaintext at all [6].
The PKZIP cipher is essentially a stream cipher, i.e. input is encrypted by generating a pseudo-
random key stream and XOR-ing it with the plaintext. The internal state of the cipher consists
of three 32-bit words: key0, key1 and key2. These are initialized to 0x12345678, 0x23456789 and
0x34567890, respectively. A core step of the algorithm involves updating the three keys using a
single byte of input...
...
Before encrypting a file in the archive, 12 random bytes are first prepended to its compressed
contents and the resulting bytestream is then encrypted. Upon decryption, the first 12 bytes
need to be discarded. According to the specification, this is done in order to render a plaintext
attack on the data ineffective.
The specification also states that out of the 12 prepended bytes, only the first 11 are actually
random, the last byte is equal to the high order byte of the CRC-32 of the uncompressed
contents of the file. This gives the ability to quickly verify whether a given password is correct
by comparing the last byte of the decrypted 12 byte header to the high order byte of the actual
CRC-32 value that is included in the local file header. This can be done before decrypting the
rest of the file.
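Other references:
Algorithm weakness: because the differentiation is done on a single byte only, for 256 different (and carefully chosen) wrong passwords there will be (at least) one that generates the same number as the correct password.
The algorithm discards most of the wrong passwords, but there are some that it doesn't.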
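To get a feel for the numbers, here is a rough estimate. It assumes that a wrong password produces a uniformly distributed check byte, which is an assumption made for illustration and not something the report states. The list size of 3106 comes from the question; the function name is hypothetical.

```python
# Rough estimate of how many wrong passwords slip past the 1-byte check.
# Assumption: a wrong password yields a uniformly random check byte.
CHECK_BYTE_VALUES = 256


def expected_false_positives(wrong_password_count):
    """Expected number of wrong passwords whose check byte matches by chance."""
    return wrong_password_count / CHECK_BYTE_VALUES


# John The Ripper.txt has 3106 lines, one of which is the right password.
wrong_passwords = 3106 - 1
estimate = expected_false_positives(wrong_passwords)
print("~{:.1f} wrong passwords expected to pass the check".format(estimate))
```

For a list of that size the estimate is about a dozen, the same order of magnitude as the 8 faulty passwords reported in the output above.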
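Back to the problem. When attempting to extract a file using a password:
- If the "hash" computed from the last byte of the file's decrypted 12-byte header differs from the high order byte of the file CRC, an exception is thrown
- But if they are equal:
    - A new file stream is opened for writing (emptying the file if it already exists)
    - Decompression is attempted:
        - For a wrong password (one that passed the above check), decompression fails (but the file has already been emptied)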
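The truncation step can be reproduced without zipfile at all. The sketch below is only an illustration of the mechanism: the helper name is hypothetical and a ValueError stands in for the real decompression error. It opens an existing file for writing and fails before any byte is written.

```python
import os
import tempfile


def simulate_failed_extraction(target_path):
    """Open the target for writing, then fail before any data is written."""
    try:
        with open(target_path, "wb"):
            # Stand-in for the decompression error a faulty password triggers.
            raise ValueError("simulated decompression failure")
    except ValueError:
        pass  # The caller swallows the error, like a bare `except:` does.


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "Congrats.txt")
    with open(target, "wb") as fh:
        fh.write(b"You have cracked the .zip!")
    size_before = os.path.getsize(target)

    simulate_failed_extraction(target)
    size_after = os.path.getsize(target)

print("size before:", size_before, "size after:", size_after)
```

Opening with mode "wb" empties the file immediately, so a file that was extracted correctly earlier is left at 0 bytes once a later attempt fails after this point.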
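As the output of code.py shows, for my (.zip) file there are 8 passwords that mess it up. Note that:
- The results are different for each archive file
- The member file name and content are relevant (at least for the 1st one). Changing any of them yields different results (for the "same" archive file)
Here's a test based on the data from my .zip file: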
>>> import zipfile
>>>
>>> zd_coco = zipfile._ZipDecrypter(b"coco")
>>> zd_dominique = zipfile._ZipDecrypter(b"dominique")
>>> zd_other = zipfile._ZipDecrypter(b"other")
>>> cipher = b'\xd1\x86y ^\xd77gRzZ\xee' # Member (1st) file cipher: 12 bytes starting from archive offset 44
>>>
>>> crc = 2793719750 # Member (1st) file CRC - archive bytes: 14 - 17
>>> hex(crc)
'0xa684c7c6'
>>> for zd in (zd_coco, zd_dominique, zd_other):
... print(zd, [hex(zd(c)) for c in cipher])
...
<zipfile._ZipDecrypter object at 0x0000021E8DA2E0F0> ['0x1f', '0x58', '0x89', '0x29', '0x89', '0xe', '0x32', '0xe7', '0x2', '0x31', '0x70', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E160> ['0xa8', '0x3f', '0xa2', '0x56', '0x4c', '0x37', '0xbb', '0x60', '0xd3', '0x5e', '0x84', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E128> ['0xeb', '0x64', '0x36', '0xa3', '0xca', '0x46', '0x17', '0x1a', '0xfb', '0x6d', '0x6c', '0x4e']
>>> # As seen, the last element of the first 2 arrays (coco and dominique) is 0xA6 (166), which is the same as the first byte of the CRC
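The session above relies on the private zipfile._ZipDecrypter helper, which is not a public API and may behave differently across Python versions. As a hedged, standalone sketch, the same check can be written with the key-update steps of the traditional PKZIP cipher; the function names are hypothetical, and the sketch ignores archives that derive the check byte from the timestamp instead of the CRC.

```python
def _make_crc_table():
    """Build the standard CRC-32 lookup table (polynomial 0xEDB88320)."""
    table = []
    for value in range(256):
        for _ in range(8):
            value = (value >> 1) ^ 0xEDB88320 if value & 1 else value >> 1
        table.append(value)
    return table


_CRC_TABLE = _make_crc_table()


def _crc32_step(byte, crc):
    """Feed one byte into a running CRC-32 value."""
    return (crc >> 8) ^ _CRC_TABLE[(crc ^ byte) & 0xFF]


def decrypt_header(password, header):
    """Decrypt the 12-byte encryption header with the traditional PKZIP cipher."""
    key0, key1, key2 = 0x12345678, 0x23456789, 0x34567890

    def update_keys(byte):
        nonlocal key0, key1, key2
        key0 = _crc32_step(byte, key0)
        key1 = (key1 + (key0 & 0xFF)) & 0xFFFFFFFF
        key1 = (key1 * 134775813 + 1) & 0xFFFFFFFF
        key2 = _crc32_step(key1 >> 24, key2)

    for byte in password:  # Key setup: mix in the password bytes.
        update_keys(byte)

    plain = bytearray()
    for byte in header:
        k = key2 | 2
        byte ^= ((k * (k ^ 1)) >> 8) & 0xFF  # XOR with the key stream byte.
        update_keys(byte)
        plain.append(byte)
    return bytes(plain)


def passes_check_byte(password, header, crc):
    """True if the last decrypted header byte equals the CRC's high byte."""
    return decrypt_header(password, header)[-1] == (crc >> 24) & 0xFF


header = b'\xd1\x86y ^\xd77gRzZ\xee'  # Header bytes from the session above.
crc = 0xA684C7C6                       # CRC of the 1st member, from above.
for candidate in (b"coco", b"dominique", b"other"):
    print(candidate, passes_check_byte(candidate, header, crc))
```

With the header and CRC taken from the session above, both coco and dominique pass the one-byte check, while other is rejected, so only the later decompression step can tell the two apart.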
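I ran some tests with other unpacking engines (with default parameters):
- WinRar: with a wrong password that fails the check the file is left untouched, but with a faulty (colliding) password it is truncated (same as here)
- 7-Zip: asks the user whether to overwrite the file, and overwrites it regardless of the unpacking result
- Total Commander's internal (zip) unpacker: same as #2.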
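3. Conclusion
- I consider this a zipfile bug. Specifying such a faulty (and wrong) password shouldn't overwrite the existing file (if any). Or at least, the behavior should be consistent (for all wrong passwords)
- A quick browse didn't reveal any bug filed against Python for this
- I don't see an easy fix, as:
    - The zip algorithm can't be improved (to better check whether the password is correct)
    - I thought of a couple of fixes, but they would either negatively impact performance or could introduce regressions in some (corner) cases
I have submitted [GitHub]: python/cpython - [3.6] bpo-36247: zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness), which was closed for 3.6 (that branch is in security-fixes-only mode). Not sure what its outcome will be (on the other branches), but in any case it won't be available anytime soon (in the next few months, let's say).
As an alternative, you could download the patch and apply the changes locally. Check (Patching utrunner section) for how to apply patches on Win (basically, every line that starts with one "+" sign goes in, and every line that starts with one "-" sign goes out). I am using Cygwin, btw.
You could copy zipfile.py from Python's directory to your project (or some "personal") directory and patch that file, if you want to keep your Python installation pristine.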
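On the caller side, one possible way to sidestep the truncation without patching zipfile is to validate each candidate entirely in memory and call extractall only for a password that survives decryption, decompression and the CRC-32 check for every member. This is a sketch under assumptions, not taken from the patch: the function names are hypothetical, and the caught exceptions are the three types listed earlier for Deflate-compressed members.

```python
import zipfile
import zlib


def password_opens_archive(zip_path, password):
    """Return True if every member decrypts, inflates and passes its CRC-32."""
    try:
        with zipfile.ZipFile(zip_path) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                with zf.open(info, pwd=password) as member:
                    # Read to EOF in memory; the CRC is verified at the end.
                    while member.read(1 << 16):
                        pass
        return True
    except (RuntimeError, zlib.error, zipfile.BadZipFile):
        return False


def extract_with_first_valid_password(zip_path, candidates, dest):
    """Extract with the first candidate that validates; return it, or None."""
    for password in candidates:
        if password_opens_archive(zip_path, password):
            with zipfile.ZipFile(zip_path) as zf:
                zf.extractall(dest, pwd=password)
            return password
    return None


# Hypothetical usage with the files from the question:
# with open("John The Ripper.txt", "rb") as fh:
#     words = [line.strip() for line in fh]
# print(extract_with_first_valid_password("zipv1.zip", words, "Extracted"))
```

Nothing is written to disk for a rejected candidate, so a faulty password can no longer empty a file that was already extracted. Most wrong passwords still fail fast on the check byte; only the colliding ones cost a full read of the members.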
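I ran into this problem and, after digging into it, found that the cause was the encryption method I had chosen for the zip file. I switched from ZipCrypto, the default offered by 7-Zip, to AES-256, and everything worked.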
我正在尝试提取一个受密码保护的 .zip 文件,其中有一个 .txt 文档(对于这种情况,请说 Congrats.txt
)。现在 Congrats.txt
中有文本,因此它的大小不是 0kb。它被放置在一个 .zip 中(为了线程的缘故,让我们将这个 .zip 命名为 zipv1.zip
),密码为 dominique
为了这个线程。该密码与其他单词和名称一起存储在另一个 .txt 中(为了这个问题,我们将其命名为 file.txt
)。
现在,如果我 运行 通过执行 python Program.py -z zipv1.zip -f file.txt
下面的代码(假设所有这些文件都在与 Program.py
相同的文件夹中),我的程序显示 dominique
作为file.txt
中 words/passwords 中 zipv1.zip
的正确密码并提取 zipv1.zip
但 Congrats.txt
为空且大小为 0kb.
现在我的代码如下:
import argparse
import multiprocessing
import zipfile
parser = argparse.ArgumentParser(description="Unzips a password protected .zip", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of file.txt.")
args = parser.parse_args()
def extract_zip(zip_filename, password):
try:
zip_file = zipfile.ZipFile(zip_filename)
zip_file.extractall(pwd=password)
print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
except:
# If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
pass
def main(zip, file):
if (zip == None) | (file == None):
# If the args are not used, it displays how to use them to the user.
print(parser.usage)
exit(0)
# Opens the word list/password list/dictionary in "read binary" mode.
txt_file = open(file, "rb")
# Allows 8 instances of Python to be ran simultaneously.
with multiprocessing.Pool(8) as pool:
# "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
if __name__ == '__main__':
main(args.zip, args.file)
但是,如果我使用与 zipv1.zip
相同的方法进行另一个压缩 (zipv2.zip
),唯一不同的是 Congrats.txt
位于一个文件夹中,该文件夹与 [=16= 一起压缩] 我确实得到了与 zipv1.zip
相同的结果,但这次 Congrats.txt
沿着它所在的文件夹提取,并且 Congrats.txt
完好无损;其中的文字和大小都完好无损。
所以为了解决这个问题,我尝试阅读 zipfile's documentation,在那里我发现如果密码与 .zip 不匹配,它会抛出 RuntimeError
。所以我确实将代码中的 except:
更改为 except RuntimeError:
并在尝试解压缩时出现此错误 zipv1.zip
:
(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv1.zip -f file.txt
[+] Password for the .zip: dominique
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
虽然发生了相同的结果;密码在 file.txt
中找到,zipv1.zip
已提取,但 Congrats.txt
为空且大小为 0kb。所以我再次 运行 程序,但是这次 zipv2.zip
结果是:
(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv2.zip -f file.txt
[+] Password for the .zip: dominique
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
同样的结果;成功提取文件夹的位置 Congrats.txt
也提取了其中的文本,并且其大小完好无损。
我确实看过 this 类似的线程,以及 this 线程,但它们没有帮助。我还检查了 zipfile's documentation,但对这个问题没有帮助。
编辑
现在由于某些未知和奇怪的原因实施了 with zipfile.ZipFile(zip_filename, 'r') as zip_file:
;该程序可以 read/process 一个小词 list/password list/dictionary 但如果它很大(?)则不能。
我的意思是说 zipv1.zip
中存在一个 .txt 文档;命名为 Congrats.txt
,文本为 You have cracked the .zip!
。 zipv2.zip
中也存在相同的 .txt,但这次放置在名为 ZIP Contents
的文件夹中,然后 zipped/password 受到保护。两个 zip 的密码都是 dominique
。
请注意,每个 .zip 都是使用 Deflate
压缩方法和 7zip 中的 ZipCrypto
加密生成的。
现在密码在 Line 35
(35/52 行)John The Ripper Jr.txt
和 Line 1968
中 John The Ripper.txt
(1968/3106 行)。
现在,如果您在您的 CMD 中执行 python Program.py -z zipv1 -f "John The Ripper Jr.txt"
(或您选择的 IDE);它将创建一个名为 Extracted
的文件夹,并将 Congrats.txt
放在我们之前设置的句子中。 zipv2
也是如此,但 Congrats.txt
将位于 Extracted
文件夹内的 ZIP Contents
文件夹中。在此实例中提取 .zips 没有问题。
但是如果你用 John The Ripper.txt
尝试同样的事情,即在你的 CMD 中 python Program.py -z zipv1 -f "John The Ripper.txt"
(或你选择的 IDE),它会创建 Extracted
文件夹拉链;就像 John The Ripper Jr.txt
一样,但这次 Congrats.txt
出于某种未知原因对他们俩来说都是 空 。
我的代码和所有需要的文件如下:
import argparse
import multiprocessing
import zipfile
parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack.", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()
def extract_zip(zip_filename, password):
try:
with zipfile.ZipFile(zip_filename, 'r') as zip_file:
zip_file.extractall('Extracted', pwd=password)
print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
except:
# If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
pass
def main(zip, file):
if (zip == None) | (file == None):
# If the args are not used, it displays how to use them to the user.
print(parser.usage)
exit(0)
# Opens the word list/password list/dictionary in "read binary" mode.
txt_file = open(file, "rb")
# Allows 8 instances of Python to be ran simultaneously.
with multiprocessing.Pool(8) as pool:
# "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
if __name__ == '__main__':
# Program.py - z zipname.zip -f filename.txt
main(args.zip, args.file)
我不确定为什么会这样,而且在任何地方都找不到这个问题的答案。据我所知,它完全未知,而且我找不到调试或解决此问题的方法。
无论 word/password 列表如何,这种情况都会继续发生。尝试使用相同 Congrats.txt
但使用来自不同单词 lists/password lists/dictionaries 的不同密码生成更多 .zips。同样的方法;使用了更大和更小版本的 .txt 并获得了与上述相同的结果。
但是 我确实发现如果我删除 John The Ripper.txt
中的前 2k 个单词并制作一个新的 .txt;说 John The Ripper v2.txt
; .zip 成功解压缩,Extracted
文件夹出现并且 Congrats.txt
与其中的文本一起出现。所以我相信它与密码后的行有关。所以在这种情况下 Line 1968
; Line 1968
之后脚本不会停止的地方?我不确定为什么这样做。我想这不是解决方案,而是朝着解决方案迈出的一步...
编辑 2
所以我尝试使用 "pool terminating" 代码:
import argparse
import multiprocessing
import zipfile
parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack using", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()
def extract_zip(zip_filename, password, queue):
try:
with zipfile.ZipFile(zip_filename, "r") as zip_file:
zip_file.extractall('Extracted', pwd=password)
print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
queue.put("Done") # Signal success
except:
# If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
pass
def main(zip, file):
if (zip == None) | (file == None):
print(parser.usage) # If the args are not used, it displays how to use them to the user.
exit(0)
# Opens the word list/password list/dictionary in "read binary" mode.
txt_file = open(file, "rb")
# Create a Queue
manager = multiprocessing.Manager()
queue = manager.Queue()
with multiprocessing.Pool(8) as pool: # Allows 8 instances of Python to be ran simultaneously.
pool.starmap_async(extract_zip, [(zip, line.strip(), queue) for line in txt_file]) # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
pool.close()
queue.get(True) # Wait for a process to signal success
pool.terminate() # Terminate the pool
pool.join()
if __name__ == '__main__':
main(args.zip, args.file) # Program.py -z zip.zip -f file.txt.
现在,如果我使用它,两个 zip 都会成功提取,就像之前的实例一样。 但是这次zipv1.zip
的Congrats.txt
完好无损;里面有消息。但是关于 zipv2.zip
不能说同样的话,因为它仍然是空的。
抱歉让您久等了……看来您有点困惑了。
回顾:
- 正在处理受密码保护的 .zip 文件
- 尝试暴力破解 (ciobaneste),使用文件中的密码
- 正确的密码在(上一步)文件中,但尽管如此,某些文件仍未正确提取
1。调查
场景复杂(离MCVE我想说的很远),有很多东西可以为行为受到指责。
从 zipv1.zip / zipv2.zip 不匹配开始。仔细一看,似乎 zipv2 也被搞砸了 。如果很容易发现 zipv1(Congrats.txt 是唯一的文件),zipv2, "ZIP Contents/Black-Large.png" 正在 0 大小。
它可以用任何文件重现,甚至更多:它适用于 返回的 1st 条目(不是目录) zf.namelist.
所以,事情开始变得更清楚了:
- 正在解压文件内容,因为 dominique 存在于密码文件中(不知道到那时会发生什么)
- 稍后,.zip 的 1st 条目被截断为 0字节
查看尝试使用错误密码提取文件时抛出的异常,有 3 种类型(其中最后两种可以归为一组):
- 运行时错误:文件密码错误...
- 其他:
- zlib.error: 解压数据时出错-3 ...
- zipfile.BadZipFile:文件的 CRC-32 错误 ...
我创建了自己的存档文件。为了保持一致性,我将从现在开始使用它,但所有内容也适用于任何其他文件。
- 内容:
- DummyFile0.zip(10 字节)- 包含:0123456789
- DummyFile1.zip(10 字节)- 包含:0000000000
- DummyFile2.zip(10 字节)- 包含:AAAAAAAAAA
- 使用 Total Commander (9.21a) 内部 zip 加壳器归档了 3 个文件,使用 dominique 密码保护(zip2.0 加密)。生成的存档(命名为 arc0.zip(但名称不相关))是 392 字节长
code.py:
#!/usr/bin/env python3

import sys
import os
import zipfile


def main():
    arc_name = sys.argv[1] if len(sys.argv) > 1 else "./arc0.zip"
    pwds = [
        #b"dominique",
        #b"dickhead",
        b"coco",
    ]
    # Overrides the short list above with the full word list.
    with open("orig/John The Ripper.txt.orig", "rb") as pwd_file:
        pwds = [item.strip() for item in pwd_file.readlines()]
    print("Unpacking (password protected: dominique) {:s},"
          " using a list of predefined passwords ...".format(arc_name))
    if not os.path.isfile(arc_name):
        raise SystemExit("Archive file must exist!\nExiting.")
    faulty_pwds = list()
    good_pwds = list()
    with zipfile.ZipFile(arc_name, "r") as zip_file:
        print("Zip names: {:}\n".format(zip_file.namelist()))
        for idx, pwd in enumerate(pwds):
            try:
                zip_file.extractall("Extracted", pwd=pwd)
            except Exception:
                exc_cls, exc_inst, exc_tb = sys.exc_info()
                # RuntimeError is the plain "bad password" case; only the
                # other exception types are reported and recorded.
                if exc_cls != RuntimeError:
                    print("Exception caught when using password ({:d}): [{:}] ".format(idx, pwd))
                    print("    {:}: {:}".format(exc_cls, exc_inst))
                    faulty_pwds.append(pwd)
            else:
                print("Success using password ({:d}): [{:}] ".format(idx, pwd))
                good_pwds.append(pwd)
    print("\nFaulty passwords: {:}\nGood passwords: {:}".format(faulty_pwds, good_pwds))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
Output:
[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q054532010]> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py arc0.zip
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32

Unpacking (password protected: dominique) arc0.zip, using a list of predefined passwords ...
Zip names: ['DummyFile0.txt', 'DummyFile1.txt', 'DummyFile2.txt']

Exception caught when using password (1189): [b'mariah']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (1446): [b'zebra']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (1477): [b'1977']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Success using password (1967): [b'dominique']
Exception caught when using password (2122): [b'hank']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2694): [b'solomon']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid distance code
Exception caught when using password (2768): [b'target']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (2816): [b'trish']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2989): [b'coco']
    <class 'zlib.error'>: Error -3 while decompressing data: invalid stored block lengths

Faulty passwords: [b'mariah', b'zebra', b'1977', b'hank', b'solomon', b'target', b'trish', b'coco']
Good passwords: [b'dominique']
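Looking at the ZipFile.extractall code, it tries to extract all the members. The 1st one raises an exception, so it starts to become clearer why it behaves the way it does. But why does the behavior differ between 2 wrong passwords?
As the tracebacks of the 2 different exception types show, the answer lies in ZipFile.open.
After some more investigation, it turns out that the cause is a collision in the zip password check.
2. Collision determined by zip encryption weakness
According to [UT.CS]: dmitri-report-f15-16.pdf - Password-based encryption in ZIP files (the (last) emphasis is mine):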
3.1 Traditional PKWARE encryption
The original encryption scheme, commonly referred to as the PKZIP cipher, was designed by Roger Schaffely [1]. In [5] Biham and Kocher showed that the cipher is weak and demonstrated an attack requiring 13 bytes of plaintext. Further attacks have been developed, some of which require no user provided plaintext at all [6]. The PKZIP cipher is essentially a stream cipher, i.e. input is encrypted by generating a pseudo-random key stream and XOR-ing it with the plaintext. The internal state of the cipher consists of three 32-bit words: key0, key1 and key2. These are initialized to 0x12345678, 0x23456789 and 0x34567890, respectively. A core step of the algorithm involves updating the three keys using a single byte of input...
...
Before encrypting a file in the archive, 12 random bytes are first prepended to its compressed contents and the resulting bytestream is then encrypted. Upon decryption, the first 12 bytes need to be discarded. According to the specification, this is done in order to render a plaintext attack on the data ineffective. The specification also states that out of the 12 prepended bytes, only the first 11 are actually random, the last byte is equal to the high order byte of the CRC-32 of the uncompressed contents of the file. This gives the ability to quickly verify whether a given password is correct by comparing the last byte of the decrypted 12 byte header to the high order byte of the actual CRC-32 value that is included in the local file header. This can be done before decrypting the rest of the file.
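The algorithm's weakness: since the check is made on a single byte only, for 256 different (and carefully chosen) wrong passwords, at least one will generate the same number as the correct password. On average, that is about 1 wrong password in 256.
The algorithm discards most of the wrong passwords, but some slip through.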
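A back-of-the-envelope sketch of what a 1-byte check implies for a word list. It assumes that a wrong password decrypts the check byte to a uniformly distributed value, which is an idealization; both function names are made up for this example.

```python
def expected_false_positives(wrong_passwords):
    """Average number of wrong passwords that pass a 1-byte check."""
    return wrong_passwords / 256


def chance_of_any_false_positive(wrong_passwords):
    """Probability that at least one wrong password passes the check."""
    return 1 - (255 / 256) ** wrong_passwords


if __name__ == "__main__":
    for count in (256, 3000):
        print(count,
              round(expected_false_positives(count), 1),
              round(chance_of_any_false_positive(count), 3))
```

Under that assumption, a list of a few thousand words is expected to contain around ten such passwords, the same order of magnitude as the 8 faulty passwords in the output above.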
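Going back: when attempting to extract a file using a password:
- If the check byte (the last byte of the 12-byte header, as decrypted with that password) differs from the high order byte of the file's CRC, an exception is thrown
- But, if they are equal:
  - A new file stream is opened for writing (emptying the file if it already exists)
  - The decompression is attempted:
    - For a wrong password (that passed the above check), the decompression fails (but the file has already been emptied)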
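Given the sequence above, one way to keep existing files safe is to extract into a scratch directory first and copy the result over only when every member extracted cleanly. A minimal sketch, assuming Python 3.8+ (for dirs_exist_ok); extract_if_password_works is a hypothetical helper, not a zipfile API.

```python
import os
import shutil
import tempfile
import zipfile
import zlib


def extract_if_password_works(archive_path, dest_dir, pwd):
    """Extract into dest_dir only if every member decrypts cleanly.

    Hypothetical helper: members are first extracted into a scratch
    directory, so a password that passes the 1-byte check but fails later
    cannot truncate files that already exist in dest_dir.
    """
    with tempfile.TemporaryDirectory() as scratch:
        try:
            with zipfile.ZipFile(archive_path) as zip_file:
                zip_file.extractall(scratch, pwd=pwd)
        except (RuntimeError, zlib.error, zipfile.BadZipFile):
            return False
        # Every member was extracted and CRC-checked; publish the result.
        shutil.copytree(scratch, dest_dir, dirs_exist_ok=True)
    return True
```

The cost is one extra copy of the extracted data for the correct password; wrong passwords only ever touch the scratch directory.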
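As seen from the output above, for my (.zip) file there are 8 passwords that mess it up. Note that:
- Results differ for each archive file
- The member file name and content are relevant (at least for the 1st one). Changing either of them will yield different results (for the "same" archive file)
Here's a test based on the data from my .zip file: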
>>> import zipfile
>>>
>>> zd_coco = zipfile._ZipDecrypter(b"coco")
>>> zd_dominique = zipfile._ZipDecrypter(b"dominique")
>>> zd_other = zipfile._ZipDecrypter(b"other")
>>> cipher = b'\xd1\x86y ^\xd77gRzZ\xee'  # Member (1st) file cipher: 12 bytes starting from archive offset 44
>>>
>>> crc = 2793719750  # Member (1st) file CRC - archive bytes: 14 - 17
>>> hex(crc)
'0xa684c7c6'
>>> for zd in (zd_coco, zd_dominique, zd_other):
...     print(zd, [hex(zd(c)) for c in cipher])
...
<zipfile._ZipDecrypter object at 0x0000021E8DA2E0F0> ['0x1f', '0x58', '0x89', '0x29', '0x89', '0xe', '0x32', '0xe7', '0x2', '0x31', '0x70', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E160> ['0xa8', '0x3f', '0xa2', '0x56', '0x4c', '0x37', '0xbb', '0x60', '0xd3', '0x5e', '0x84', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E128> ['0xeb', '0x64', '0x36', '0xa3', '0xca', '0x46', '0x17', '0x1a', '0xfb', '0x6d', '0x6c', '0x4e']
>>> # As seen, the last element of the first 2 arrays (coco and dominique) is 0xA6 (166), which is the same as the first byte of the CRC
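The offsets used in the session above (44 for the header, 14 - 17 for the CRC) are specific to that archive. A sketch of how they could be located for any member, using the standard 30-byte local file header layout; encryption_header is a hypothetical helper, and the flag-bit-3 branch reflects my understanding of how the check byte is chosen for streamed entries, so treat it as an assumption.

```python
import struct
import zipfile

# Fixed part of a local file header: signature, version, flags, method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def encryption_header(archive_path, member):
    """Return (data offset, first 12 data bytes, expected check byte).

    The 12 bytes are the encryption header only if the member is actually
    encrypted with the traditional scheme.
    """
    with zipfile.ZipFile(archive_path) as zip_file:
        info = zip_file.getinfo(member)
    with open(archive_path, "rb") as raw:
        raw.seek(info.header_offset)
        fields = LOCAL_HEADER.unpack(raw.read(LOCAL_HEADER.size))
        if fields[0] != b"PK\x03\x04":
            raise ValueError("not a local file header")
        name_len, extra_len = fields[9], fields[10]
        data_offset = info.header_offset + LOCAL_HEADER.size + name_len + extra_len
        raw.seek(data_offset)
        header = raw.read(12)
    if info.flag_bits & 0x8:
        # Assumption: with a data descriptor, the check byte comes from the
        # high byte of the modification time instead of the CRC.
        check = (fields[4] >> 8) & 0xFF
    else:
        check = (info.CRC >> 24) & 0xFF
    return data_offset, header, check
```

For a first member named DummyFile0.txt with no extra field, the data offset works out to 30 + 14 = 44, matching the comment in the session above.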
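I ran some tests with other unpacking engines (using default parameters):
- WinRar: for a wrong password (one rejected by the check) the file is untouched, but for a faulty one (one that passes the check) the file is truncated (same as here)
- 7-Zip: asks the user whether to overwrite the file, and overwrites it regardless of the unpacking result
- Total Commander's internal (zip) unpacker: same as 7-Zip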
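3. Conclusion
- I consider this a zipfile bug. Specifying such a faulty (and wrong) password shouldn't overwrite the existing file (if any). Or at least, the behavior should be consistent (for all wrong passwords)
- A quick search didn't reveal any existing bug report for Python
- I don't see an easy fix, as:
  - The zip algorithm can't be improved (to better check whether a password is correct)
  - I thought of a couple of fixes, but they would either hurt performance or could introduce regressions in some (corner) cases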
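I've submitted [GitHub]: python/cpython - [3.6] bpo-36247: zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness), which was closed for branch 3.6 (as it's in security fixes only mode). I'm not sure what its outcome will be (on the other branches), but in any case it won't be available anytime soon (let's say in the next few months).
As an alternative, you could download the patch and apply the changes locally.
If you want to keep your Python installation pristine, you could copy zipfile.py from Python's directory to your project (or some "personal") directory and patch that copy.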
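I ran into this problem too, and after digging into it I found that the cause was the encryption method I had chosen for the zip file. I switched from ZipCrypto, the default offered by 7-Zip, to AES-256, and everything worked.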
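A small sketch for checking which encryption scheme an archive's members use. encryption_kind is a hypothetical helper; it relies on the encryption flag (bit 0), the strong-encryption flag (bit 6) and compression method 99, which WinZip-style AES entries use. As far as I know, the standard zipfile module can list AES members but cannot decrypt them, so reading such an archive from Python would need a third-party library; the post above does not say which tool was used.

```python
import zipfile


def encryption_kind(info):
    """Describe how a member is encrypted, based on its ZipInfo record."""
    if not info.flag_bits & 0x1:
        return "none"
    if info.compress_type == 99:
        return "AES (WinZip AE-x)"
    if info.flag_bits & 0x40:
        return "PKWARE strong encryption"
    return "ZipCrypto (traditional PKWARE)"


def report(archive_path):
    """Print the encryption kind of every member of an archive."""
    with zipfile.ZipFile(archive_path) as zip_file:
        for info in zip_file.infolist():
            print(info.filename, "->", encryption_kind(info))
```

Calling report("zipv1.zip") on the archive from the question should label its members as ZipCrypto if it was created with the default settings described above.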