Reading contents of zip file without extracting

An example of what I am trying to achieve:

My text file (test1.txt) contains the following two lines:

John scored 80 in english

tim scored 75 in english

I have zipped this file as test1.zip, and I am trying to read its contents with the following code:

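import zipfile

f = 'test1.zip'
with zipfile.ZipFile(f, "r") as z:
    # Read each archive member directly, without extracting it to disk
    for name in z.namelist():
        with z.open(name) as f1:
            # z.open() yields a binary stream, so each line is a bytes object
            for line in f1.readlines():
                print(line)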

But the output I get is:

b'John scored 80 in english\r\n'

b'tim scored 75 in english\r\n'

How can I read the contents of this zip file so that the output matches the contents of the original file, i.e.:

John scored 80 in english

tim scored 75 in english

You are actually reading exactly what is in the file.

The \r\n characters are the Windows line ending. The question Difference between \n and \r? goes into more detail, but it boils down to Windows using \r\n as its newline.

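The b' prefix you see comes from Python and how it reads the file: z.open() returns a binary file object, so every line is a bytes object rather than a str. The question What does the 'b' character do in front of a string literal? explains well why this happens; the documentation it quotes says: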

Bytes literals are always prefixed with 'b' or 'B'; they produce an instance of the bytes type instead of the str type. They may only contain ASCII characters; bytes with a numeric value of 128 or greater must be expressed with escapes.
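To make the quoted passage concrete, here is a minimal sketch of the difference between the bytes value that comes out of the archive and the str obtained by decoding it. The variable names raw and text are only illustrative, and ASCII is assumed as the encoding because the sample file contains only ASCII characters.

```python
# One line as it comes out of the zip member: a bytes object.
raw = b"John scored 80 in english\r\n"

# Decoding turns the bytes into a str (ASCII is an assumption here).
text = raw.decode("ascii")

# repr() shows the b prefix for bytes and no prefix for str.
print(type(raw).__name__, repr(raw))
print(type(text).__name__, repr(text))
```

The trailing \r\n is still present after decoding; only the type changes.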

Edit: I actually found a very similar answer that shows how to read the file without the extra characters: py3k: How do you read a file inside a zip file as text, not bytes?. The basic idea is that you can use this:

items_file = io.TextIOWrapper(items_file, encoding='your-encoding', newline='')
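As a rough sketch of how that wrapper could be applied to every member of an archive, something like the following should work. The helper name read_zip_text and the default encoding of utf-8 are assumptions, not part of the linked answer. Note one deliberate difference: newline is left at its default here, so \r\n is translated to \n on read, whereas newline='' would keep the \r\n untouched.

```python
import io
import zipfile


def read_zip_text(zip_path, encoding="utf-8"):
    """Return {member name: list of text lines} without extracting to disk.

    Hypothetical helper; zip_path may be a file name or a file-like object.
    """
    result = {}
    with zipfile.ZipFile(zip_path, "r") as z:
        for name in z.namelist():
            # TextIOWrapper decodes the binary stream to str. With the
            # default newline=None, '\r\n' is translated to '\n' on read.
            with io.TextIOWrapper(z.open(name), encoding=encoding) as text:
                result[name] = [line.rstrip("\n") for line in text]
    return result
```

Calling read_zip_text('test1.zip') and printing each returned line should then give plain text without the b prefix or the trailing line ending.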

Use print(line.decode('ascii').strip()) instead of print(line).
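A self-contained sketch of that approach follows. It builds a small archive in memory so it can run without test1.zip on disk; the in-memory buffer and the decoded list are illustrative additions, and ASCII is assumed because the sample text contains nothing else.

```python
import io
import zipfile

# Build a stand-in for test1.zip in memory (assumption: same two lines,
# saved with Windows line endings).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr(
        "test1.txt",
        b"John scored 80 in english\r\ntim scored 75 in english\r\n",
    )
buf.seek(0)

decoded = []
with zipfile.ZipFile(buf, "r") as z:
    for name in z.namelist():
        with z.open(name) as f1:
            for line in f1:
                # decode() turns bytes into str; strip() drops the '\r\n'.
                decoded.append(line.decode("ascii").strip())

for line in decoded:
    print(line)
```

Keep in mind that strip() also removes leading and trailing spaces; rstrip('\r\n') is the narrower choice if that whitespace matters.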