The `File is not a zip file` error for the output of `git show` by GitPython

Script to reproduce the problem

Save this code as a shell script and run it. It should report the `File is not a zip file` error.

#!/bin/bash

set -eu

mkdir foo
cd foo

pip install --user GitPython

echo foo > a
zip a.zip a

# -t option validates the zip file.
# See https://unix.stackexchange.com/questions/197127/test-integrity-of-zip-file
unzip -t a.zip

git init
git add a.zip
git commit -m 'init commit'

cat << EOF > test.py
from git import Repo
import zipfile
from io import StringIO

repo = Repo('.', search_parent_directories=True)

raw = repo.git.show("HEAD:a.zip")

z = zipfile.ZipFile(StringIO(raw), "r")
EOF

python3 test.py

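Original question

I'm writing a Krita plugin to view files from earlier commits of a Git repository, and I want to get the thumbnail of a Krita file. To do that, I'm trying to get the file with `git show`, unzip it (a Krita file is a Zip file), and then extract preview.png or mergedimage.png.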

%unzip image.kra
Archive:  image.kra
 extracting: mimetype                
  inflating: maindoc.xml             
  inflating: documentinfo.xml        
  inflating: preview.png             
  inflating: image/layers/layer2     
  inflating: image/layers/layer2.defaultpixel  
  inflating: image/layers/layer2.icc  
  inflating: image/annotations/icc   
  inflating: mergedimage.png         
  inflating: image/animation/index.xml  

We can get the .kra file from the Git repository as `str` with GitPython. However, I can't parse the file with `zipfile.ZipFile`, as it says `File is not a zip file`. (This code is based on this SO answer.)

from git import Repo
import zipfile
from io import StringIO

repo = Repo('.', search_parent_directories=True)

raw = repo.git.show("HEAD~:image.kra")

z = zipfile.ZipFile(StringIO(raw), "r")

This raises:

Traceback (most recent call last):
  File "/home/hiroki/krita_question/test.py", line 11, in <module>
    z = zipfile.ZipFile(StringIO(raw), "r")
  File "/usr/lib/python3.9/zipfile.py", line 1257, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.9/zipfile.py", line 1324, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

I think this is a valid Krita file, because I can restore it with `git show` on the command line. That is,

%git show HEAD~:image.kra > prev.kra
%krita prev.kra

works fine. Unzipping the restored file works too.

Why can't I parse the `git show` output as a Zip file?

git log --stat|grep -v 'Author':

commit b96d915862b39a204a9f4350e7e56634b6fcfe0b
Date:   Wed Mar 30 14:44:02 2022 +0900

    chore: add

 ls | 231 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 231 insertions(+)

commit 619984a842c6c2daf31559c1979f91227a323648
Date:   Wed Mar 30 14:43:58 2022 +0900

    chore: add

 image.kra | Bin 0 -> 777685 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

Versions

Python: 3.9.9

GitPython: 3.1.27

Krita: 5.0.2

Linux 5.15.16-gentoo

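All files were created on Linux.

Update

GitPython version 3.1.28 (not yet released) should add a `strip_newline_in_stdout` option. If the option is set to `False`, the trailing `\n` of the stdout of any command run through `repo.git.foobar` is preserved.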

raw = repo.git.show("HEAD~:image.kra", strip_newline_in_stdout=False)
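A minimal sketch of how that option might be used end to end, assuming GitPython 3.1.28 or later. It also passes `stdout_as_string=False`, which, as far as I understand `Git.execute`, makes the call return `bytes` rather than `str`, so the data can go straight into `BytesIO`. The helper name `open_zip_from_show` is hypothetical; it takes the `show` callable as a parameter so the logic can be tried without a repository.

```python
import io
import zipfile


def open_zip_from_show(show, rev_path):
    """Open the blob at rev_path (e.g. "HEAD~:image.kra") as a ZipFile.

    `show` is expected to behave like GitPython's `repo.git.show`
    (assumption: GitPython >= 3.1.28, which accepts both keyword options).
    """
    raw = show(
        rev_path,
        stdout_as_string=False,         # ask for bytes instead of a decoded str
        strip_newline_in_stdout=False,  # keep a trailing "\n" if there is one
    )
    return zipfile.ZipFile(io.BytesIO(raw), "r")


# Intended usage (not executed here; needs GitPython and a repository):
#   from git import Repo
#   repo = Repo(".", search_parent_directories=True)
#   z = open_zip_from_show(repo.git.show, "HEAD~:image.kra")
#   print(z.namelist())
```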

Original answer

It seems this is caused by a bug in GitPython. It truncates the last `\n` of the `git show` output, which makes the file invalid.
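To illustrate why losing a single trailing byte matters, here is a small stdlib-only sketch. It builds an archive in memory and drops its final byte. It does not involve GitPython, and whether the last byte of a particular archive really is `\n` depends on the file, so treat it as an illustration of the failure mode rather than a reproduction of the bug. The helper name `can_open` is made up for this example.

```python
import io
import zipfile

# Build a small, valid archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a", "foo\n")
data = buf.getvalue()


def can_open(blob):
    """Return True if zipfile accepts blob as an archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(blob), "r"):
            return True
    except zipfile.BadZipFile:
        return False


intact_ok = can_open(data)          # the full archive
truncated_ok = can_open(data[:-1])  # the same archive minus its last byte
print(intact_ok, truncated_ok)
```

The end-of-archive record sits at the very end of a Zip file, so cutting even one byte from the tail leaves `zipfile` unable to read a complete record.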

I changed the code to use `subprocess.Popen`, and `ZipFile` succeeded.

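import zipfile
from io import BytesIO
import subprocess

# Run git directly so stdout is captured as unmodified bytes.
p = subprocess.Popen(["git", "show", "HEAD:a.zip"], stdout=subprocess.PIPE)

out, _ = p.communicate()

# ZipFile needs a binary file-like object, hence BytesIO.
z = zipfile.ZipFile(BytesIO(out), "r")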
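
Building on the `subprocess` approach, the thumbnail the question is after could be pulled out along these lines. This is only a sketch: the helper names `git_show_bytes` and `read_zip_member` are made up for illustration, and it uses `subprocess.run` instead of `Popen` purely for brevity.

```python
import io
import subprocess
import zipfile


def git_show_bytes(rev_path):
    """Return the raw bytes printed by `git show <rev_path>`."""
    result = subprocess.run(
        ["git", "show", rev_path],
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if git exits non-zero
    )
    return result.stdout


def read_zip_member(blob, member):
    """Return the bytes of one member of a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(blob), "r") as z:
        return z.read(member)


# Intended usage (needs a repository that contains image.kra):
#   png = read_zip_member(git_show_bytes("HEAD~:image.kra"), "preview.png")
#   with open("preview.png", "wb") as f:
#       f.write(png)
```

Keeping the git call and the zip handling in separate functions means the zip part can be exercised with any in-memory archive, without a repository.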