How to rewrite git history to remove a modification to a file

Question

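I have a problem similar to this one, but now that filter-repo is available, I wonder whether there is a better approach.

I have a large repo with some problematic commits that I would like to clean up by rewriting history. (I will not be pushing back to origin: this will become the new 'master' repo, and the original will be kept permanently as-is, in read-only mode.)

In a number of commits, a file was replaced with a large binary. For each of these there is a corresponding later commit that fixes the problem by reinstating the non-binary file.

Given a set of such commit pairs, I can imagine fixing the commits by hand with rebase -i. But there are a lot of commits, and I would like a scriptable solution. Can filter-repo be used to do this? I can imagine using --commit-callback and checking the filename in file_changes, but I would also need to check the size to determine whether a given commit is one of the problematic ones.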

git filter-repo --commit-callback '
commit.file_changes = [ c for c in commit.file_changes
                        if not (c.filename == b"myfilename" and
                               <somehow check size of blob here>) ]
'

Thanks

Answer 1

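You can, as in this issue, write a Python program like black_history.py which would:

- call filter-repo
- with a commit callback
- which has the ability to:
  - check the file name of the changed content
  - dump the right blob to disk, where you can check its size

That is: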

    for change in commit.file_changes:
        if change.blob_id in blobs_handled:
            change.blob_id = blobs_handled[change.blob_id]
        elif change.filename.endswith(b".py"):
            # change.blob_id is None for deleted files (ex: change.type=b'D')
            if change.blob_id is None:
                assert change.type == b"D"
                continue
            # Get the old blob contents
            cat_file_process.stdin.write(change.blob_id + b"\n")
            cat_file_process.stdin.flush()
            objhash, objtype, objsize = cat_file_process.stdout.readline().split()

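If objsize is too large, you would remove that change (and with it the blob) from the current commit's file changes. Note that objsize is read from the pipe as bytes, so convert it with int() before comparing it to a threshold.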
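Putting the pieces together, the following is a minimal sketch of such a script, not a tested recipe. It assumes git_filter_repo.py is importable, that change.blob_id is an object name git cat-file can resolve (as the excerpt above also assumes), and that only sizes are needed, so it uses git cat-file --batch-check instead of --batch. TARGET_FILENAME, MAX_BLOB_SIZE, keep_change, parse_batch_check_line and main are hypothetical names chosen for illustration.

```python
import subprocess

# Hypothetical values for illustration; adjust to your repository.
TARGET_FILENAME = b"myfilename"
MAX_BLOB_SIZE = 1_000_000  # bytes


def parse_batch_check_line(line):
    """Parse one `git cat-file --batch-check` reply: b"<hash> <type> <size>"."""
    fields = line.split()
    if len(fields) != 3:
        # For example b"<hash> missing" when the object does not exist.
        raise ValueError("unexpected cat-file reply: %r" % line)
    objhash, objtype, objsize = fields
    return objhash, objtype, int(objsize)


def keep_change(change, size_of):
    """Return False for a change that puts an oversized blob at TARGET_FILENAME."""
    if change.filename != TARGET_FILENAME:
        return True
    if change.blob_id is None:  # deletions carry no blob
        return True
    return size_of(change.blob_id) <= MAX_BLOB_SIZE


def main():
    # Imported here so the helpers above can be used without filter-repo.
    import git_filter_repo as fr

    cat_file = subprocess.Popen(
        ["git", "cat-file", "--batch-check"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    sizes = {}  # cache: blob id -> size in bytes

    def size_of(blob_id):
        if blob_id not in sizes:
            cat_file.stdin.write(blob_id + b"\n")
            cat_file.stdin.flush()
            reply = cat_file.stdout.readline()
            sizes[blob_id] = parse_batch_check_line(reply)[2]
        return sizes[blob_id]

    def commit_callback(commit, metadata):
        # Drop only the oversized modifications of the target file.
        commit.file_changes = [c for c in commit.file_changes
                               if keep_change(c, size_of)]

    # Default options leave filter-repo's fresh-clone safety check enabled.
    args = fr.FilteringOptions.default_options()
    fr.RepoFilter(args, commit_callback=commit_callback).run()
    cat_file.stdin.close()
    cat_file.wait()
```

To try it, call main() from the top level of a fresh clone. Keeping the size test in keep_change, separate from the cat-file plumbing, lets you check the decision logic on fake change objects before rewriting anything. Whether the later "fix" commits end up empty after the rewrite depends on your history, so inspect the result before adopting it as the new repo.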

git

version-control

rebase