How to permanently delete a commit in git (annex)?

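I've started using datalad, which is a wrapper around git annex, for data version control and expiry in my lab. It works well, except that the .git folder can silently grow large, especially when I go back and forth in the git history repeating certain steps. For example, sometimes I make a commit, realize I need to fix something, roll it back with git reset HEAD~, and then make further commits from there. This orphans the commit that used to be HEAD, so it no longer shows up in git log, but all of its associated files are still in the annex, and you can still git show it if you have the commit sha. How can I permanently delete these orphaned commits so that they and their associated files stop taking up disk space? I tried git gc --prune=now --aggressive, but it didn't seem to do anything.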

For example:

datalad create test
cd test
# create new branch
git branch tmp
git checkout tmp
# build up a git history to play with
echo a > f
datalad save -m a
datalad run -i . -o . bash -c "echo aa > f"
datalad run -i . -o . bash -c "echo aaa > f"
# cat all annexed files (where symlinks point)
find .git/annex/objects -type f | xargs -I{} cat {}
# prints out:
# a
# aaa
# aa
# remove last 2 commits
git reset --hard HEAD~2
# make another commit from 2 commits ago
datalad run -i . -o . bash -c "echo b > f"
# print out git annex'd files again
find .git/annex/objects -type f | xargs -I{} cat {}
# should print
# a
# aaa
# b
# aa
# everything is still there, despite the git reset --hard
git checkout master
git branch -D tmp
git gc --prune=now --aggressive
# check what's there again
find .git/annex/objects -type f | xargs -I{} cat {}
# everything is still in the annex, even after deleting the branch and running git gc!

Solved on Neurostars: https://neurostars.org/t/how-to-permanently-delete-a-commit-in-git-annex/18235

git gc takes care of removing the commits from .git/objects, but annexed files under .git/annex/objects do indeed persist. For annexed files, you can use git annex unused to find annexed files that are no longer used in the refs you specify (so you could, e.g., drop data for intermediate steps between tagged "releases") and then use git annex drop --unused. Note that the git-annex branch would still keep a record of them in its history. So if you do this thousands of times, it might not be a complete solution, and you might complement it with git annex forget to forget the history of the annex entirely.
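
For anyone landing here later, the steps in that answer can be put together roughly as below. This is only a sketch, not something taken from the Neurostars thread: the function name prune_orphaned_annex is made up for illustration, and the reflog expiry step is my own addition, on the assumption that the reset commits are still referenced from the reflog (which would explain why git gc alone appeared to do nothing). The --force flag is also an assumption about the setup: it is needed when no other copy of the content exists, and it deletes data for good.

```shell
# Hypothetical helper (the name is an assumption, not part of git or git-annex):
# free annexed content that no branch or tag references any more.
prune_orphaned_annex() {
    repo="${1:-.}"

    # Refuse to run outside a git repository.
    git -C "$repo" rev-parse --git-dir >/dev/null 2>&1 || {
        echo "not a git repository: $repo" >&2
        return 1
    }

    # 1. Expire the reflog so orphaned commits become unreachable,
    #    then let git gc prune them from .git/objects.
    git -C "$repo" reflog expire --expire=now --all &&
    git -C "$repo" gc --prune=now &&

    # 2. List annexed keys that are not used by any branch or tag.
    git -C "$repo" annex unused &&

    # 3. Drop the keys found by the last 'unused' run.
    #    --force skips the copy-count check; the content is gone afterwards.
    git -C "$repo" annex drop --unused --force
}
```

In the example from the question this would be called as prune_orphaned_annex test once the tmp branch has been reset or deleted. If the git-annex branch history itself grows too large over time, git annex forget can be run afterwards, as the answer suggests.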