Avoiding "warning: There are too many unreachable loose objects" during git svn clone/fetch

When running git svn clone / git svn fetch against a large Subversion repository (100k+ commits), the fetch regularly stops with:

Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
error: The last gc run reported the following. Please correct the root cause and remove .git/gc.log.
Automatic cleanup will not be performed until the file is removed.

warning: There are too many unreachable loose objects; run 'git prune' to remove them.

gc --auto: command returned error: 255

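To recover, I have to follow the instructions: run a more aggressive prune and gc, remove the log file, and continue. Then it happens again and again, each time after another batch of roughly 10k commits has been read.

How can this problem be avoided?

Answering my own question.

git svn operations are among those that launch the background gc --auto housekeeping operation. In this case, I think the continuing progress of git svn fetch can push the number of unreachable/loose objects over the auto threshold at some point during the gc operation, which produces this warning. Unfortunately, that is fatal for the ongoing fetch.

My solution was to temporarily disable/suspend these gc operations by deactivating automatic gc, as described in its man page: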

git config gc.auto 0

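Once the git svn fetch operation has completed, you can remove this configuration if you want, and run a manual full gc, prune and repack to optimize the final repository.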
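The disable / fetch / restore sequence described above could be wrapped roughly as follows. This is only a sketch: with_gc_disabled is a hypothetical helper name, not a git command, and it assumes gc.auto is managed in the repository-local configuration.

```shell
# Hypothetical helper: run a command with automatic gc suspended in the
# current repository, then restore the previous repository-local setting.
with_gc_disabled() {
    # Remember the local gc.auto value, if any (empty when it is unset).
    old=$(git config --local --get gc.auto || true)
    git config --local gc.auto 0
    "$@"
    status=$?
    if [ -n "$old" ]; then
        git config --local gc.auto "$old"
    else
        git config --local --unset gc.auto
    fi
    return $status
}

# Usage, from inside the git-svn clone:
#   with_gc_disabled git svn fetch
#   git gc --prune=now    # manual cleanup once the import has finished
```

Restoring the earlier value instead of always unsetting it keeps any threshold that was already configured for the repository.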

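I think that if you set the configuration option gc.pruneExpire to now, the message would be avoided, at least temporarily during the import. With that option set, git gc immediately removes all unreachable objects instead of only those that are at least two weeks old (the default). Combined with a reasonable value for gc.auto, this should keep them from accumulating to the point where you get that message.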
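As a sketch of that suggestion, run inside the repository being imported (the gc.auto value shown is simply git's documented default, not a tuned recommendation):

```shell
# Prune unreachable objects immediately instead of after the default two weeks.
git config gc.pruneExpire now

# Keep automatic gc enabled with an explicit loose-object threshold.
# 6700 is git's default; the best value for a large import is an assumption
# you would need to adjust yourself.
git config gc.auto 6700

# Once the import has finished, return to the default expiry.
git config --unset gc.pruneExpire
```

Note that pruning with an expiry of now can be risky while other processes are still writing objects to the same repository, so it is probably best limited to the duration of the import.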

I started seeing this warning on git pull:

$ git pull
remote: Enumerating objects: 22, done.
remote: Counting objects: 100% (22/22), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 22 (delta 17), reused 22 (delta 17), pack-reused 0
Unpacking objects: 100% (22/22), done.
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
error: The last gc run reported the following. Please correct the root cause
and remove .git/gc.log.
Automatic cleanup will not be performed until the file is removed.

warning: There are too many unreachable loose objects; run 'git prune' to remove them.

Already up-to-date.

Looking at the warning file, there is not much in it:

$ cat .git/gc.log
warning: There are too many unreachable loose objects; run 'git prune' to remove them.

Reading the help:

$ git help gc

It sounds like we should be doing some of this regularly.

Run the aggressive option that is recommended for occasional use:

$ git gc --aggressive
Counting objects: 41544, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (40544/40544), done.
Writing objects: 100% (41544/41544), done.
Total 41544 (delta 30536), reused 7801 (delta 0)
Removing duplicate objects: 100% (256/256), done.
Checking connectivity: 46959, done.

Remove the warning log:

$ rm .git/gc.log 

Smile

:)
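The recovery steps from this thread (prune, gc, then delete the log) could be collected as below. This is a sketch under assumptions: recover_from_gc_warning is a hypothetical helper name, and git rev-parse --git-dir is used only so that the log is found even when the command is run from a subdirectory.

```shell
# Hypothetical helper: clear the "too many unreachable loose objects" state.
recover_from_gc_warning() {
    # Remove unreachable loose objects, as the warning itself suggests.
    git prune &&
    # Repack thoroughly; this can take a while on large repositories.
    git gc --aggressive &&
    # Delete the stale log so that "gc --auto" is allowed to run again.
    rm -f "$(git rev-parse --git-dir)/gc.log"
}

# Usage, from anywhere inside the affected repository:
#   recover_from_gc_warning
```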

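Note that in a case like this one, where removing .git/gc.log is the solution, that removal has become easier.

With Git 2.34 (Q4 2021), the pathname in the advice message is ready to be cut and pasted.

See commit b45c172 (31 Aug 2021) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 02d2632, 10 Sep 2021)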

gc: remove trailing dot from "gc.log" line

Signed-off-by: Ævar Arnfjörð Bjarmason
Suggested-by: Jan Judas

Remove the trailing dot from the warning we emit about gc.log.
It's common for various terminal UX's to allow the user to select "words", and by including the trailing dot a user wanting to select the path to gc.log will need to manually remove the trailing dot.

Such a user would also probably need to adjust the path if it e.g. had spaces in it, but this should address this very common case.