If a git fetch is cancelled half way will it resume?

Sometimes fetching a git repository (by running "git fetch repository_URL") can take hours, depending on the size of the repository and the network speed.

If for some reason the user cancels the fetch partway through, and later tries to fetch the same repository again in exactly the same environment, how will the fetch behave?

Will it resume the fetch from the point where it was interrupted?

No (2015), or possibly soon (Q4 2018): the git clone/fetch/pull operations have no "resume" capability.
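
For illustration, here is what that means in practice; the repository URL is hypothetical and the progress figures are only indicative:

git fetch https://example.com/huge-repo.git
# ... interrupted with CTRL-C at "Receiving objects: 40%" ...
git fetch https://example.com/huge-repo.git
# starts over at "Receiving objects: 0%"; the partially received pack is discarded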

Since then:

  • Q4 2018.

2015:

The only alternative, mentioned in this thread, is gitolite (a Perl script that manages ACLs, the access control levels of your repositories, and provides other utilities around git access).

gitolite can be configured to maintain a "Git bundle" (see the git-bundle manual), which can then be made downloadable over rsync or HTTP, and downloaded with an rsync client or an HTTP client that supports resuming.

Using this technique makes the "download everything" and "make a repo out of the downloaded stuff" steps distinct, and the first step can be retried any number of times.
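
A minimal sketch of that setup, with hypothetical paths and host names (the gitolite-specific configuration is omitted):

# Server side: (re)generate a bundle containing all refs of the repository
git --git-dir=/srv/git/project.git bundle create /srv/bundles/project.bundle --all

# Client side: rsync keeps the partial file, so a rerun resumes the download
rsync --partial --progress user@server.example.com:/srv/bundles/project.bundle .
git clone project.bundle project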

The downsides are obvious:

  1. This requires special setup on the server side.
  2. It's unclear what happens if someone manages to update the repository while someone else is downloading its bundle, or if an update happens between adjacent download attempts.
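
For downside 2, one partial mitigation (a sketch, not part of the original thread) is to verify the downloaded file before building a repo from it; a truncated or otherwise damaged bundle fails verification:

# a bundle damaged by a concurrent update or an incomplete download fails here
git bundle verify project.bundle || echo "bundle is corrupt; download it again" >&2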

Regarding a resumable capability for git clone/fetch (asked in "How to complete a git clone for a big project on an unstable connection?"), there was a discussion (March 2016) on the git mailing list.

It mentions:
  • One approach is to have the server generate bundles, which can be downloaded (and resumed with wget -c!) and then added to the local repository (since a bundle is one file you can clone from, as if it were a git repository).
    See "Cloning Linux from a bundle".

That is:

wget -c https://cdn.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/clone.bundle
git bundle verify clone.bundle
...
clone.bundle is okay
git clone clone.bundle linux
# Now, point the origin to the live git repository and get the latest changes:
cd linux
git remote remove origin
git remote add origin https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git pull origin master

We could implement resumable clone by making a bit of a hybrid of the smart and dumb HTTP protocol.

  1. A git clone eventually calls into the transport layer, and git-remote-curl will probe for the info/clone URL; if the resource fails to load, everything goes through the traditional codepath.

  2. When git-remote-curl detects the support of dumb clone, it does the "retry until successfully download the pack data fully" dance internally, tentatively updates the remote tracking refs, and then pretends as if it was asked to do an incremental fetch. If this succeeds without any die(), everybody is happy. (A sketch of this retry dance follows the list.)

  3. If the above step 2. has to die() for some reason (including impatience hitting CTRL-C), leave the $GIT_DIR, the downloaded .info file, and the partially downloaded .pack file in place.
    Tell the user that the cloning can be resumed, and how.
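
As a rough illustration of the "retry until fully downloaded" dance of step 2, assuming a hypothetical server that exposes the pack data as a plain HTTP resource under $URL:

# keep retrying; curl -C - resumes from the bytes already on disk
until curl -f -C - -o clone.pack "$URL/info/clone.pack"; do
    echo "transfer interrupted, resuming..." >&2
    sleep 5
done
git index-pack clone.pack   # build the .idx so the downloaded objects become usable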

Note that this is a resumable clone, not a resumable fetch:

the initial "clone" and subsequent incremental "fetch" are orthogonal issues.

Because the proposed update to "clone" has much larger payoff than the proposed change to "fetch", i.e.

  • The amount of data transferred is much larger, hence the chance of the network timing out in a poor network environment is much higher, and the need for resuming is much greater.
  • Not only does the approach make "clone" resumable, helping clients; it also helps the server offload the bulk transfer to a CDN.

and it does much less damage to the existing code, i.e.

  • We do not have to pessimize the packing process only to discard the bulk of the bytes that were generated, as the proposed approach for "fetch" would.
  • The area where new code is needed is well isolated, and the switch to the new protocol happens very early in the exchange, without sharing code with the existing codepath; these properties make it less risky to introduce a regression.

To avoid baking HTTP-only capabilities into the new protocol, a "v2" protocol was proposed in which both sides exchange capabilities before the ref advertisement. The client then sees the server's resumable URL and knows whether to proceed with the ref advertisement.
See stefanbeller/gitprotocol2-10 (July 2017).
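
Protocol v2 has since shipped in Git (2.18 and later). As a quick way to observe the capability exchange happening before the ref advertisement, packet tracing can be used:

# the trace shows "version 2" and the server's capability list (ls-refs, fetch, ...)
# arriving before any refs are advertised
GIT_TRACE_PACKET=1 git -c protocol.version=2 ls-remote https://github.com/git/git.git 2>&1 | head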