What is the difference between "git submodule foreach git pull origin master" and "git pull origin master --recurse-submodules"

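I have a dotfiles repository in which all of my vim plugins are stored as submodules, so they are easy to update when they change. I thought these two commands did the same thing, but I have found that this must not be the case.

I knew I had updates to pull down in several submodules, so I ran git pull origin master --recurse-submodules from the root of the parent repository. It appeared to iterate over each submodule, but it only fetched updates from their origin repositories.

When I ran git submodule foreach git pull origin master, it actually ran git pull origin master inside each submodule, doing both the fetch and the merge.

What is the point of using --recurse-submodules? I am a bit confused about what it is actually trying to do, and what I found through Google was somewhat cryptic. I figured you smart people might have a simpler explanation.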

What is the point of using --recurse-submodules?

--recurse-submodules acts on submodules nested inside a submodule (it really is recursive). git submodule foreach git pull origin master does not: it only visits the direct submodules.
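To make the "direct submodules only" point concrete, here is a minimal sandbox sketch. It assumes a reasonably recent Git and builds three throwaway repositories whose names (inner, middle, super, clone) are purely hypothetical. It also uses the --recursive flag of git submodule foreach, which is the way to make foreach descend into nested submodules as well.

```shell
#!/bin/sh
# Sketch: plain vs. recursive "git submodule foreach" on nested submodules.
# All repository names (inner, middle, super, clone) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Recent Git versions reject local-path submodule URLs unless file transport is allowed.
g() { git -c protocol.file.allow=always "$@"; }

# Create a repository with a single commit on a branch named master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "$1" > "$1/README"
    git -C "$1" add README
    git -C "$1" commit -q -m "initial commit in $1"
}

# Layout: super contains middle, and middle contains inner.
new_repo inner
new_repo middle
g -C middle submodule --quiet add "$work/inner" inner
git -C middle commit -q -m "add inner as a submodule"
new_repo super
g -C super submodule --quiet add "$work/middle" middle
git -C super commit -q -m "add middle as a submodule"
g clone -q --recurse-submodules "$work/super" clone

cd clone
# Without --recursive, only the direct submodule (middle) is visited.
direct=$(git submodule --quiet foreach 'echo visited' | wc -l | tr -d ' ')
# With --recursive, the nested submodule (middle/inner) is visited too.
nested=$(git submodule --quiet foreach --recursive 'echo visited' | wc -l | tr -d ' ')
echo "direct only: $direct, with --recursive: $nested"
```

In this layout the first count should be 1 and the second 2, so git submodule foreach --recursive git pull origin master would be the foreach variant that reaches nested submodules.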

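The option is mostly meant for fetching all submodule commits, not for pulling one specific branch the way the main module does, for reasons detailed in the following two commits:
(Note that a bug was fixed in Git 2.11; see the end of this answer)

For git pull, this option was introduced in commit 7dce19d (Nov. 2010, git 1.7.4-rc0):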

fetch/pull: Add the --recurse-submodules option

Until now you had to call "git submodule update" (without -N|--no-fetch option) or something like "git submodule foreach git fetch" to fetch new commits in populated submodules from their remote.

This could lead to "(commits not present)" messages in the output of "git diff --submodule" (which is used by "git gui" and "gitk") after fetching or pulling new commits in the superproject and is an obstacle for implementing recursive checkout of submodules.
Also "git submodule update" cannot fetch changes when disconnected, so it was very easy to forget to fetch the submodule changes before disconnecting only to discover later that they are needed.

This patch adds the "--recurse-submodules" option to recursively fetch each populated submodule from the url configured in the .git/config of the submodule at the end of each "git fetch" or during "git pull" in the superproject. The submodule paths are taken from the index.


Commit 88a2197 (March 2011, git 1.7.5-rc1) explains a bit more:

fetch/pull: recurse into submodules when necessary

To be able to access all commits of populated submodules referenced by the superproject, it is sufficient to only then let "git fetch" recurse into a submodule when the new commits fetched in the superproject record new commits for it.

  • Having these commits present is extremely useful when using the "--submodule" option to "git diff" (which is what "git gui" and "gitk" do since 1.6.6), as all submodule commits needed for creating a descriptive output can be accessed.
  • Also merging submodule commits (added in 1.7.3) depends on the submodule commits in question being present to work.
  • Last but not least this enables disconnected operation when using submodules, as all commits necessary for a successful "git submodule update -N" will have been fetched automatically.

So we choose this mode as the default for fetch and pull.
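The fetch-only behaviour described in these commit messages can be observed in a sandbox. The following is a sketch under stated assumptions, not a description of Git internals: the repository names (sub, super, clone) are hypothetical, and it only checks that after git fetch --recurse-submodules the new submodule commit is present in the submodule's object store while the submodule's checked-out commit has not moved.

```shell
#!/bin/sh
# Sketch: "git fetch --recurse-submodules" downloads submodule commits
# but does not merge or check them out. Repository names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Recent Git versions reject local-path submodule URLs unless file transport is allowed.
g() { git -c protocol.file.allow=always "$@"; }

# Create a repository with a single commit on a branch named master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "$1" > "$1/README"
    git -C "$1" add README
    git -C "$1" commit -q -m "initial commit in $1"
}

new_repo sub
new_repo super
g -C super submodule --quiet add "$work/sub" sub
git -C super commit -q -m "add sub as a submodule"
g clone -q --recurse-submodules "$work/super" clone
old=$(git -C clone/sub rev-parse HEAD)

# Upstream: a new commit in sub, then recorded in the superproject.
echo change >> sub/README
git -C sub commit -q -am "second commit in sub"
new=$(git -C sub rev-parse HEAD)
g -C super/sub pull -q --ff-only origin master
git -C super add sub
git -C super commit -q -m "record new sub commit"

# In the clone: fetch only, recursing into the populated submodule.
g -C clone fetch -q --recurse-submodules

# The new commit object is now available inside the submodule...
if git -C clone/sub cat-file -e "$new^{commit}" 2>/dev/null; then fetched=yes; else fetched=no; fi
# ...but the submodule's checked-out commit is unchanged.
head_after=$(git -C clone/sub rev-parse HEAD)
echo "new commit fetched: $fetched"
if [ "$head_after" = "$old" ]; then echo "submodule HEAD did not move"; fi
```

This matches what the question observed: the recursion brings the commits in, and moving the submodule work tree is left to a separate step such as git submodule update.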


git pull origin master --recurse-submodules 
git submodule foreach git pull origin master

The first one should do a pull, not just a fetch, and so be equivalent to the second one. Maybe this is a parameter order issue:

git pull --recurse-submodules origin master 

However, this is not the recommended way to update submodules against a given branch: see the next section.


Note that the right way to actually pull from master is to register the master branch with the submodule, making that submodule track master:

git config -f .gitmodules submodule.<path>.branch <branch>

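Then a simple git submodule update --remote --recursive is enough.
The branch to fetch/pull is recorded in the parent repository (in the .gitmodules file), so you do not even have to remember which branch you want to update your submodules against.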
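As an illustration of that workflow, here is a minimal sandbox sketch. The repository names (sub, super, clone) are hypothetical, and the configuration key is written with the submodule's name, which defaults to its path (here both are "sub").

```shell
#!/bin/sh
# Sketch: register a branch in .gitmodules, then follow it with
# "git submodule update --remote". Repository names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Recent Git versions reject local-path submodule URLs unless file transport is allowed.
g() { git -c protocol.file.allow=always "$@"; }

# Create a repository with a single commit on a branch named master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "$1" > "$1/README"
    git -C "$1" add README
    git -C "$1" commit -q -m "initial commit in $1"
}

new_repo sub
new_repo super
g -C super submodule --quiet add "$work/sub" sub
git -C super commit -q -m "add sub as a submodule"
g clone -q --recurse-submodules "$work/super" clone

# Register the branch to follow in .gitmodules.
git -C clone config -f .gitmodules submodule.sub.branch master
branch=$(git -C clone config -f .gitmodules --get submodule.sub.branch)

# Upstream gains a new commit on master.
echo change >> sub/README
git -C sub commit -q -am "second commit in sub"
new=$(git -C sub rev-parse HEAD)

# Fetch and check out the tip of the registered branch in each submodule.
g -C clone submodule --quiet update --remote --recursive
head_after=$(git -C clone/sub rev-parse HEAD)
echo "tracking branch: $branch"
if [ "$head_after" = "$new" ]; then echo "submodule is at the upstream master tip"; fi
```

After such an update the superproject sees the submodule as modified, so the new submodule commit would still need to be added and committed in the parent repository to be recorded.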


Update: Git 2.11 (Q4 2016)

Having a submodule whose ".git" repository is somehow corrupt caused a few commands that recurse into submodules loop forever.

See commit 10f5c52 (01 Sep 2016) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit 293c232, 12 Sep 2016)

That 2016 commit was extended with Git 2.21 (Q1 2019): "git fetch --recurse-submodules" could fail to fetch the necessary commit that is bound to the superproject, which has been corrected.

See commit be76c21 (06 Dec 2018), and commit a62387b, commit 26f80cc, commit d5498e0, commit bcd7337, commit 16dd6fe, commit 08a297b, commit 25e3d28, commit 161b1cf (28 Nov 2018) by Stefan Beller (stefanbeller).
(Merged by Junio C Hamano -- gitster -- in commit 5d3635d, 29 Jan 2019)

submodule.c: fetch in submodules git directory instead of in worktree

Signed-off-by: Stefan Beller

Keep the properties introduced in 10f5c52656 ("submodule: avoid auto-discovery in prepare_submodule_repo_env()", 2016-09-01, Git v2.11.0-rc0 -- merge listed in batch #1), by fixating the git directory of the submodule.

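But... "git fetch" did not correctly handle nested submodules when an innermost submodule that is not of interest got updated upstream; this has been corrected with Git 2.30 (Q1 2021).

See commit 1b7ac4e (12 Nov 2020) by Peter Kaestle (dscho).
(Merged by Junio C Hamano -- gitster -- in commit d627bf6, 25 Nov 2020)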

submodules: fix of regression on fetching of non-init subsub-repo

Signed-off-by: Peter Kaestle

A regression has been introduced by a62387b ("submodule.c: fetch in submodules git directory instead of in worktree", 2018-11-28, Git v2.21.0-rc0 -- merge listed in batch #4).

The scenario in which it triggers is when one has a remote repository with a subrepository inside a subrepository like this: superproject/middle_repo/inner_repo

Person A and B have both a clone of it, while Person B is not working with the inner_repo and thus does not have it initialized in his working copy.

Now person A introduces a change to the inner_repo and propagates it through the middle_repo and the superproject.

Once person A pushed the changes and person B wants to fetch them using "git fetch" on superproject level, B's git call will return with error saying:

Could not access submodule 'inner_repo'
Errors during submodule fetch:
        middle_repo

Expectation is that in this case the inner submodule will be recognized as uninitialized subrepository and skipped by the git fetch command.

This used to work correctly before 'a62387b ("submodule.c: fetch in submodules git directory instead of in worktree", 2018-11-28, Git v2.21.0-rc0 -- merge listed in batch #4)'.

Starting with a62387b the code wants to evaluate "is_empty_dir()" inside .git/modules for a directory only existing in the worktree, delivering then of course wrong return value.

This patch reverts the changes of a62387b and introduces a regression test.


Warning: that earlier attempt to fix "git fetch --recurse-submodules" broke another use case; it was reverted with Git 2.30 (Q1 2021) until a better solution is found.

See commit 7091499 (02 Dec 2020) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit f3e5dcd, 03 Dec 2020)

Revert "submodules: fix of regression on fetching of non-init subsub-repo"

This reverts commit 1b7ac4e6d4d490b224f5206af7418ed74e490608; Ralf Thielow reports that "git fetch" with submodule.recurse set can result in a bogus and infinitely recursive fetching of the same submodule.


With Git 2.30.1 (Q1 2021), "git fetch --recurse-submodules" is fixed (second attempt).

See commit 505a276 (09 Dec 2020) by Peter Kaestle (dscho).
(Merged by Junio C Hamano -- gitster -- in commit c977ff4, 06 Jan 2021)

submodules: fix of regression on fetching of non-init subsub-repo

Signed-off-by: Peter Kaestle
CC: Junio C Hamano
CC: Philippe Blain
CC: Ralf Thielow
CC: Eric Sunshine
Reviewed-by: Philippe Blain

A regression has been introduced by a62387b ("submodule.c: fetch in submodules git directory instead of in worktree", 2018-11-28, Git v2.21.0-rc0 -- merge listed in batch #4).

The scenario in which it triggers is when one has a repository with a submodule inside a submodule like this: superproject/middle_repo/inner_repo

Person A and B have both a clone of it, while Person B is not working with the inner_repo and thus does not have it initialized in his working copy.

Now person A introduces a change to the inner_repo and propagates it through the middle_repo and the superproject.

Once person A pushed the changes and person B wants to fetch them using "git fetch" at the superproject level, B's git call will return with error saying:

Could not access submodule 'inner_repo'
Errors during submodule fetch:
        middle_repo

Expectation is that in this case the inner submodule will be recognized as uninitialized submodule and skipped by the git fetch command.

This used to work correctly before 'a62387b ("submodule.c: fetch in submodules git directory instead of in worktree", 2018-11-28, Git v2.21.0-rc0 -- merge listed in batch #4)'.

Starting with a62387b the code wants to evaluate "is_empty_dir()" inside .git/modules for a directory only existing in the worktree, delivering then of course wrong return value.

This patch ensures is_empty_dir() is getting the correct path of the uninitialized submodule by concatenation of the actual worktree and the name of the uninitialized submodule.

The first attempt to fix this regression, in 1b7ac4e ("submodules: fix of regression on fetching of non-init subsub-repo", 2020-11-12, Git v2.30.0-rc0 -- merge listed in batch #8), by simply reverting a62387b, resulted in an infinite loop of submodule fetches in the simpler case of a recursive fetch of a superproject with uninitialized submodules, and so this commit was reverted in 7091499 (Revert "submodules: fix of regression on fetching of non-init subsub-repo", 2020-12-02, Git v2.30.0-rc0 -- merge listed in batch #10).
To prevent future breakages, also add a regression test for this scenario.


"git fetch --recurse-submodules" from multiple remotes (either from a remote group, or "--all") used to make one extra "git fetch" in the submodules, which has been corrected with Git 2.37 (Q3 2022).

See commit 0353c68 (16 May 2022) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit fa61b77, 25 May 2022)

fetch: do not run a redundant fetch from submodule

Reviewed-by: Glen Choo

When 7dce19d ("fetch/pull: Add the --recurse-submodules option", 2010-11-12, Git v1.7.4-rc0 -- merge) introduced the "--recurse-submodule" option, the approach taken was to perform fetches in submodules only once, after all the main fetching (it may usually be a fetch from a single remote, but it could be fetching from a group of remotes using fetch_multiple()) succeeded.
Later we added "--all" to fetch from all defined remotes, which complicated things even more.

If your project has a submodule, and you try to run "git fetch --recurse-submodule --all", you'd see a fetch for the top-level, which invokes another fetch for the submodule, followed by another fetch for the same submodule.
All but the last fetch for the submodule come from a "git fetch --recurse-submodules" subprocess that is spawned via the fetch_multiple() interface for the remotes, and the last fetch comes from the code at the end.

Because recursive fetching from submodules is done in each fetch for the top-level in fetch_multiple(), the last fetch in the submodule is redundant.
It only matters when fetch_one() interacts with a single remote at the top-level.

While we are at it, there is one optimization that exists in dealing with a group of remote, but is missing when "--all" is used.
In the former, when the group turns out to be a group of one, instead of spawning "git fetch" as a subprocess via the fetch_multiple() interface, we use the normal fetch_one() code path.
Do the same when handing "--all", if it turns out that we have only one remote defined.