git-grep not using multiple threads
I am trying to use git grep to search all revisions of a very large repository. The command I am using is:
$ git rev-list --all | xargs git grep -I --threads 10 --line-number \
--only-matching "SomeString"
I am using the latest official build of git on a Mac:
$ git --version
git version 2.19.1
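It takes a very long time, and Activity Monitor shows git using only one thread. However, the docs say it should use 8 by default. It uses only one thread with or without the --threads <num> option. I also don't have any other configuration set that would override this: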
$ git config --list
credential.helper=osxkeychain
user.name=****
user.email=****
Any idea what I am missing? Can someone else run git-grep and confirm that they see multiple threads?
Thanks for your help.
I wonder if it is because you are using | xargs, which waits for input on stdin. Since the output of git rev-list is a single stream, xargs will by default use only one process:
-P max-procs, --max-procs=max-procs
Run up to max-procs processes at a time; **the default is 1**. If
max-procs is 0, xargs will run as many processes as possible
at a time.
So try increasing it with the flag above:
git rev-list --all | xargs -P 10 git grep -I --threads 1 --line-number \
--only-matching "SomeString"
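This will spawn multiple git grep processes rather than enabling a single git grep to use multiple threads, so it is only a functional answer of sorts.
The number of processes you assign to xargs will depend on how many threads each git grep uses.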
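One caveat with the xargs approach: -P only matters if xargs starts more than one command, and by default xargs packs as many arguments as fit onto each command line. Below is a minimal sketch of how -n controls that batching. Here echo stands in for the real command, r1..r4 are placeholder revisions, and show_batches, grep_all_revs and their default values are names and numbers made up for illustration.

```shell
#!/bin/sh
# Show how xargs groups stdin items into command lines. "echo" stands in for
# the real command so the grouping is visible; r1..r4 are placeholder revisions.
show_batches() {
    # $1 = max arguments per invocation (-n), $2 = max parallel processes (-P)
    printf '%s\n' r1 r2 r3 r4 | xargs -n "$1" -P "$2" echo git grep PATTERN
}

show_batches 4 1   # one command line holding all four revisions
show_batches 2 2   # two command lines, which xargs may run concurrently

# Sketch of the same idea applied to the search from the question.
# The defaults (100 revisions per batch, 4 processes) are assumptions, not
# tuned recommendations. Defined here but not called.
grep_all_revs() {
    git rev-list --all |
        xargs -n "${2:-100}" -P "${3:-4}" \
            git grep -I --threads 1 --line-number --only-matching -e "$1"
}
```

Smaller batches spread the work more evenly but start more processes, so the best values depend on the repository and the machine. Output from parallel processes can also arrive in any order.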
git grep used to default to 8 threads.
But with Git 2.26 (Q1 2020), the default is now the number of cores.
See commit f1928f0, commit 70a9fef, commit 1184a95, commit 6c30762, commit c441ea4, commit d799242, commit 1d1729c, commit 31877c9, commit b1fc9da, commit d5b0bac, commit faf123c, commit c3a5bb3 (16 Jan 2020) by Matheus Tavares (matheustavares).
(Merged by Junio C Hamano -- gitster -- in commit 56ceb64, 14 Feb 2020)
grep: use no. of cores as the default no. of threads
Signed-off-by: Matheus Tavares
When --threads is not specified, git grep will use 8 threads by default.
This fixed number may be too many for machines with fewer cores and too little for machines with more cores.
So, instead, use the number of logical cores available in the machine, which seems to result in the best overall performance.
The following measurements correspond to the mean elapsed times for 30 git grep executions in chromium's repository with a 95% confidence interval (each set of 30 were performed after 2 warmup runs).
Regex 1 is 'abcd[02]' and Regex 2 is '(static|extern) (int|double) \*'.
(chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution.)
      |         Working tree          |          Object Store
------|-------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
------|---------------|---------------|----------------|---------------
  32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
  16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
   8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
   4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
   2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
   1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
The above tests were performed in a desktop running Debian 10.0 with Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of RAM and a 7200 rpm, SATA 3.1 HDD.
Below, the tests were repeated for a machine with SSD: a Manjaro laptop with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
      |          Working tree          |          Object Store
------|--------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
------|---------------|----------------|----------------|---------------
  32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
  16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
   8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
   4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
   2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
   1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08
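On a Git older than 2.26, the same default can be approximated by passing the core count to --threads yourself. Here is a small sketch, assuming getconf _NPROCESSORS_ONLN is available (it is on common Linux and macOS systems, but that is not guaranteed everywhere). logical_cores is a hypothetical helper name, not something Git provides.

```shell
#!/bin/sh
# Print the number of logical cores currently online, falling back to 1
# when getconf is missing or returns something that is not a plain number.
logical_cores() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=
    case $n in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

echo "logical cores: $(logical_cores)"

# Hypothetical usage on a Git older than 2.26 (not executed here):
#   git grep --threads "$(logical_cores)" -I --line-number "SomeString"
# or persist the value for the current repository via the grep.threads setting:
#   git config grep.threads "$(logical_cores)"
```

Going by the measurements above, more threads than logical cores did not help on either test machine, so the core count is a reasonable starting point rather than a guaranteed optimum.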