The hard way to debug the mysterious git+ssh+proxy failure "bash: No such file or directory"
I'm trying to clone a GitHub repository through a SOCKS5 proxy. In ~/.ssh/config I have:
Host github.com *.github.com
ProxyCommand /usr/bin/nc -X 5 -x 127.0.0.1:7070 %h %p
git clone fails with the error bash: No such file or directory:
$ git clone git@github.com:aureliojargas/sedsed.git
Cloning into 'sedsed'...
bash: No such file or directory
kex_exchange_identification: Connection closed by remote host
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
I tried the ssh command manually and it failed too:
$ ssh -v git@github.com
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/pynexj/.ssh/config
debug1: /Users/pynexj/.ssh/config line 16: Applying options for github.com
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Executing proxy command: exec /usr/bin/nc -X 5 -x 127.0.0.1:7070 github.com 22
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
debug1: identity file /Users/pynexj/.ssh/id_rsa type 0
debug1: identity file /Users/pynexj/.ssh/id_rsa-cert type -1
debug1: identity file /Users/pynexj/.ssh/id_dsa type -1
debug1: identity file /Users/pynexj/.ssh/id_dsa-cert type -1
debug1: identity file /Users/pynexj/.ssh/id_ecdsa type -1
debug1: identity file /Users/pynexj/.ssh/id_ecdsa-cert type -1
bash: No such file or directory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
debug1: identity file /Users/pynexj/.ssh/id_ed25519 type -1
debug1: identity file /Users/pynexj/.ssh/id_ed25519-cert type -1
debug1: identity file /Users/pynexj/.ssh/id_xmss type -1
debug1: identity file /Users/pynexj/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.1
kex_exchange_identification: Connection closed by remote host
Then I tried the nc command manually, and it worked:
$ /usr/bin/nc -X 5 -x 127.0.0.1:7070 github.com 22
SSH-2.0-babeld-8cd15329
^C
And the SOCKS5 proxy itself works fine:
$ curl -x socks5://127.0.0.1:7070/ https://github.com/ > foo.html
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 214k 0 214k 0 0 86775 0 --:--:-- 0:00:02 --:--:-- 86775
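I'm curious who produces the error bash: No such file or directory, and why.
For me the problem is macOS specific. I googled a lot and found many reports of broken SSH on macOS 10.15 (Catalina), but none of the workarounds worked for me. In the end I had to read the OpenSSH source code, and that is where I found the problem.
In the source file sshconnect.c: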
194 static int
195 ssh_proxy_connect(struct ssh *ssh, const char *host, const char *host_arg,
196 u_short port, const char *proxy_command)
197 {
...
...
201 char *shell;
202
203 if ((shell = getenv("SHELL")) == NULL || *shell == '\0')
204 shell = _PATH_BSHELL;
...
...
211 command_string = expand_proxy_command(proxy_command, options.user,
212 host, host_arg, port);
213 debug("Executing proxy command: %.500s", command_string);
214
215 /* Fork and execute the proxy command. */
216 if ((pid = fork()) == 0) {
217 char *argv[10];
...
...
240 argv[0] = shell;
241 argv[1] = "-c";
242 argv[2] = command_string;
243 argv[3] = NULL;
244
245 /* Execute the proxy command. Note that we gave up any
246 extra privileges above. */
247 ssh_signal(SIGPIPE, SIG_DFL);
248 execv(argv[0], argv);
249 perror(argv[0]);
250 exit(1);
251 }
See lines 203, 240 and 248: ssh runs the ProxyCommand with $SHELL (I could not find this documented anywhere), and it uses execv(), which does not search $PATH. Then I checked my $SHELL:
$ echo $SHELL
bash
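That's the problem. $SHELL is not the full pathname of an executable, so execv() cannot execute it, and the error bash: No such file or directory comes from the perror() call on line 249. (The error confused me a lot. The bash: prefix made me think the error came from Bash.)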
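The condition that triggers the failure can be checked before running ssh. The following is only a minimal sketch and not part of OpenSSH: shell_var_is_usable is a hypothetical helper name, and the rule it applies (an absolute path to an executable file) is my assumption about what makes an execv()-style call succeed no matter what the current directory is.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the value is an absolute path to an
# executable file. execv() does not search $PATH, so a bare name such as
# "bash" would only be looked up relative to the current directory.
shell_var_is_usable() {
    case "$1" in
        /*) [ -f "$1" ] && [ -x "$1" ] ;;
        *)  return 1 ;;
    esac
}

if [ -z "${SHELL:-}" ]; then
    # Unset or empty is a separate case: the code above falls back to _PATH_BSHELL.
    echo "SHELL is unset or empty"
elif shell_var_is_usable "$SHELL"; then
    echo "SHELL looks fine: $SHELL"
else
    echo "SHELL='$SHELL' is not an absolute path to an executable" >&2
fi
```

With SHELL=bash, as in my screen session, the last branch is the one that fires.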
The solution: manually set SHELL to the full pathname of the shell, e.g. /bin/bash. (I did not write shell /bin/bash in .screenrc because I also have /usr/local/bin/bash.)
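One way to do this without hard-coding a particular bash is to let the running shell resolve the name. A minimal sketch, assuming a POSIX shell in which command -v prints the resolved path of an external command; resolve_shell_name is a hypothetical helper name, not something screen or ssh provides.

```shell
#!/bin/sh
# Hypothetical helper: print a full pathname for a shell name.
# Values that already contain a slash are printed unchanged.
resolve_shell_name() {
    case "$1" in
        */*) printf '%s\n' "$1"; return 0 ;;
    esac
    rsn_path=$(command -v "$1") || return 1
    case "$rsn_path" in
        /*) printf '%s\n' "$rsn_path" ;;
        *)  return 1 ;;   # alias, function or builtin: not a file path
    esac
}

# Repair SHELL for the current session if it can be resolved.
if [ -n "${SHELL:-}" ] && fixed=$(resolve_shell_name "$SHELL"); then
    export SHELL="$fixed"
fi
echo "SHELL=${SHELL:-}"
```

This picks whichever bash comes first in $PATH, which matters on a machine that has both /bin/bash and /usr/local/bin/bash.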
So who set SHELL=bash? Why wasn't it set to SHELL=/bin/bash?
In my ~/.screenrc I have:
shell bash
According to the screen manual:
shell command
Set the command to be used to create a new shell. This overrides the value of the environment variable $SHELL.
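The SHELL variable was /bin/bash in my interactive shell before I started screen, so it is screen that sets SHELL=bash. I think screen should find the full pathname of the shell and set SHELL to that, because according to POSIX: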
This variable shall represent a pathname of the user's preferred command language interpreter.
Then why does it work fine on my Linux system (Debian), where I also have SHELL=bash (also inside screen)?
I ran strace and got this:
$ SHELL=xxx strace -f ssh git@github.com
[...]
[pid 5767] rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
[pid 5767] execve("/root/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 5767] execve("/usr/local/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 5767] execve("/usr/local/sbin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 5767] execve("/usr/sbin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 5767] execve("/usr/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 5767] execve("/sbin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 5767] execve("/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 5767] dup(2) = 3
[pid 5767] fcntl(3, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
[pid 5767] fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x21), ...}) = 0
[pid 5767] write(3, "xxx: No such file or directory\n", 31xxx: No such file or directory
) = 31
[pid 5767] close(3) = 0
[...]
As we can see, it actually searches $PATH for xxx. Why? I guess Debian must have patched OpenSSH and changed its behavior. (If I knew how Debian's package builds work, I would verify this. :-)
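The sequence of execve() attempts in the trace is what a $PATH search looks like from the outside. Here is a rough shell sketch of that lookup order, for illustration only: path_candidates is a hypothetical name, and this is not how ssh implements the search.

```shell
#!/bin/sh
# Hypothetical helper: print each pathname a $PATH search would try for a
# bare command name, in order, stopping at the first executable match.
# The body runs in a subshell so the IFS and globbing changes stay local.
path_candidates() (
    IFS=:
    set -f   # no filename expansion while splitting $PATH
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.   # an empty entry means the current directory
        printf '%s\n' "$dir/$1"
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            exit 0
        fi
    done
    exit 1
)

name=xxx
path_candidates "$name" || echo "$name: No such file or directory" >&2
```

When no directory in $PATH contains the name, every candidate is printed and the final message has the same shape as the one in the trace.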
Update 2020-11-19:
I manually compiled OpenSSH (v8.4) from source and reproduced the same problem on Debian. This confirms that Debian has patched OpenSSH and changed its behavior.
$ /usr/local/openssh-8.4/bin/ssh git@github.com
bash: No such file or directory
kex_exchange_identification: Connection closed by remote host
$ strace -f /usr/local/openssh-8.4/bin/ssh git@github.com
[...]
[pid 21020] rt_sigaction(SIGPIPE, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f19a05a9840}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
[pid 21020] execve("bash", ["bash", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x5566982872f0 /* 33 vars */) = -1 ENOENT (No such file or directory)
[pid 21020] dup(2) = 3
[pid 21020] fcntl(3, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
[pid 21020] fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x25), ...}) = 0
[pid 21020] write(3, "bash: No such file or directory\n", 32bash: No such file or directory
) = 32
[pid 21020] close(3)
[...]