Is chaining interpreters via shebang lines portable?
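Tying a script to a specific interpreter via a so-called shebang line is a well-known practice on POSIX operating systems. For example, if the following script is executed (given sufficient file-system permissions), the operating system will launch the /bin/sh interpreter with the file name of the script as its first argument. Subsequently, the shell will execute the commands in the script, skipping over the shebang line, which it treats as a comment.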
#! /bin/sh
date -R
echo hello world
Possible output:
Sat, 01 Apr 2017 12:34:56 +0100
hello world
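I used to believe that the interpreter (/bin/sh in this example) must be a native executable and cannot itself be a script that, in turn, would require another interpreter to be launched.
However, I went ahead and tried the following experiment nonetheless.
Using the following dumb shell saved as /tmp/interpreter.py, ...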
#! /usr/bin/python3

import sys
import subprocess

for script in sys.argv[1:]:
    with open(script) as istr:
        status = any(
            map(
                subprocess.call,
                map(
                    str.split,
                    filter(
                        lambda s : s and not s.startswith('#'),
                        map(str.strip, istr)
                    )
                )
            )
        )
        if status:
            sys.exit(status)
... and the following script saved as /tmp/script.xyz,
#! /tmp/interpreter.py
date -R
echo hello world
... I was able (after making both files executable) to execute script.xyz.
5gon12eder:/tmp> ls -l
total 8
-rwxr-x--- 1 5gon12eder 5gon12eder 493 Jun 19 01:01 interpreter.py
-rwxr-x--- 1 5gon12eder 5gon12eder 70 Jun 19 01:02 script.xyz
5gon12eder:/tmp> ./script.xyz
Mon, 19 Jun 2017 01:07:19 +0200
hello world
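This surprised me. I was even able to launch script.xyz via another script.
So, what I am asking is this:
- Is the behavior observed in my experiment portable?
- Was the experiment even conducted correctly, or are there situations where this does not work? How about different (Unix-like) operating systems?
- If this is supposed to work, is it true that there is no observable difference between a native executable and an interpreted script as far as invocation is concerned?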
New executables in Unix-like operating systems are started by the system call execve(2). The man page for execve includes:
Interpreter scripts
An interpreter script is a text file that has execute
permission enabled and whose first line is of the form:
#! interpreter [optional-arg]
The interpreter must be a valid pathname for an executable which
is not itself a script. If the filename argument of execve()
specifies an interpreter script, then interpreter will be invoked
with the following arguments:
interpreter [optional-arg] filename arg...
where arg... is the series of words pointed to by the argv
argument of execve().
For portable use, optional-arg should either be absent, or be
specified as a single word (i.e., it should not contain white
space); see NOTES below.
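So within those limits (Unix-like, optional-arg at most one word), yes, shebang scripts are portable. Read the man page for more details, including other differences in invocation between binary executables and scripts.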
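The invocation rule quoted above, interpreter [optional-arg] filename arg..., can be illustrated with a small sketch. It is not the kernel's code: the function names parse_shebang and effective_argv are made up for illustration, and the sketch assumes the Linux convention of passing everything after the interpreter name as one single argument (other systems may split it differently).

```python
def parse_shebang(first_line):
    """Split a '#! interpreter [optional-arg]' line.

    Assumption: Linux semantics, where the whole remainder of the line
    is handed over as ONE optional argument.
    Returns (interpreter, optional_arg); optional_arg is None if absent.
    """
    if not first_line.startswith("#!"):
        raise ValueError("not an interpreter script")
    rest = first_line[2:].strip()
    if not rest:
        raise ValueError("shebang line names no interpreter")
    parts = rest.split(None, 1)  # split off the interpreter name only
    interpreter = parts[0]
    optional_arg = parts[1] if len(parts) > 1 else None
    return interpreter, optional_arg


def effective_argv(first_line, filename, args=()):
    """Build 'interpreter [optional-arg] filename arg...' for a script."""
    interpreter, optional_arg = parse_shebang(first_line)
    argv = [interpreter]
    if optional_arg is not None:
        argv.append(optional_arg)
    argv.append(filename)
    argv.extend(args)
    return argv


print(effective_argv("#! /bin/sh", "./hello.sh"))
# prints ['/bin/sh', './hello.sh']
print(effective_argv("#! /usr/bin/awk -f", "./report.awk", ["data.txt"]))
# prints ['/usr/bin/awk', '-f', './report.awk', 'data.txt']
```

With an optional-arg that contains white space, such as "python3 -u", this sketch keeps it as one list element, which is exactly the case the man page advises against for portable use.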
See the boldfaced text below:
This mechanism allows scripts to be used in virtually any context
normal compiled programs can be, including as full system programs,
and even as interpreters of other scripts. As a caveat, though, some
early versions of kernel support limited the length of the interpreter
directive to roughly 32 characters (just 16 in its first
implementation), would fail to split the interpreter name from any
parameters in the directive, or had other quirks. Additionally, some
modern systems allow the entire mechanism to be constrained or
disabled for security purposes (for example, set-user-id support has
been disabled for scripts on many systems). -- WP
And the output of
COLUMNS=75 man execve | grep -nA 23 " Interpreter scripts" | head -39
on an Ubuntu 17.04 box, particularly lines #186-#189, tells us what works on Linux (i.e., scripts can be interpreters, up to four levels deep):
166: Interpreter scripts
167- An interpreter script is a text file that has execute permission
168- enabled and whose first line is of the form:
169-
170- #! interpreter [optional-arg]
171-
172- The interpreter must be a valid pathname for an executable file.
173- If the filename argument of execve() specifies an interpreter
174- script, then interpreter will be invoked with the following argu‐
175- ments:
176-
177- interpreter [optional-arg] filename arg...
178-
179- where arg... is the series of words pointed to by the argv argu‐
180- ment of execve(), starting at argv[1].
181-
182- For portable use, optional-arg should either be absent, or be
183- specified as a single word (i.e., it should not contain white
184- space); see NOTES below.
185-
186- Since Linux 2.6.28, the kernel permits the interpreter of a script
187- to itself be a script. This permission is recursive, up to a
188- limit of four recursions, so that the interpreter may be a script
189- which is interpreted by a script, and so on.
--
343: Interpreter scripts
344- A maximum line length of 127 characters is allowed for the first
345- line in an interpreter scripts.
346-
347- The semantics of the optional-arg argument of an interpreter
348- script vary across implementations. On Linux, the entire string
349- following the interpreter name is passed as a single argument to
350- the interpreter, and this string can include white space. How‐
351- ever, behavior differs on some other systems. Some systems use
352- the first white space to terminate optional-arg. On some systems,
353- an interpreter script can have multiple arguments, and white spa‐
354- ces in optional-arg are used to delimit the arguments.
355-
356- Linux ignores the set-user-ID and set-group-ID bits on scripts.
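To see how deep a given script's interpreter chain goes, one can follow the shebang lines by hand and compare the result with the four-recursion limit quoted above. The following is a minimal sketch, not what the kernel does: the helper names interpreter_of and interpreter_chain are made up for illustration, a file is assumed to be a script exactly when it starts with #!, and the "native" file in the demo is a few stand-in bytes rather than a real executable.

```python
import os
import tempfile


def interpreter_of(path):
    """Return the interpreter named on path's shebang line, or None."""
    try:
        with open(path, "rb") as f:
            first_line = f.readline()
    except OSError:
        return None  # unreadable or missing: stop following the chain
    if not first_line.startswith(b"#!"):
        return None  # treated as a native executable in this sketch
    words = first_line[2:].split()
    return os.fsdecode(words[0]) if words else None


def interpreter_chain(path, give_up_after=16):
    """List the interpreters reached from path, outermost first."""
    chain = []
    current = interpreter_of(path)
    while current is not None and len(chain) < give_up_after:
        chain.append(current)
        current = interpreter_of(current)
    return chain


def demo():
    """Rebuild the script.xyz -> interpreter.py -> native chain in a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        native = os.path.join(tmp, "native")
        interp = os.path.join(tmp, "interpreter.py")
        script = os.path.join(tmp, "script.xyz")
        with open(native, "wb") as f:
            f.write(b"\x7fELF")  # stand-in bytes, not a real executable
        with open(interp, "w") as f:
            f.write("#! " + native + "\n")
        with open(script, "w") as f:
            f.write("#! " + interp + "\ndate -R\n")
        return [os.path.basename(p) for p in interpreter_chain(script)]


print(demo())
# prints ['interpreter.py', 'native']
```

Every entry in the returned list except the last is an interpreter that is itself a script. The quoted text does not spell out exactly how the kernel counts its four recursions, so a chain close to that length is best tested directly.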
From the Solaris 11 exec(2) man page:
An interpreter file begins with a line of the form
#! pathname [arg]
where pathname is the path of the interpreter, and arg is an
optional argument. When an interpreter file is executed, the
system invokes the specified interpreter. The pathname
specified in the interpreter file is passed as arg0 to the
interpreter. If arg was specified in the interpreter file,
it is passed as arg1 to the interpreter. The remaining
arguments to the interpreter are arg0 through argn of the
originally exec'd file. The interpreter named by pathname
must not be an interpreter file.
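As the last statement says, chaining interpreters is not supported at all on Solaris. Attempting it results in the last non-interpreted interpreter (e.g. /usr/bin/python3) interpreting the first script (e.g. /tmp/script.xyz, so the final command line becomes /usr/bin/python3 /tmp/script.xyz), with no chaining.
So script interpreter chaining is not portable at all.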
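Since the behavior differs between systems, a program that depends on chaining could repeat the experiment from the question at run time. The sketch below is only one hypothetical way to do that: the function name chaining_works is made up, it assumes that sys.executable is a native binary whose path contains no white space, and it reports False for any failure, including unrelated ones such as a temporary directory mounted noexec or a shebang line that is too long.

```python
import os
import stat
import subprocess
import sys
import tempfile


def chaining_works():
    """Run a script whose interpreter is itself a script; report success."""
    with tempfile.TemporaryDirectory() as tmp:
        interp = os.path.join(tmp, "interpreter.py")
        script = os.path.join(tmp, "script.xyz")
        with open(interp, "w") as f:
            # The inner interpreter only reports which script it was given.
            f.write("#! " + sys.executable + "\n"
                    "import sys\n"
                    "print('ran', sys.argv[1])\n")
        with open(script, "w") as f:
            f.write("#! " + interp + "\n")
        for path in (interp, script):
            os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
        try:
            result = subprocess.run([script], stdout=subprocess.PIPE,
                                    stderr=subprocess.DEVNULL)
        except OSError:
            return False  # the system refused to execute the script
        # If Python were handed script.xyz directly (no chaining), it would
        # print nothing, so the output check below would fail as intended.
        return (result.returncode == 0
                and result.stdout.decode().strip() == "ran " + script)


print("chained shebang supported:", chaining_works())
```

A False result only says that chaining did not work in this particular setup; it does not say why.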