How to wait for a grandchild process (`bash` return value becomes -1 in Perl due to SIGCHLD)
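I have a Perl script that runs from cron (snippet below) to perform system checks. I fork a child to act as a timeout and reap it with a $SIG{CHLD} handler. The Perl script makes several system calls to Bash scripts and checks their exit statuses. One Bash script fails about 5% of the time with no error: it exits with 0, yet Perl sees $? as -1 and $! as "No child processes".
This Bash script tests compiler licenses, and Intel icc is left behind after the Bash script completes (ps output below). I think the icc zombie finishes, forcing Perl into the $SIG{CHLD} handler, which blows away the $? status before I am able to read it.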
Compile status -1; No child processes
#!/usr/bin/perl
use strict;
use POSIX ':sys_wait_h';

my $GLOBAL_TIMEOUT = 1200;

### Timer to notify if this program hangs
my $timer_pid;
$SIG{CHLD} = sub {
    local ($!, $?);
    while((my $pid = waitpid(-1, WNOHANG)) > 0)
    {
        if($pid == $timer_pid)
        {
            die "Timeout\n";
        }
    }
};
die "Unable to fork\n" unless(defined($timer_pid = fork));
if($timer_pid == 0) # child
{
    sleep($GLOBAL_TIMEOUT);
    exit;
}
### End Timer

### Compile test
my @compile = `./compile_test.sh 2>&1`;
my $status = $?;
print "Compile status $status; $!\n";
if($status != 0)
{
    print "@compile\n";
}

END # Timer cleanup
{
    if($timer_pid != 0)
    {
        $SIG{CHLD} = 'IGNORE';
        kill(15, $timer_pid);
    }
}
exit(0);
#!/bin/sh
cc compile_test.c
if [ $? -ne 0 ]; then
    echo "Cray compiler failure"
    exit 1
fi
module swap PrgEnv-cray PrgEnv-intel
cc compile_test.c
if [ $? -ne 0 ]; then
    echo "Intel compiler failure"
    exit 1
fi
wait
ps
exit 0
The wait does not really wait, because cc calls icc, which creates a zombie grandchild process that wait (or wait PID) will not block for. (wait `pidof icc`, 31589 in this case, gives "not a child of this shell".)
user 31589 1 0 12:47 pts/15 00:00:00 icc
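The behaviour described above, where `wait` returns without blocking on a process that is not a direct child of the shell, can be reproduced with a minimal sketch. The `sleep 2` stands in for icc, and `date +%s` is assumed to be available (it is on GNU and BSD systems); neither is part of the original script.

```shell
#!/bin/sh
# Minimal sketch: `wait` only covers direct children of this shell.
# The subshell backgrounds a sleep and exits at once, so the sleep
# becomes an orphaned grandchild of this script (a stand-in for icc).
( sleep 2 >/dev/null 2>&1 & )

start=$(date +%s)
wait    # no direct children remain, so this returns immediately
end=$(date +%s)

elapsed=$((end - start))
echo "wait returned after ${elapsed}s"
```

The reported time should be well under the two seconds the grandchild takes, which matches the "not a child of this shell" message: the shell has nothing of its own left to wait for.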
I just don't know how to solve this in Bash or Perl.
Thanks, Chris
Isn't this a use case for alarm? Toss out your SIGCHLD handler and say
local $? = -1;
eval {
    local $SIG{ALRM} = sub { die "Timeout\n" };
    alarm($GLOBAL_TIMEOUT);
    @compile = `./compile_test.sh 2>&1`;
    alarm(0);
};
my $status = $?;
instead.
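I thought the quickest solution would be to add a sleep of a second or two at the bottom of the Bash script to wait for the zombie icc to finish. But that did not work.
If I did not already have a SIGALRM handler (in the real program), I agree the best choice would be to wrap the whole thing in an eval, even though that would be ugly for a 500-line program.
Without the local($?), every `system` call gets $? = -1. The $? I need in this case is the one set right after waitpid, which unfortunately is reset to -1 once the signal handler exits. So I find that this works. New lines are marked with ###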
my $timer_pid;
my $chld_status;    ###
$SIG{CHLD} = sub {
    local($!, $?);
    while((my $pid = waitpid(-1, WNOHANG)) > 0)
    {
        $chld_status = $?;    ###
        if($pid == $timer_pid)
        {
            die "Timeout\n";
        }
    }
};
...
my @compile = `./compile_test.sh 2>&1`;
my $status = ($? == -1) ? $chld_status : $?;    ###
...
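We had a similar problem, and this is our solution: leak the write-side file descriptor of a pipe into the grandchild and read() from the read side, which blocks until the grandchild exits.
See also: wait for children and grand-children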
use Fcntl;

# OCF scripts invoked by Pacemaker will be killed by Pacemaker with
# a SIGKILL if the script exceeds the configured resource timeout. In
# addition to killing the script, Pacemaker also kills all of the children
# invoked by that script. Because it is a kill, the scripts cannot trap
# the signal and clean up; because all of the children are killed as well,
# we cannot simply fork and have the parent wait on the child. In order
# to work around that, we need the child not to have a parent process
# of the OCF script---and the only way to do that is to grandchild the
# process. However, we still want the parent to wait for the grandchild
# process to exit so that the OCF script exits when the grandchild is
# done and not before. This is done by leaking the write file descriptor
# from pipe() into the grandchild and then the parent reads the read file
# descriptor, thus blocking until it gets IO or the grandchild exits. Since
# the file descriptor is never written to by the grandchild, the parent
# blocks until the grandchild exits.
sub grandchild_wait_exit
{
    # We use "our" instead of "my" for the write side of the pipe. If
    # we did not, then when the sub exits and $w goes out of scope,
    # the file descriptor will close and the parent will exit.
    pipe(my $r, our $w);

    # Enable leaking the file descriptor into the children
    my $flags = fcntl($w, F_GETFD, 0) or warn $!;
    fcntl($w, F_SETFD, $flags & (~FD_CLOEXEC)) or die "Can't set flags: $!\n";

    # Fork the child
    my $child = fork();
    if ($child) {
        # We are the parent, waitpid for the child and
        # then read to wait for the grandchild.
        close($w);
        waitpid($child, 0);
        <$r>;
        exit;
    }

    # Otherwise we are the child, so close the read side of the pipe.
    close($r);

    # Fork a grandchild, exit the child.
    if (fork()) {
        exit;
    }

    # Turn off leaking of the file descriptor in the grandchild so
    # that no other process can write to the open file descriptor
    # that would prematurely exit the parent.
    $flags = fcntl($w, F_GETFD, 0) or warn $!;
    fcntl($w, F_SETFD, $flags | FD_CLOEXEC) or die "Can't set flags: $!\n";
}

grandchild_wait_exit();

# Everything below runs in the grandchild only.
sleep 1;
print getppid() . "\n";
print "$$: gc\n";
sleep 30;
exit;
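The same pipe trick can be seen in a few lines of shell, which may be handy for checking the behaviour before wiring it into a larger script. This is only a sketch: `sleep 2` is a hypothetical stand-in for the grandchild, `cat` plays the role of the parent's blocking read, and `date +%s` is assumed to be available.

```shell
#!/bin/sh
# Minimal sketch: block on a pipe until an orphaned grandchild exits.
start=$(date +%s)

# The inner subshell backgrounds the sleep and exits immediately, but the
# sleep inherits the pipe's write end as its stdout. cat therefore sees
# EOF only after the grandchild has exited and the last writer is gone.
( ( sleep 2 & ) ) | cat

end=$(date +%s)
elapsed=$((end - start))
echo "pipeline returned after ${elapsed}s"
```

Nothing is ever written to the pipe; the reader is released purely by the write end being closed when the grandchild terminates, which is the property the Perl routine relies on.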