Linux tee command occasionally fails if followed by a pipe

Question

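I ran a find command that uses tee to write a log and xargs to process the output; I accidentally forgot to add xargs after the second pipe, and that is how I found this problem.

Example: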

% tree
.
├── a.sh
└── home
    └── localdir
        ├── abc_3
        ├── abc_6
        ├── mydir_1
        ├── mydir_2
        └── mydir_3

7 directories, 1 file

The content of a.sh is:

% cat a.sh
#!/bin/bash
LOG="/tmp/abc.log"

find home/localdir -name "mydir*" -type d  -print | tee $LOG | echo

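If I add a second pipe followed by certain commands, such as echo or ls, writing the log sometimes fails.

Here are some examples from running ./a.sh several times: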

% bash -x ./a.sh; cat /tmp/abc.log  // this tee failed
+ LOG=/tmp/abc.log
+ find home/localdir -name 'mydir*' -type d -print
+ tee /tmp/abc.log
+ echo


% bash -x ./a.sh; cat /tmp/abc.log  // this tee ok
+ LOG=/tmp/abc.log
+ find home/localdir -name 'mydir*' -type d -print
+ tee /tmp/abc.log
+ echo

home/localdir/mydir_2  // this is cat /tmp/abc.log output
home/localdir/mydir_3
home/localdir/mydir_1

Why does tee occasionally fail when I add a second pipe followed by some command (having forgotten xargs)?

Answer 1

The problem is that, by default, tee exits when a write to a pipe fails. So, consider:

find home/localdir -name "mydir*" -type d  -print | tee $LOG | echo

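If echo finishes first, the write to the pipe fails and tee exits. The timing is imprecise, though. Each command in the pipeline runs in a separate subshell, and there are also the vagaries of buffering. So sometimes the log file gets written before tee exits, and sometimes it does not.

For clarity, let's consider a simpler pipeline: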

$ seq 10 | tee abc.log | true; declare -p PIPESTATUS; cat abc.log
declare -a PIPESTATUS='([0]="0" [1]="0" [2]="0")'
1
2
3
4
5
6
7
8
9
10
$ seq 10 | tee abc.log | true; declare -p PIPESTATUS; cat abc.log
declare -a PIPESTATUS='([0]="0" [1]="141" [2]="0")'
$

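In the first run, every process in the pipeline exits with a success status and the log file is written. In the second run of the same command, tee fails with exit code 141 (128 + 13, meaning it was killed by SIGPIPE) and the log file is not written.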

I used true instead of echo to show that there is nothing special about echo. The problem exists for any command following tee that may reject input.
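Because the failure is a race, one way to see how often it happens on a given machine is to repeat the pipeline and count the runs in which tee exited with status 141. The following is only a sketch: the temporary log file and the run count are arbitrary choices made for illustration, and the count it prints will vary from run to run and from system to system.

```shell
#!/bin/bash
# Repeat the racy pipeline and count how often tee dies from SIGPIPE.
log=$(mktemp)   # hypothetical temporary log file, for illustration only
runs=20         # arbitrary number of repetitions
sigpipe=0

for ((i = 0; i < runs; i++)); do
    seq 10 | tee "$log" | true
    # PIPESTATUS must be read immediately after the pipeline.
    status=${PIPESTATUS[1]}
    # 141 = 128 + 13: tee was terminated by SIGPIPE.
    if [ "$status" -eq 141 ]; then
        sigpipe=$((sigpipe + 1))
    fi
done

echo "tee was killed by SIGPIPE in $sigpipe of $runs runs"
```

A count of zero does not prove the pipeline is safe; it only means true did not win the race during these runs.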

Documentation

Recent versions of tee have an option that controls whether tee exits when a pipe write fails. From man tee in coreutils-8.25:

--output-error[=MODE]
set behavior on write error. See MODE below

The possible values of MODE are:

MODE determines behavior with write errors on the outputs:
   'warn' diagnose errors writing to any output

   'warn-nopipe'
          diagnose errors writing to any output not a pipe

   'exit' exit on error writing to any output

   'exit-nopipe'
          exit on error writing to any output not a pipe
The default MODE for the -p option is 'warn-nopipe'. The default operation when --output-error is not specified, is to exit immediately on error writing to a pipe, and diagnose errors writing to non pipe outputs.

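As you can see, the default behavior is to "exit immediately on error writing to a pipe". So, if the write to the process after tee fails before tee has written to the log file, tee exits without writing the log file.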
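Based on the man page excerpt above, the -p option (whose default MODE is 'warn-nopipe') should let tee keep writing its file outputs even after the downstream pipe has closed. A minimal sketch, assuming a GNU coreutils tee recent enough to support -p; the temporary log file is a placeholder chosen for illustration:

```shell
#!/bin/bash
# Sketch: keep tee writing the log even if the next command exits early.
# Assumes GNU coreutils tee with -p (--output-error=warn-nopipe) support.
log=$(mktemp)   # hypothetical temporary log file, for illustration only

# true never reads its input, so tee's write to the pipe may fail.
seq 10 | tee -p "$log" | true

# With -p, the broken pipe does not stop tee from finishing the file.
wc -l < "$log"
```

Other tee implementations may not accept -p, so it is worth checking man tee on the target system first.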

Answer 2

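I debugged the tee source code, but I am not familiar with C on Linux, so I may have gotten something wrong.

tee belongs to the coreutils package, under src/tee.c.

First, it sets up buffering: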

setvbuf (stdout, NULL, _IONBF, 0); // for standard output
setvbuf (descriptors[i], NULL, _IONBF, 0);  // for file descriptor

So it is unbuffered?

Second, tee puts stdout first in the descriptors array and writes to the descriptors in a for loop:

/* In the array of NFILES + 1 descriptors, make
   the first one correspond to standard output.   */
descriptors[0] = stdout;
files[0] = _("standard output");
setvbuf (stdout, NULL, _IONBF, 0);

...

  for (i = 0; i <= nfiles; i++) {
    if (descriptors[i]
        && fwrite (buffer, bytes_read, 1, descriptors[i]) != 1)  // failed!!!
      {
        error (0, errno, "%s", files[i]);
        descriptors[i] = NULL;
        ok = false;
      }
    }

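For example, with tee a.log, descriptors[0] is stdout and descriptors[1] is a.log.

As @John1024 said, the commands in a pipeline run in parallel (I misunderstood this earlier). The second command in the pipeline, such as echo, ls, or true, does not accept input, so it does not wait for any. If it finishes faster, it closes the read end of the pipe before tee writes to the write end. In the code above, the commented line then fails, and tee does not go on to write to the file descriptors.

Addendum:

The strace output shows tee being killed by SIGPIPE: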

write(1, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 21) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=22649, si_uid=1000} ---
+++ killed by SIGPIPE +++

Answer 3

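Yes, piping from tee to something that exits early (in your case, something that does not depend on reading input from tee) causes intermittent errors. For a summary of this gotcha, see: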

http://www.pixelbeat.org/docs/coreutils-gotchas.html#tee
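For contrast, here is a sketch of the pipeline the question originally intended, with xargs in place. Since xargs reads its input until end of file, tee should not hit a closed pipe. The temporary working directory and the directory layout are recreated here only to keep the example self-contained; they mirror the tree shown in the question.

```shell
#!/bin/bash
# Sketch: the downstream command (xargs) consumes all of tee's output,
# so tee's writes to the pipe are not expected to fail.
workdir=$(mktemp -d)   # hypothetical scratch directory, for illustration only
mkdir -p "$workdir"/home/localdir/{abc_3,abc_6,mydir_1,mydir_2,mydir_3}
log="$workdir/abc.log"

cd "$workdir" || exit 1
find home/localdir -name "mydir*" -type d -print | tee "$log" | xargs echo

# The log should hold one line per matching directory.
cat "$log"
```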
