How does grep know it is writing to the input file?
If I try to redirect the output of grep to the same file it is reading from, like so:
$ grep stuff file.txt > file.txt
I get the error message grep: input file 'file.txt' is also the output.
How does grep determine this?
According to the GNU grep source code, grep compares the inodes of the input and the output:
if (!out_quiet && list_files == 0 && 1 < max_count
    && S_ISREG (out_stat.st_mode) && out_stat.st_ino
    && SAME_INODE (st, out_stat))   /* <------------------ */
  {
    if (! suppress_errors)
      error (0, 0, _("input file %s is also the output"), quote (filename));
    errseen = 1;
    goto closeout;
  }
out_stat is filled in by calling fstat on STDOUT_FILENO:
if (fstat (STDOUT_FILENO, &tmp_stat) == 0 && S_ISREG (tmp_stat.st_mode))
  out_stat = tmp_stat;
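The same comparison can be approximated from the shell. The following is only a sketch, not what grep itself runs: the helper name input_is_stdout and the file names are made up for illustration, it relies on the -ef operator of test (a widely supported extension that compares device and inode numbers), and it assumes /dev/stdout resolves to file descriptor 1, as it does on Linux.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the file named by $1 is also our stdout.
# -ef is true when both operands have the same device and inode numbers.
# Assumption: /dev/stdout resolves to file descriptor 1 (true on Linux).
input_is_stdout() {
    [ "$1" -ef /dev/stdout ]
}

workdir=$(mktemp -d)
printf 'stuff\nother\n' > "$workdir/file.txt"
ln "$workdir/file.txt" "$workdir/alias.txt"   # second name, same inode
printf 'stuff\n' > "$workdir/copy.txt"        # same text, different inode

# Use >> so that the redirection does not truncate the demo files.
if input_is_stdout "$workdir/file.txt" >> "$workdir/alias.txt"; then
    via_link=same
else
    via_link=different
fi
if input_is_stdout "$workdir/file.txt" >> "$workdir/copy.txt"; then
    via_copy=same
else
    via_copy=different
fi

rm -rf "$workdir"
echo "hard link: $via_link, copy: $via_copy"
```

On a Linux system this should report the hard link as the same file and the copy as a different one. The file name plays no part in the comparison, only the device and inode numbers do.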
Look at the source code: it checks for this case (the file is already open for reading by grep) and reports it. See the SAME_INODE check below:
/* If there is a regular file on stdout and the current file refers
   to the same i-node, we have to report the problem and skip it.
   Otherwise when matching lines from some other input reach the
   disk before we open this file, we can end up reading and matching
   those lines and appending them to the file from which we're reading.
   Then we'd have what appears to be an infinite loop that'd terminate
   only upon filling the output file system or reaching a quota.
   However, there is no risk of an infinite loop if grep is generating
   no output, i.e., with --silent, --quiet, -q.
   Similarly, with any of these:
     --max-count=N (-m) (for N >= 2)
     --files-with-matches (-l)
     --files-without-match (-L)
   there is no risk of trouble.
   For --max-count=1, grep stops after printing the first match,
   so there is no risk of malfunction.  But even --max-count=2, with
   input==output, while there is no risk of infloop, there is a race
   condition that could result in "alternate" output.  */
if (!out_quiet && list_files == 0 && 1 < max_count
    && S_ISREG (out_stat.st_mode) && out_stat.st_ino
    && SAME_INODE (st, out_stat))
  {
    if (! suppress_errors)
      error (0, 0, _("input file %s is also the output"), quote (filename));
    errseen = true;
    goto closeout;
  }
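The exemptions named in that comment can be observed from the shell. This is a small sketch with a throwaway file, assuming your grep follows the quoted source: with -q or -l the inode check is skipped, so appending to the input file is not reported as an error.

```shell
#!/bin/sh
workdir=$(mktemp -d)
f=$workdir/file.txt
printf 'stuff\nother\n' > "$f"

# -q produces no output, so nothing can be fed back into the input.
grep -q stuff "$f" >> "$f"
quiet_status=$?
after_quiet=$(cat "$f")

# -l prints only the file name, once per matching file.
grep -l stuff "$f" >> "$f"
list_status=$?
last_line=$(tail -n 1 "$f")

rm -rf "$workdir"
echo "quiet exit: $quiet_status, list exit: $list_status"
echo "line appended by -l: $last_line"
```

Both commands use >> rather than > on purpose: a plain > would truncate the file before grep reads it, and there would be nothing left to match.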
Here is one way to write the result back to the same file:
grep stuff file.txt > tmp && mv tmp file.txt
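A slightly more defensive variant of the same idea is sketched below. The function name filter_in_place is hypothetical, and the sketch makes one choice that differs from the one-liner: it also replaces the file when nothing matches (leaving it empty), whereas the && form leaves the original untouched in that case.

```shell
#!/bin/sh
# Hypothetical helper: filter a file "in place" through a temporary file.
# Usage: filter_in_place PATTERN FILE
filter_in_place() {
    pattern=$1
    target=$2
    # Create the temp file next to the target so that mv is a rename on
    # the same file system rather than a copy.
    tmp=$(mktemp "$(dirname "$target")/tmp.XXXXXX") || return 1
    grep -e "$pattern" -- "$target" > "$tmp"
    status=$?
    # grep exits 0 on a match and 1 on no match; 2 or more means an error.
    if [ "$status" -le 1 ]; then
        mv -- "$tmp" "$target"
    else
        rm -f -- "$tmp"
        return "$status"
    fi
}

# Demo on a throwaway file.
workdir=$(mktemp -d)
printf 'stuff one\nother\nstuff two\n' > "$workdir/file.txt"
filter_in_place stuff "$workdir/file.txt"
filtered=$(cat "$workdir/file.txt")
rm -rf "$workdir"
printf '%s\n' "$filtered"
```

Because mv replaces the target with a new inode, the result carries the temporary file's permissions, and any other hard links to the original keep the old content.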
Try piping through cat or tac:
cat file | grep 'searchpattern' > newfile
This is a best practice and a shorthand way to achieve it.