Remove matching and previous line

Question

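I need to remove every line containing "not a dynamic executable", together with the line before it, from a stream, using grep, awk, sed, or some other tool. My current working solution is to tr the entire stream to strip the newlines, use sed to replace the newline preceding my match with something else, use tr to put the newlines back, and then run grep -v. I'm somewhat wary of artifacts with this approach, but I don't see how else to do it at the moment: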

tr '\n' '|' | sed 's/|\tnot a dynamic executable/__MY_REMOVE/g' | tr '|' '\n'
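For reference, the complete workaround described above, including the final grep -v step, could be sketched as follows. The sample input is made up (the paths are hypothetical), the function names are only for illustration, the stream is assumed never to contain a literal `|`, and `\t` in the sed pattern relies on GNU sed.

```shell
# Hypothetical ldd-style sample: the middle entry is not a dynamic executable.
sample_input() {
  printf '/usr/lib/libfoo.so:\n\tlibc.so.6 => /lib/libc.so.6\n'
  printf '/etc/hosts:\n\tnot a dynamic executable\n'
  printf '/usr/lib/libbar.so:\n\tlibm.so.6 => /lib/libm.so.6\n'
}

# Join all lines with '|', glue each marker line onto the line before it,
# split the lines again, then drop the glued (tagged) lines.
remove_with_tr() {
  tr '\n' '|' |
    sed 's/|\tnot a dynamic executable/__MY_REMOVE/g' |
    tr '|' '\n' |
    grep -v '__MY_REMOVE'
}

sample_input | remove_with_tr
```

The artifact risk is visible here: any `|` or `__MY_REMOVE` already present in the data would be mangled or dropped.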

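Edit:

The input is a mixed list of files piped to xargs ldd. Basically I want to ignore all output about non-library files, since it has nothing to do with what I'm doing next. I don't want to use a lib*.so mask, because the names could conceivably differ.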

Answer 1

This is most easily done with pcregrep in multi-line mode:

pcregrep -vM '\n\tnot a dynamic executable' filename

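If pcregrep is not available to you, awk or sed can do the same thing by reading one line ahead and skipping the printing of the previous line whenever a marker line appears.

You could be boring (but sensible) with awk:

awk '/^\tnot a dynamic executable/ { flag = 1; next } !flag && NR > 1 { print lastline; } { flag = 0; lastline = $0 } END { if(!flag) print }' filename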

That is:

/^\tnot a dynamic executable/ {  # in lines that start with the marker
  flag = 1                       # set a flag
  next                           # and do nothing (do not print the last line)
}
!flag && NR > 1 {                # if the last line was not flagged and
                                 # is not the first line
  print lastline                 # print it
}
{                                # and if you got this far,
  flag = 0                       # unset the flag
  lastline = $0                  # and remember the line to be possibly
                                 # printed.
}
END {                            # in the end
  if(!flag) print                # print the last line if it was not flagged
}
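
As a quick check, here is a variant of the same one-line look-behind idea wrapped in a shell function. It is a sketch rather than the code above: it keeps the held line in an explicit variable and prints that in END, so it does not depend on `$0` still holding the last record there. The function names and the sample paths are hypothetical.

```shell
# Hypothetical ldd-style sample: the middle entry is not a dynamic executable.
sample_input() {
  printf '/usr/lib/libfoo.so:\n\tlibc.so.6 => /lib/libc.so.6\n'
  printf '/etc/hosts:\n\tnot a dynamic executable\n'
  printf '/usr/lib/libbar.so:\n\tlibm.so.6 => /lib/libm.so.6\n'
}

# Hold each line back by one; a marker line discards itself and the held line.
drop_pair_awk() {
  awk '
    /^\tnot a dynamic executable/ { held = 0; next }  # drop marker and held line
    held { print prev }                               # held line is safe to print
    { prev = $0; held = 1 }                           # hold the current line back
    END { if (held) print prev }                      # flush the last held line
  '
}

sample_input | drop_pair_awk
```

On the sample, only the libfoo and libbar entries remain; the /etc/hosts header and its marker line are both gone.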

But sed is more fun:

sed ':a; $! { N; /\n\tnot a dynamic executable/ d; P; s/.*\n//; ba }' filename

Explanation:

:a                                  # jump label

$! {                                # unless we reached the end of the input:

  N                                 # fetch the next line, append it

  /\n\tnot a dynamic executable/ d  # if the result contains a newline followed
                                    # by "\tnot a dynamic executable", discard
                                    # the pattern space and start at the top
                                    # with the next line. This effectively
                                    # removes the matching line and the one
                                    # before it from the output.

                                    # Otherwise:
  P                                 # print the pattern space up to the newline
  s/.*\n//                          # remove the stuff we just printed from
                                    # the pattern space, so that only the
                                    # second line is in it

  ba                                # and go to a
}
                                    # and at the end, drop off here to print
                                    # the last line (unless it was discarded).
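
To see the end-of-input behaviour described in the last comment, the loop can be run on two tiny made-up inputs: one where the marker is the final line, and one where a single line follows a removed pair. This assumes GNU sed (for `\t` and for `;` after a label); the function name is hypothetical.

```shell
# The sed loop from above, reading from standard input.
drop_pair_sed() {
  sed ':a; $! { N; /\n\tnot a dynamic executable/ d; P; s/.*\n//; ba }'
}

# Marker on the final line: the pair is discarded and nothing is left to print
# after "keep".
printf 'keep\nbad:\n\tnot a dynamic executable\n' | drop_pair_sed

# A single line after a removed pair: the block is skipped on the last line
# and the line is printed by the default end-of-script action.
printf 'bad:\n\tnot a dynamic executable\nlast\n' | drop_pair_sed
```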

Or, if the file is small enough to be held entirely in memory:

sed ':a $!{N;ba}; s/[^\n]*\n\tnot a dynamic executable[^\n]*\n//g' filename

Where:

:a $!{ N; ba }                                  # read the whole file into
                                                # the pattern space
s/[^\n]*\n\tnot a dynamic executable[^\n]*\n//g # and cut out the offending bit.

Answer 2

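Always remember that while grep and sed are line-oriented, awk is record-oriented and so can easily handle problems that span multiple lines.

This is a guess, since you didn't post any sample input and expected output, but it sounds like all you need is (using GNU awk for multi-char RS):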

awk -v RS='^$' -v ORS= '{gsub(/[^\n]+\n\tnot a dynamic executable[^\n]*\n?/,"")}1' file

Answer 3

This might work for you (GNU sed):

sed 'N;/\n.*not a dynamic executable/d;P;D' file

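This keeps a moving window of 2 lines and, if the required string is found in the second line, deletes them both. If not, it prints the first line and then deletes it, then appends the next line and repeats the process.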
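
A small sketch of the moving window in action, on a made-up input where two non-library files appear back to back. It assumes GNU sed, which prints the pattern space when N runs out of input; the function name and file names are hypothetical.

```shell
# Two-line moving window: N appends the next line, d drops both lines on a
# match, otherwise P prints the first line and D removes it from the window.
drop_pair_window() {
  sed 'N;/\n.*not a dynamic executable/d;P;D'
}

# Two consecutive non-library entries followed by one real library entry.
printf 'a.txt:\n\tnot a dynamic executable\nb.txt:\n\tnot a dynamic executable\nlibc.so:\n\tlinux-vdso.so.1\n' |
  drop_pair_window
```

Only the libc.so entry and its dependency line survive. In the asker's setup, the filter would sit directly after xargs ldd in the pipeline.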
