Why does my grep command output "--" between some lines?

I have a fasta file, like the test file here:

>HWI-D00196:168:C66U5ANXX:3:1106:16404:19663 1:N:0:GCCAAT
CCTAGCACCATGATTTAATGTTTCTTTTGTACGTTCTTTCTTTGGAAACTGCACTTGTTGCAACCTTGCAAGCCATATAAACACATTTCAGATATAAGGCT
>HWI-D00196:168:C66U5ANXX:3:1106:16404:19663 2:N:0:GCCAAT
AAAACATAAATTTGAGCTTGACAAAAATTAAAAATGAGCCCAGCCTTATATCTGAAATGTGTTTATATGGCTTGCAAGGTTGCAACAAGTGCAGTTTCCAA
>HWI-D00196:168:C66U5ANXX:4:1304:10466:100132 1:N:0:GCCAAT
ATATTTGAATTATCAGAAATAAACACAAAGAAAACCTAGAACAGATAATTTCTTCCACATTATTGATCAGATACAGATTTCAAGGGTACCGTTGTGAATTG
>HWI-D00196:168:C66U5ANXX:4:1304:10466:100132 2:N:0:GCCAAT
AAACGATTGATAGATCTATTTGCATTATAAAAACATTAAAAAAACAAAATACTGATTAAATGTCGTCTTTCTATTCCACAATTTTATAGATCTCACTGTAT
>HWI-D00196:168:C66U5ANXX:4:1307:12056:64030 1:N:0:GCCAAT
CTTACTTTGCCTCTCTCAGCCAATGTCTCCTGAGTCTAATTTTTTGGAGGCTAAGCTATGAGCTAATGATGGGTTCCATTTGGGGCCAATGCTTCAGCCTG
>HWI-D00196:168:C66U5ANXX:4:1307:12056:64030 2:N:0:GCCAAT
CTATTAGTTCTTATCTTTGCCTGCAAATATAAGACTAGCGCTTGAGTAGCTGACAGAGACAAAGTAAGCTGGAGTGTTTATCACCTGGTCACTCCAATTGT

When I run a simple grep command:

grep -B1 "CTT" test.fasta

I get a very strange output, where "--" is sometimes placed on its own line above a grep hit, as shown here:

>HWI-D00196:168:C66U5ANXX:4:1304:10466:100132 2:N:0:GCCAAT
AAACGATTGATAGATCTATTTGCATTATAAAAACATTAAAAAAACAAAATACTGATTAAATGTCGTCTTTCTATTCCACAATTTTATAGATCTCACTGTAT
--
>HWI-D00196:168:C66U5ANXX:4:1307:12056:64030 2:N:0:GCCAAT
CTATTAGTTCTTATCTTTGCCTGCAAATATAAGACTAGCGCTTGAGTAGCTGACAGAGACAAAGTAAGCTGGAGTGTTTATCACCTGGTCACTCCAATTGT

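I don't understand why some fasta entries have this and some don't. When I remove -B1, I don't run into this problem. I could use a grep -v "--" statement to remove these lines from my file, but I'd really like to understand what is happening here.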

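You are using the -B1 option to request one line of leading context. This means grep prints each matching line together with the line before it. Each match, with its context, is separated from the next by a line containing only --, as in: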

$ man grep | grep -B1 context
     -A num, --after-context=num
             Print num lines of trailing context after each match.  See also
--
     -B num, --before-context=num
             Print num lines of leading context before each match.  See also
--
     -C[num, --context=num]
             Print num lines of leading and trailing context surrounding each
--
     --context[=num]
             Print num lines of leading and trailing context.  The default is

The reason you don't see -- between every match is that context is only shown above a run of consecutive matches. Consider the following example:

$ seq 13 | grep -B1 1
1
--
9
10
11
12
13

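The seq command generates all the numbers from 1 to 13. Only the first line and the lines from 10 onward contain a 1, so you see 1 in a group of its own, then --, then one line of context (9), followed by a run of consecutive matching lines.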
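If the goal is simply to get rid of the separator lines, here is a minimal sketch using the same seq example. The filter shown is plain grep and drops any line that consists of exactly --, so it would also drop a genuine data line that reads -- (unlikely in a fasta file). The commented-out alternative assumes GNU grep, which documents a --no-group-separator option; the BSD grep whose man page is quoted above may not accept it. The variable names are for illustration only.

```shell
# Default behaviour: non-adjacent groups are separated by a line containing "--".
with_sep=$(seq 13 | grep -B1 1)

# Filter the separators out afterwards. -x matches whole lines only, and -e
# keeps "--" from being parsed as the end-of-options marker, which is why a
# bare grep -v "--" does not do what it looks like.
filtered=$(seq 13 | grep -B1 1 | grep -v -x -e '--')

# Assumption: GNU grep. It can skip the separators directly:
# seq 13 | grep -B1 --no-group-separator 1

printf '%s\n' "$filtered"
```

With the separator removed, the output no longer shows where one group ends and the next begins, so this is only appropriate when every match carries the same amount of context (for example, one header line per sequence line).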

The GREP_COLORS section of the grep man page says:

Specifies the colors and other attributes used to highlight various parts of the output. Its value is a colon-separated list of capabilities that defaults to ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36 with the rv and ne boolean capabilities omitted (i.e., false).

se=36
SGR substring for separators that are inserted between selected line fields (:), between context line fields, (-), and between groups of adjacent lines when nonzero context is specified (--). The default is a cyan text foreground over the terminal's default background.

Consider the file sample.txt:

$ cat sample.txt
ABBB
AAB
AAB
S
S
S
AABB
ABAA
BAA
CCC
$ grep -B2 'AAB' sample.txt
ABBB
AAB
AAB
--
S
S
AABB

Here, -- is grep's way of telling you that the AAB before the -- and the S after it are not adjacent lines in the actual file.
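One way to make the gap visible is to add -n, which prefixes every output line with its line number. The sketch below recreates sample.txt in a temporary file (the mktemp path and the variable names are illustrative only) and reruns the same search. In line with the separator description quoted from the man page, matching lines get a : after the number and context lines get a -.

```shell
# Recreate sample.txt from the example above in a temporary file.
sample=$(mktemp)
printf '%s\n' ABBB AAB AAB S S S AABB ABAA BAA CCC > "$sample"

# -n adds line numbers: "N:" marks a matching line, "N-" a context line.
numbered=$(grep -n -B2 'AAB' "$sample")
printf '%s\n' "$numbered"
# 1-ABBB
# 2:AAB
# 3:AAB
# --
# 5-S
# 6-S
# 7:AABB

rm -f "$sample"
```

Line 4 is missing from the output, so the -- marks the jump from line 3 to line 5.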