Introduce a new line before every row containing a specific word in Linux
I am new to Linux. I have a tab-delimited text file like the one below:
A1 title body.1 gene
A1 head head.1 head
A1 trunk trunk.1 trunk
A1 tail tail.1 tail
A2 title body.2 gene
A2 head head.2 head
A2 trunk trunk.2 trunk
A2 tail tail.2 tail
A3 title body.3 gene
A3 head head.3 head
A3 trunk trunk.3 trunk
A4 title title.4 gene
A4 trunk trunk.4 trunk
A4 tail tail.4 tail
I would like to introduce a new line before every row containing the word "gene" in the last column, as shown below:
A1 title body.1 gene
A1 head head.1 head
A1 trunk trunk.1 trunk
A1 tail tail.1 tail

A2 title body.2 gene
A2 head head.2 head
A2 trunk trunk.2 trunk
A2 tail tail.2 tail

A3 title body.3 gene
A3 head head.3 head
A3 trunk trunk.3 trunk

A4 title title.4 gene
A4 trunk trunk.4 trunk
A4 tail tail.4 tail
I tried the following command:
sed 's/gene/\
\n&\g' file.txt
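But it introduces a new line after the rows containing the word "gene".
It would be great if someone could guide me on how to introduce a new line before the rows containing the word "gene" in the last column.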
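Use a backreference, so that the matched line is put back after the newline (\n in the replacement is a GNU sed extension):
sed 's/\(^.*gene\)/\n\1/g' file.txt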
Just check whether the last field is gene. If so, print an empty line:
awk '$NF=="gene" {print ""}1' file
This returns:
$ awk '$NF=="gene" {print ""}1' file

A1 title body.1 gene
A1 head head.1 head
A1 trunk trunk.1 trunk
A1 tail tail.1 tail

A2 title body.2 gene
A2 head head.2 head
A2 trunk trunk.2 trunk
A2 tail tail.2 tail

A3 title body.3 gene
A3 head head.3 head
A3 trunk trunk.3 trunk

A4 title title.4 gene
A4 trunk trunk.4 trunk
A4 tail tail.4 tail
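If the empty line before the very first row is not wanted, one possible variation of the awk approach is to skip record 1. This is only a sketch: the function name blank_before_gene and the file name sample.tsv are made up for illustration, and it assumes the file really is tab-separated.

```shell
# Hypothetical helper: print the file, adding an empty line before every
# row whose last tab-separated field is exactly "gene", except on row 1.
blank_before_gene() {
    # -F'\t' splits fields on tabs only; NR>1 skips the first record,
    # so the output does not start with an empty line.
    awk -F'\t' 'NR>1 && $NF=="gene" {print ""} 1' "$1"
}

# Build a small tab-separated sample (sample.tsv is an assumed name).
printf 'A1\ttitle\tbody.1\tgene\n'  >  sample.tsv
printf 'A1\thead\thead.1\thead\n'   >> sample.tsv
printf 'A2\ttitle\tbody.2\tgene\n'  >> sample.tsv
printf 'A2\ttail\ttail.2\ttail\n'   >> sample.tsv

blank_before_gene sample.tsv
```

The trailing 1 is the usual awk shorthand for "print the current line", so every input row is still printed unchanged.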
You probably want something like this (extended regular expression syntax):
$ sed -r 's/(^.*?\tgene$)/\n\1/' example

A1 title body.1 gene
A1 head head.1 head
A1 trunk trunk.1 trunk
A1 tail tail.1 tail

A2 title body.2 gene
A2 head head.2 head
A2 trunk trunk.2 trunk
A2 tail tail.2 tail

A3 title body.3 gene
A3 head head.3 head
A3 trunk trunk.3 trunk

A4 title title.4 gene
A4 trunk trunk.4 trunk
A4 tail tail.4 tail
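In this expression you can see:
- the substitute command: 's/.../.../'
- a group capturing the whole line that ends with a tab followed by gene: (^.*?\tgene$)
- a newline and the previously captured group (the first and only one) inserted into the result: \n\1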
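Note that there is a problem with your question:
"I would like introduce a new line before every row containing word "gene" in the last column"
This leads to the assumption that you need the first line of the result to be empty (or, to be precise, a newline), but the first row of your example clearly has no empty line before it.
If that is really what you need, you should use sed addressing: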
$ sed -r '2,$s/(^.*?\tgene$)/\n\1/' example
A1 title body.1 gene
A1 head head.1 head
A1 trunk trunk.1 trunk
A1 tail tail.1 tail

A2 title body.2 gene
A2 head head.2 head
A2 trunk trunk.2 trunk
A2 tail tail.2 tail

A3 title body.3 gene
A3 head head.3 head
A3 trunk trunk.3 trunk

A4 title title.4 gene
A4 trunk trunk.4 trunk
A4 tail tail.4 tail
In sed you can use the insert command i:
sed '2,${/[\t ]gene$/i\
;}' file
The 2,$ address is used to prevent a leading newline from being added at the start.
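For comparison, the same 2,$ addressing can be combined with the hold space instead of the i command. This is a different technique from the one above, sketched under the assumption that the file is tab-separated; the function name blank_before_gene_sed and the file name sample2.tsv are hypothetical.

```shell
# Hypothetical helper: add an empty line before every row (from row 2 on)
# that ends with a tab followed by "gene".
blank_before_gene_sed() {
    # Put a literal tab into the pattern so the script does not rely on
    # sed understanding \t.
    tab=$(printf '\t')
    # x;p;x swaps in the (empty) hold space, prints it as an empty line,
    # then swaps the original row back so it is printed as usual.
    sed "2,\${/${tab}gene\$/{x;p;x;};}" "$1"
}

# Small tab-separated sample (sample2.tsv is an assumed name).
printf 'A1\ttitle\tbody.1\tgene\n'  >  sample2.tsv
printf 'A1\thead\thead.1\thead\n'   >> sample2.tsv
printf 'A2\ttitle\tbody.2\tgene\n'  >> sample2.tsv
printf 'A2\ttail\ttail.2\ttail\n'   >> sample2.tsv

blank_before_gene_sed sample2.tsv
```

Because nothing is ever stored in the hold space, it stays empty for the whole run, which is what makes the printed line blank.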