awk: show lines between two matches
I want to extract the lines of a file between <div class="AA"> and <div class="clear"></div>.
Regular-expression solutions using sed and grep are welcome too.
Update
Here is part of my huge XML file:
RUBBISH
RUBBISH
.
.
.
<div class="span9">
<div class="results-count">AAA</div>
<div class="AA">
<div class="A"><a href="/TEST">BBB</a>
</div>
<div class="BB"><span>CCC</span><br/><a href="/TEST1" class="B">DDD</a>
<div></div><span>EEE</span><br/><img src="TEST2" title="C"/><a href="/TEST3" class="D">FFF</a>,
<a href="/TEST4" class="E">GGG</a>
<div class="clear"></div><a href="/TEST5" class="details">Details</a>
</div>
<pre>HHH</pre>
<div class="clear"></div>
.
.
.
<div class="span9">
<div class="results-count">AAA</div>
<div class="AA">
<div class="A"><a href="/TEST">BBB</a>
</div>
<div class="BB"><span>CCC</span><br/><a href="/TEST1" class="B">DDD</a>
<div></div><span>EEE</span><br/><img src="TEST2" title="C"/><a href="/TEST3" class="D">FFF</a>,
<a href="/TEST4" class="E">GGG</a>
<div class="clear"></div><a href="/TEST5" class="details">Details</a>
</div>
<pre>HHH</pre>
<div class="clear"></div>
RUBBISH
RUBBISH
<div class="span9">
<div class="results-count">AAA</div>
<div class="AA">
<div class="A"><a href="/TEST">BBB</a>
</div>
<div class="BB"><span>CCC</span><br/><a href="/TEST1" class="B">DDD</a>
<div></div><span>EEE</span><br/><img src="TEST2" title="C"/><a href="/TEST3" class="D">FFF</a>,
<a href="/TEST4" class="E">GGG</a>
<div class="clear"></div><a href="/TEST5" class="details">Details</a>
</div>
<pre>HHH</pre>
<div class="clear"></div>
.
.
.
awk '/<div class="clear"><\/div>/{p=0} p{print} /<div class="results-count">/{p=1}' file
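The awk one-liner above works by toggling a flag variable p: the rule order decides whether the marker lines themselves are printed. As a minimal sketch of the same idea, here is a variant that starts at the <div class="AA"> line named in the question instead of the results-count line. The function name extract_with_awk is hypothetical and only wraps the command for reuse; on the sample above it should give the same lines either way.

```shell
# Hypothetical wrapper (name is an assumption, not from the original post).
# Usage: extract_with_awk file
extract_with_awk() {
    awk '
        # End marker: stop printing before this line is emitted.
        /<div class="clear"><\/div>/ { p = 0 }
        # Start marker: turn printing on, so this line is included.
        /<div class="AA">/           { p = 1 }
        # Print every line while the flag is set.
        p
    ' "$1"
}
```

Because the end-marker rule comes first and the bare p pattern comes last, the start line is printed and the end line is not. Moving the p rule between the two marker rules would exclude both.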
With grep:
$ grep -ozP '(?s)(?:\n|^)\s*<div class="results-count">[^\n]*\n\K.*?(?=\n\s*<div class="clear"></div>)' file
<div class="AA">
<div class="A"><a href="/TEST">BBB</a>
</div>
<div class="BB"><span>CCC</span><br/><a href="/TEST1" class="B">DDD</a>
<div></div><span>EEE</span><br/><img src="TEST2" title="C"/><a href="/TEST3" class="D">FFF</a>,
<a href="/TEST4" class="E">GGG</a>
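Since the question also welcomed sed, here is a rough sed sketch using an address range. It assumes the start marker is the <div class="AA"> line and that the first following line containing <div class="clear"></div> ends the block; the function name extract_with_sed is hypothetical. The inner !p keeps the end-marker line out of the output, so each block should print the same lines as the grep output above.

```shell
# Hypothetical wrapper (name is an assumption, not from the original post).
# Usage: extract_with_sed file
extract_with_sed() {
    # -n suppresses automatic printing; inside the range, print every
    # line except the one that matches the end marker.
    sed -n '/<div class="AA">/,/<div class="clear"><\/div>/{
        /<div class="clear"><\/div>/!p
    }' "$1"
}
```

Dropping the inner condition and using a plain p would print the range inclusively, which also emits the <div class="clear"></div> line that closes each block.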