Select the last match per group while using "Before" parameter

I have a text file that looks like this...

<title> south asia </title>
India is country that is part of south asia.
<title> africa </title>
kenya is a country that is part of africa.

This command works as expected and returns the correct category...

grep -B1 'kenya' wiki.txt | grep title

But if the text file looks like this, the trick no longer works...

<title> south asia </title>
India is country that is part of south asia.
<title> africa </title>
List of countries:
kenya is a country that is part of africa.

If I don't know the correct value for the "before" (-B) parameter, I get extra (wrong) titles.

# grep -B5 'kenya' wiki.txt | grep title
<title> south asia </title>
<title> africa </title>

Is it possible to select the last "title" of each group while using the -B parameter?

Expected: the line <title> africa </title> should be returned for the word "kenya", even if I don't know how many lines the article contains.

Here is a tac + awk solution, written and tested with your shown attempts and samples.

tac Input_file | 
awk -F'>[[:space:]]+|[[:space:]]*<' '
/kenya/{
  found=1
  next
}
found && /\<title\>/{
  print
  found=""
}
'

Explanation: a detailed, line-by-line explanation of the code above.

tac Input_file |                        ##Using tac to print file from bottom to top and sending its output as input to awk program.
awk -F'>[[:space:]]+|[[:space:]]*<' '   ##Starting awk program setting field separator to >[[:space:]]+ OR [[:space:]]*< here.
/kenya/{                                ##Checking condition if word kenya is found in line.
  found=1                               ##Setting found to 1 here.
  next                                  ##next will skip all further statements from here.
}
found && /\<title\>/{                   ##Checking if found is SET and it contains title.
  print                                 ##Printing current line.
  found=""                              ##Nullifying found here.
}
'

You can take the last line of the output:

grep -B5 'kenya' wiki.txt | grep title | tail -n 1
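Note that -B5 is still a guess here: if the title sits more than five lines above the match, nothing is printed, and tail -n 1 keeps only one title even when the word occurs in several articles. One possible way around both, sketched here under assumptions and not taken from the answers in this thread, is to first reduce the file to title lines and matching lines, so the title is always exactly one line above the match. The function name title_via_grep and the temporary sample file are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: same goal as the pipeline above, without guessing -B.
# Assumes a grep that supports -B (GNU and BSD grep do; it is not in POSIX).
title_via_grep() {
  # $1 = word to search for, $2 = file to read
  # Step 1: keep only title lines and lines matching the word.
  # Step 2: the title is now directly above the match, so -B1 is enough.
  # Step 3: drop the matching lines and any "--" group separators.
  grep -e '<title>' -e "$1" "$2" | grep -B1 -e "$1" | grep '<title>'
}

# Recreate the second sample from the question in a temporary file.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
<title> south asia </title>
India is country that is part of south asia.
<title> africa </title>
List of countries:
kenya is a country that is part of africa.
EOF

title_via_grep kenya "$sample"
```

On the sample this prints the africa title only, however many lines separate it from the match.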

With awk, simply store the title record in a variable (t) and print it whenever a line matching the search word (variable w) is encountered:

$ awk -vw='kenya' '/<title>/ {t=$0} $0~w {print t}' wiki.txt
<title> africa </title>
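To try the awk approach without the original file, a minimal sketch along these lines recreates the second sample from the question in a temporary file and wraps the one-liner in a function. The function name title_for and the temporary file are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the most recent <title> line seen
# before each line that matches the given word.
title_for() {
  # $1 = word to search for (treated as an awk regex), $2 = file to read
  awk -v w="$1" '/<title>/ {t=$0} $0 ~ w {print t}' "$2"
}

# Recreate the second sample from the question in a temporary file.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
<title> south asia </title>
India is country that is part of south asia.
<title> africa </title>
List of countries:
kenya is a country that is part of africa.
EOF

title_for kenya "$sample"
```

Because awk reads the file once from top to bottom, no context size has to be guessed. If the word matches in several articles, one title is printed per matching line.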