Getting a row from a .anno file only when the value of a column is present in a text file

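I'm really new to scripting and to Stack Overflow, so I apologize if my question is silly or in the wrong place.

I have to do this task in Bash.

I have a DATA.anno file like this: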

ID POP LOCALITY
1  Apu Italy
2  Apu Italy
3  Tir Albania
4  Tir Albania
5  Ber Germany
6  Ber Germany

I also have a pop.txt file containing two of the population names that appear in the second column of the previous file:

Apu
Ber

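Now I want to produce another file that contains only the rows for the populations listed in pop.txt. In this case, the output file I want looks like this: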

ID POP LOCALITY
1  Apu Italy
2  Apu Italy
5  Ber Germany
6  Ber Germany

I tried this script, but it doesn't seem to work:

cat pop.txt | while read line; do grep $line DATA.anno | cut -f 2,3 >> outputfile.txt

Could you please try the following.

awk 'BEGIN{print "ID POP LOCALITY"} FNR==NR{array[$0];next} ($2 in array)'   pop.txt DATA.anno

Explanation: Adding a detailed explanation of the code above.

awk '                         ##Starting awk program from here.
BEGIN{                        ##Starting BEGIN section from here.
  print "ID POP LOCALITY"     ##Printing headers here.
}
FNR==NR{                      ##Checking condition FNR==NR which will be TRUE when first Input_file is being read.
  array[$0]                   ##Creating array with index of current line.
  next                        ##next will skip all further statements from here.
}
($2 in array)                 ##Checking condition if current line 2nd field is present in array then print that line.
'   pop.txt DATA.anno         ##Mentioning Input_file names here.
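
If you want to try this end to end, here is a minimal sketch that recreates the sample files from the question and writes the result to a file. The scratch directory and the name outputfile.txt are only assumptions for the demo. It is also a small variation on the command above: instead of hard-coding the header in BEGIN, it keeps the first line of DATA.anno (FNR==1), which may be handy if the header ever changes. It assumes pop.txt is not empty, since FNR==NR would otherwise also be true for the second file.

```shell
#!/usr/bin/env bash
# Sketch only: the scratch directory and outputfile.txt are illustrative choices.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample DATA.anno from the question.
cat > DATA.anno <<'EOF'
ID POP LOCALITY
1  Apu Italy
2  Apu Italy
3  Tir Albania
4  Tir Albania
5  Ber Germany
6  Ber Germany
EOF

# Recreate pop.txt with the two populations to keep.
printf 'Apu\nBer\n' > pop.txt

# While reading pop.txt (FNR==NR): remember each population name.
# While reading DATA.anno: print the header line (FNR==1) and every row
# whose 2nd field was seen in pop.txt.
awk 'FNR==NR{array[$0];next} FNR==1 || ($2 in array)' pop.txt DATA.anno > outputfile.txt

cat outputfile.txt
```

With the sample data, outputfile.txt should contain the header plus the Apu rows (IDs 1 and 2) and the Ber rows (IDs 5 and 6).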