Getting a row from a .anno file only when the value of a column is present in a text file
I am really new to scripting and to Stack, so I apologize if my question is silly or in the wrong place.
I have to do this task in Bash.
I have a DATA.anno file like this:
ID POP LOCALITY
1 Apu Italy
2 Apu Italy
3 Tir Albania
4 Tir Albania
5 Ber Germany
6 Ber Germany
I also have a pop.txt file containing two of the population names that appear in the second column of the previous file:
Apu
Ber
Now I want to produce another file containing only the rows whose population is listed in pop.txt. In this case, the output file I want is:
ID POP LOCALITY
1 Apu Italy
2 Apu Italy
5 Ber Germany
6 Ber Germany
I tried this script, but it does not seem to work:
cat pop.txt | while read line; do grep $line DATA.anno | cut -f 2,3 >> outputfile.txt
Could you please try the following.
awk 'BEGIN{print "ID POP LOCALITY"} FNR==NR{array[$0];next} ($2 in array)' pop.txt DATA.anno
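To check the command against the sample data from the question, here is a minimal sketch that recreates both input files and writes the result to outputfile.txt (the output name used in the question). It assumes the columns are separated by spaces or tabs and that pop.txt holds one name per line with no trailing whitespace.

```shell
#!/usr/bin/env bash
# Recreate the sample input files from the question.
cat > DATA.anno <<'EOF'
ID POP LOCALITY
1 Apu Italy
2 Apu Italy
3 Tir Albania
4 Tir Albania
5 Ber Germany
6 Ber Germany
EOF

printf '%s\n' Apu Ber > pop.txt

# Pass 1 (pop.txt): store each line as an array index.
# Pass 2 (DATA.anno): print rows whose 2nd field is one of those indexes.
# The header is printed once from the BEGIN block.
awk 'BEGIN{print "ID POP LOCALITY"} FNR==NR{array[$0];next} ($2 in array)' pop.txt DATA.anno > outputfile.txt

cat outputfile.txt
```

Note that file names are case-sensitive on most Linux filesystems, so DATA.anno has to be spelled exactly as the file is named on disk.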
Explanation: the same command with detailed comments.
awk '                        ##Start the awk program.
BEGIN{                       ##Start the BEGIN section.
  print "ID POP LOCALITY"    ##Print the header line.
}
FNR==NR{                     ##FNR==NR is TRUE only while the first Input_file (pop.txt) is being read.
  array[$0]                  ##Create an array entry indexed by the current line.
  next                       ##Skip all further statements for this line.
}
($2 in array)                ##If the 2nd field of the current line is an index in array, print that line.
' pop.txt DATA.anno          ##Input_file names.
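If you would rather stay close to the loop from the question, here is a rough sketch of how it might be repaired. As posted, the loop is missing its closing `done`, `cut -f 2,3` drops the ID column (and expects tab-separated input), and a bare `grep $line` matches the name anywhere on the line rather than only in the second column. The sketch assumes whitespace-separated columns and reuses the file names from the question; the sample files are recreated only so that it runs on its own.

```shell
#!/usr/bin/env bash
# Recreate the sample input files from the question.
cat > DATA.anno <<'EOF'
ID POP LOCALITY
1 Apu Italy
2 Apu Italy
3 Tir Albania
4 Tir Albania
5 Ber Germany
6 Ber Germany
EOF

printf '%s\n' Apu Ber > pop.txt

# Copy the header once, then append the matching rows for each population.
head -n 1 DATA.anno > outputfile.txt
while IFS= read -r pop; do
    [ -n "$pop" ] || continue    # skip blank lines in pop.txt
    # Compare the name against column 2 only, as an exact string match.
    awk -v p="$pop" 'FNR > 1 && $2 == p' DATA.anno >> outputfile.txt
done < pop.txt

cat outputfile.txt
```

This version rereads DATA.anno once per name and groups the rows in pop.txt order rather than file order, so the single-pass awk command is likely the better choice for large files.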