Print the missing words and the file name - linux

Question

I have two files in the following format:

File 1:

India 215.0
country 165.0
Indian 163.0
s 133.0
Maoist 103.0
Nepal 89.0
group 85.0
Kathmandu 85.0

File 2:

Nepal 89.0
would 88.0
Kathmandu 85.0
rule 82.0
king 80.0
parliament 79.0
card 79.0

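I want to print the words that appear in one file but not in the other. The file in which each word was found should be printed next to the word. For example, I would like the output to be: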

India 215.0, file 1
country 165.0, file 1
group 85.0, file 1
....
....
would 88.0, file 2

I tried using:

grep -v file1 file2

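This gives me the words that are not in file2, but I want the words that are in file1 and not in file2, and vice versa, along with the name of the file each word came from. How can I do this? Please help!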

Answer 1

# print out all the rows only in file2 and append filename
$ awk 'NR==FNR{a[$1]++;next} !($1 in a){print $0, FILENAME}' file1 file2
would 88.0 file2
rule 82.0 file2
king 80.0 file2
parliament 79.0 file2
card 79.0 file2

# print all the rows only in file1 and append filename
$ awk 'NR==FNR{a[$1]++;next} !($1 in a){print $0, FILENAME}' file2 file1
India 215.0 file1
country 165.0 file1
Indian 163.0 file1
s 133.0 file1
Maoist 103.0 file1
group 85.0 file1

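The default field separator is whitespace, so $1 is the first column (the word). NR==FNR is true only while the first file is being read, so that block stores every word of the first file in the array a; for the second file, any line whose word is not in a is printed together with FILENAME.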
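The two commands above can also be combined into one run that prints both directions with the ", file 1" / ", file 2" labels asked for in the question. This is only a sketch, not part of the original answer: the helper name only_in_one, the temporary directory, and the trick of passing each file twice are assumptions made for illustration. It compares on the first column only and assumes neither file is empty (an empty file would throw off the pass counter).

```shell
#!/bin/sh
# Demo inputs copied from the question, written to a throwaway directory.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
India 215.0
country 165.0
Indian 163.0
s 133.0
Maoist 103.0
Nepal 89.0
group 85.0
Kathmandu 85.0
EOF
cat > "$dir/file2" <<'EOF'
Nepal 89.0
would 88.0
Kathmandu 85.0
rule 82.0
king 80.0
parliament 79.0
card 79.0
EOF

# only_in_one FILE_A FILE_B  (hypothetical helper name)
# Each file is read twice: passes 1 and 2 collect the words,
# passes 3 and 4 print the lines whose word is missing from the other file.
only_in_one() {
    awk '
        FNR == 1  { pass++ }                      # a new input file starts
        pass == 1 { first[$1] = 1; next }         # words seen in FILE_A
        pass == 2 { second[$1] = 1; next }        # words seen in FILE_B
        pass == 3 && !($1 in second) { print $0 ", file 1" }
        pass == 4 && !($1 in first)  { print $0 ", file 2" }
    ' "$1" "$2" "$1" "$2"
}

only_in_one "$dir/file1" "$dir/file2"
```

With the sample data this prints the six file1-only lines (India, country, Indian, s, Maoist, group) followed by the five file2-only lines (would, rule, king, parliament, card); Nepal and Kathmandu appear in both files and are skipped.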
