Find lines from one file that do not appear (even partially) in another file

Question

I need a Unix shell command to find the lines in file 1 that do not appear at all in file 2. For example:

File 1:

aaa
bbb

File 2:

aaaccc
bb

Expected output:

bbb

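(The "aaa" from file1 does appear in file2, as part of the larger string "aaaccc".)

I can't use "comm" because it only works on complete lines. In this case I also want to exclude lines where file2 contains the file1 line as part of a larger string, as shown above.

Note that I would prefer a fast method if one exists, since my files are very large.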

Answer 1

One in awk; mawk is probably the fastest implementation, so use that:

$ awk '
NR==FNR {                # process file1
    a[$0]                # hash all records to memory
    next                 # process next record
}
{                        # process file2
    for(i in a)          # for each file1 entry in memory
        if($0 ~ i)       # see if it is found in current file2 record
            delete a[i]  # and delete if found
}
END {                    # in the end
    for(i in a)          # all left from file1
        print i          # are output
}' file1 file2           # mind the order

Output:

bbb
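One caveat: `$0 ~ i` treats each file1 line as a regular expression, so a line containing characters such as `.` or `*` may match more (or less) than its literal text. If the file1 lines should be taken as plain strings, a possible variant swaps the regex match for awk's standard `index()` function. The sketch below is an assumption about what is wanted, not part of the original answer; the scratch directory and the `result` variable are only there to make it reproducible with the sample data from the question.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf 'aaa\nbbb\n' > "$tmpdir/file1"
printf 'aaaccc\nbb\n' > "$tmpdir/file2"

result=$(awk '
NR==FNR {                      # first file: remember every line
    a[$0]
    next
}
{                              # second file: check each remembered line
    for (i in a)
        if (index($0, i))      # literal substring test, no regex interpretation
            delete a[i]        # found inside this file2 line, so drop it
}
END {
    for (i in a)               # whatever survives never appeared in file2
        print i
}' "$tmpdir/file1" "$tmpdir/file2")

echo "$result"
rm -rf "$tmpdir"
```

As in the original, the order of the lines printed from `for (i in a)` is unspecified in awk, so pipe the result through `sort` if a stable order matters.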
