Match column elements of two files and replace them with the matched line using AWK/Perl

Question

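I have two files, each with 3 columns. I want to match the elements of column 3 of file1 against column 3 of file2. If there is a match, the entire line of file1 should be replaced with the corresponding matching line from file2; otherwise, move on to the next line.

Here is an example: in file2, the column 3 elements reg[2] and reg[9][9] also appear in column 3 of file1. The corresponding lines of file1 are therefore replaced with the lines from file2.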

File 1:

Nancy Owen reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Nancy Owen reg[9][9]
Nancy Owen reg[54]

File 2:

Done Approval reg[9][9]
Nancy Owen reg[10_8]
Nancy Owen reg[4][10]
Done Approval reg[2]

Expected output:

Done Approval reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Done Approval reg[9][9]
Nancy Owen reg[54]

Code attempted:

awk -F, 'NR==FNR{a[$3]=$0;next;}a[$3]{$0=a[$3]}1' file2 file1

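I am still new to one-liner awk commands, and I am surely doing something wrong in the code above. What I am trying to do is use column 3 as the key and the entire line as the value. If the key is present in column 3 of file1, replace the current line of file1 with the stored value from file2. Otherwise skip it and move on to the next line.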
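A minimal sketch of what likely goes wrong in the attempt, assuming the sample files are space-separated exactly as shown in the question. With `-F,` the lines contain no comma, so each line is a single field and `$3` is always empty; every line of file2 is then stored under the same empty key. The scratch directory and the shell variable names (`tmpdir`, `fields_with_comma`, `fixed`) are illustrative only. The second command drops `-F,` and uses an explicit `$3 in a` membership test.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Nancy Owen reg[2]' 'Nancy Owen reg[4_8]' 'Nancy Owen reg[7]' \
    'Nancy Owen reg[9][9]' 'Nancy Owen reg[54]' > "$tmpdir/file1"
printf '%s\n' 'Done Approval reg[9][9]' 'Nancy Owen reg[10_8]' \
    'Nancy Owen reg[4][10]' 'Done Approval reg[2]' > "$tmpdir/file2"

# With -F, every line is one field, so $3 is always the empty string.
fields_with_comma=$(awk -F, '{print NF}' "$tmpdir/file1" | sort -u)
printf 'fields per line with -F,: %s\n' "$fields_with_comma"

# Same idea with default whitespace splitting and an explicit membership test.
fixed=$(awk 'NR==FNR{a[$3]=$0; next} $3 in a{$0=a[$3]} 1' \
    "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$fixed"

rm -r "$tmpdir"
```

Using `$3 in a` rather than the bare pattern `a[$3]` also avoids creating empty array entries for keys that were never stored.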

Answer 1

I would use GNU AWK for this task in the following way. Let file1.txt content be

Nancy Owen reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Nancy Owen reg[9][9]
Nancy Owen reg[54]

and file2.txt content be

Done Approval reg[9][9]
Nancy Owen reg[10_8]
Nancy Owen reg[4][10]
Done Approval reg[2]

then

awk 'FNR==NR{arr[$3]=$0;next}{print($3 in arr?arr[$3]:$0)}' file2.txt file1.txt

output

Done Approval reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Done Approval reg[9][9]
Nancy Owen reg[54]

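Explanation: start by processing file2.txt, storing each line in the array arr under the key given by the 3rd column ($3), and doing nothing else for those lines (hence the use of next). Then process file1.txt: if the 3rd column value is among the keys of arr ($3 in arr), print the corresponding stored value, otherwise print the current line ($0). For that I used the so-called ternary operator condition?valueiftrue:valueiffalse

(tested in GNU Awk 5.0.1)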
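The same logic can be sketched as a multi-line awk program with one comment per step, which may be easier to follow than the one-liner. This is only an illustration: the scratch directory and the shell variable names (`tmpdir`, `result`) are assumptions, and the files are recreated from the sample data above.

```shell
# Recreate file1.txt and file2.txt from the sample data in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Nancy Owen reg[2]' 'Nancy Owen reg[4_8]' 'Nancy Owen reg[7]' \
    'Nancy Owen reg[9][9]' 'Nancy Owen reg[54]' > "$tmpdir/file1.txt"
printf '%s\n' 'Done Approval reg[9][9]' 'Nancy Owen reg[10_8]' \
    'Nancy Owen reg[4][10]' 'Done Approval reg[2]' > "$tmpdir/file2.txt"

result=$(awk '
    # While reading the first file named on the command line (file2.txt),
    # the per-file counter FNR equals the global counter NR.
    FNR == NR {
        arr[$3] = $0   # key: third column, value: the whole line
        next           # skip the printing rule below for these lines
    }
    # For the second file (file1.txt): print the stored line when the
    # third column is a known key, otherwise print the line unchanged.
    {
        print ($3 in arr ? arr[$3] : $0)
    }
' "$tmpdir/file2.txt" "$tmpdir/file1.txt")
printf '%s\n' "$result"

rm -r "$tmpdir"
```

Note that the file holding the replacement lines must come first on the command line, since its lines have to be in the array before file1.txt is read.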

Answer 2

Noting the perl tag, here is a Perl solution:

perl -ane 'if ($eof) {
               if (exists $h{ $F[2] }) {
                   print $h{ $F[2] }
               } else { print }
           } else {
               $h{ $F[2] } = $_;
               $eof = 1 if eof;
           }' -- file2 file1

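-n reads the input line by line, running the code for each line;
-a splits each line on whitespace into the @F array;
we set the variable $eof at the end of the first file, i.e. file2;
while reading the first file (file2), we store each line in a hash keyed by the third column;
while reading the second file (file1), we check whether the hash contains a line for the third column: if it does, we print that line, otherwise we print the current line.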
