Comparing two files: if a string exists in file 1, print 2, and if not, print 1

Question

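I have two files to compare. If the string in the first column of file 2 also exists in file 1 (which has only one column), I want to print a 2 next to it. If it exists only in file 2, I want to print a 1 next to it. I want to keep all entries of file 2.

File 1: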

5131885
5751191

File 2:

5131885 1000019 -0.013936 0.0069218 -0.0048443 -0.0053688 0.0074161
5751191 1000046 -0.015001 0.0015263 0.00039903 0.0017072 -0.0021732
1668460 1000081 0.026323 0.0068929 0.0048965 0.0077047 0.0061728

File 3 (desired output):

5131885 2 1000019 -0.013936 0.0069218 -0.0048443 -0.0053688 0.0074161
5751191 2 1000046 -0.015001 0.0015263 0.00039903 0.0017072 -0.0021732
1668460 1 1000081 0.026323 0.0068929 0.0048965 0.0077047 0.0061728

I tried to do this with awk, but without success. This is as far as I got:

awk 'FNR==NR{arr[$1]=$2;next} ($1 in arr){print $0,arr[$1]}' file2 file1 > file3

But it does not add the extra column.

Answer 1

With your shown samples, please try the following awk code.

awk 'FNR==NR{arr[$0];next} {$1=$1 OFS ($1 in arr?2:1)} 1' file1 file2

Explanation: a detailed, line-by-line explanation of the above.

awk '              ##Starting awk program from here.
FNR==NR{           ##Checking condition FNR==NR, which is TRUE only while file1 is being read.
  arr[$0]          ##Creating an entry in arr indexed by the current line.
  next             ##next skips all further statements for this line.
}
{
  $1=($1 in arr)?$1 OFS 2:$1 OFS 1 ##If $1 from file2 is present in arr, append 2 to it, else append 1.
}
1                  ##Printing the current (edited) line here.
' file1 file2      ##Mentioning Input_file names here.

OR, a small variation of the above solution suggested by @Kaz in the comments; please try the following:

awk 'FNR==NR{arr[$0];next} $1=$1 OFS ($1 in arr?2:1)' file1 file2
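
As a quick check, the sketch below recreates the sample data from the question in a scratch directory and runs the first one-liner against it. The scratch directory and the `workdir` variable are assumptions made for illustration only; the file names `file1`, `file2` and `file3` follow the question.

```shell
# Recreate the sample inputs from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 5131885 5751191 > file1
cat > file2 <<'EOF'
5131885 1000019 -0.013936 0.0069218 -0.0048443 -0.0053688 0.0074161
5751191 1000046 -0.015001 0.0015263 0.00039903 0.0017072 -0.0021732
1668460 1000081 0.026323 0.0068929 0.0048965 0.0077047 0.0061728
EOF

# First pass (file1): remember every line as a key of arr.
# Second pass (file2): append 2 or 1 to the first field, then print the line.
awk 'FNR==NR{arr[$0];next} {$1=$1 OFS ($1 in arr?2:1)} 1' file1 file2 > file3

cat file3
```

With the sample data, file3 should match the desired output shown in the question, including the 1668460 line that is missing from file1.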

Answer 2

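Swap the order of file1 and file2 so that file1 is read first and memorized.

Extract the first field (k).

Remove the first field from the line (with sub).

Set n to 2 if k is in the memorized file1, otherwise to 1.

Print k, n, and the rest of the file2 line (without k).

awk 'FNR==NR{arr[$1]=$1;next} {k=$1; sub(/^[^ ]+ /, ""); n=1} (k in arr){n=2} {print k " " n " " $0}' file1 file2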
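
One practical difference between cutting the key off with sub() and assigning to $1 is how the rest of the line is treated: sub() leaves the remainder of $0 as it was, while changing a field makes awk rebuild $0 with OFS between fields. The sketch below illustrates this on a made-up line with three spaces before the last column; that line, the scratch directory and the `kept` and `rebuilt` variable names are assumptions for illustration, not part of the question's data.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 5131885 > file1
# Hypothetical file2 line with three spaces before the last column.
printf '%s\n' '5131885 1000019   -0.013936' > file2

# sub() style: strip the key from $0, so the remainder keeps its spacing.
kept=$(awk 'FNR==NR{arr[$1]=$1;next} {k=$1; sub(/^[^ ]+ /, ""); n=1} (k in arr){n=2} {print k " " n " " $0}' file1 file2)

# Field-assignment style: modifying $1 makes awk rebuild $0 using OFS.
rebuilt=$(awk 'FNR==NR{arr[$0];next} {$1=$1 OFS ($1 in arr?2:1)} 1' file1 file2)

printf '%s\n' "$kept" "$rebuilt"
```

Here the first printed line keeps the three spaces and the second has them collapsed to one. For single-space-separated input such as the question's sample, both approaches should give the same result. Note that the sub() pattern assumes a single space after the key, so tab-separated input would need a different pattern.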


awk

text-processing