How to compare multiple columns and print the non-matching columns using a Unix shell script

Question

I have two files.

File1
col1,col2,col3,col4,col5,col6,col7
col11,col12,col13,col14,col15,col16,col17

File2
col1,col2,col03,col4,col5,col06,col7
col11,col12,col13,col14,col015,col16,col17

I want to compare the second file with the first file line by line, then column by column, and print messages like:

On line number 1, field#3 is different(col03)
On line number 1, field#6 is different(col06)
On line number 2, field#5 is different(col015)

It should also be possible to skip specific columns. Thanks.

Edit: something like this script

$ awk -F"," 'FNR==NR {a[$0]; next} $0 in a' f1 f2

    FNR==NR is true while the first file is being read.
    {a[$0]; next} stores the lines of the first file in a[] and goes to the next line.
    $0 in a is evaluated when looping through the second file. It checks whether the current line is in the a[] array.

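gives me lines from the second file. But I don't want to compare whole lines; I want to compare specific columns of file 1 against file 2.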

Answer 1

Using GNU awk:

$ gawk -F, '                          # set delimiter
NR==FNR {                             # process file1
    for(i=1;i<=NF;i++)                # iterate all fields
        a[FNR][i]=$i                  # hash to 2D array
    next
}
{                                     # process file2
    for(i=1;i<=NF;i++)                # iterate fields
        if($i!=a[FNR][i])             # compare and output if needed
            printf "On line number %d, field#%d is different(%s)\n",FNR,i,$i
}' file1 file2

Output:

On line number 1, field#3 is different(col03)
On line number 1, field#6 is different(col06)
On line number 2, field#5 is different(col015)

This solution expects both files to have the same number of records and the same number of fields per record.
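If the two files might not line up, the comparison can report the mismatch instead of silently ignoring it. The following is only a sketch under assumptions: the function name compare_fields and the "exists only in" messages are made up for illustration and are not part of the answer above. It uses POSIX-style a[FNR, i] keys instead of gawk's a[FNR][i], so it should also run on non-GNU awk.

```shell
#!/bin/sh
# compare_fields FILE1 FILE2
# Hypothetical helper (name and extra messages are illustrative assumptions).
# Caveat: the NR==FNR idiom misfires if FILE1 is empty.
compare_fields() {
    awk -F, '
    NR==FNR {                            # first file: remember every field
        nf[FNR] = NF                     # field count of this line
        for (i = 1; i <= NF; i++)
            a[FNR, i] = $i               # key is "line SUBSEP field"
        lines1 = FNR
        next
    }
    {                                    # second file
        lines2 = FNR
        if (FNR > lines1) {              # no such line in the first file
            printf "On line number %d, line exists only in file2\n", FNR
            next
        }
        max = (NF > nf[FNR]) ? NF : nf[FNR]
        for (i = 1; i <= max; i++) {
            if (i > NF)                  # field present only in file1
                printf "On line number %d, field#%d exists only in file1(%s)\n", FNR, i, a[FNR, i]
            else if (i > nf[FNR])        # field present only in file2
                printf "On line number %d, field#%d exists only in file2(%s)\n", FNR, i, $i
            else if ($i != a[FNR, i])    # both present but different
                printf "On line number %d, field#%d is different(%s)\n", FNR, i, $i
        }
    }
    END {                                # lines left over in the first file
        for (n = lines2 + 1; n <= lines1; n++)
            printf "On line number %d, line exists only in file1\n", n
    }' "$1" "$2"
}
```

Called as compare_fields file1 file2, it prints the same "is different" messages as above for the sample data, plus one extra message per unmatched field or line.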

Answer 2

Could you please try the following? It is another approach, written and tested with the shown samples in GNU awk.

awk '
BEGIN{
  FS=OFS=","
}
FNR==NR{
  for(i=1;i<=NF;i++){
    arr[FNR OFS i]=$i
  }
  next
}
{
  for(i=1;i<=NF;i++){
    if(arr[FNR OFS i]!=$i){
      print "On line number " FNR ", field#" i " is different(" $i ")"
    }
  }
}
' file1 file2

Explanation: adding a detailed explanation of the above.

awk '                        ##Starting awk program from here.
BEGIN{                       ##Starting BEGIN section from here.
  FS=OFS=","                 ##Setting field separator and output field separator as comma.
}
FNR==NR{                     ##Checking condition FNR==NR which will be TRUE when file1 is being read.
  for(i=1;i<=NF;i++){        ##Starting a for loop till number of fields here.
    arr[FNR OFS i]=$i        ##Creating an array with index of current line number and field number and value is current field.
  }
  next                       ##next will skip all further statements from here.
}
{
  for(i=1;i<=NF;i++){        ##Starting for loop till number of fields here.
    if(arr[FNR OFS i]!=$i){  ##Checking condition if current field value is NOT equals to array with index of current line and field number.
      print "On line number " FNR ", field#" i " is different(" $i ")"
                             ##Printing statement with field and line number as per OP request.
    }
  }
}
' file1 file2                ##Mentioning Input_file names here.
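The question also asks for a way to skip certain columns, which neither script above does. One possible sketch, assuming the columns to ignore are passed as a comma-separated list of field numbers: the function name compare_skipping and the skip variable are hypothetical and not part of the answer.

```shell
#!/bin/sh
# compare_skipping "3,6" FILE1 FILE2
# Hypothetical helper: compares field by field but ignores the listed fields.
# Assumes the list contains plain field numbers with no spaces.
compare_skipping() {
    awk -v skip="$1" '
    BEGIN {
        FS = ","
        n = split(skip, s, ",")          # "3,6" -> s[1]="3", s[2]="6"
        for (k = 1; k <= n; k++)
            ignore[s[k]]                 # referencing the key creates it
    }
    FNR==NR {                            # first file: store every field
        for (i = 1; i <= NF; i++)
            arr[FNR, i] = $i
        next
    }
    {                                    # second file: compare, minus skipped
        for (i = 1; i <= NF; i++) {
            if (i in ignore)
                continue
            if (arr[FNR, i] != $i)
                print "On line number " FNR ", field#" i " is different(" $i ")"
        }
    }' "$2" "$3"
}
```

For example, compare_skipping "3,5" file1 file2 on the sample files would report only field#6 on line 1. An empty list compares every field.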

Answer 3

Another awk approach, using the paste command.

$ cat forever1.txt
col1,col2,col3,col4,col5,col6,col7
col11,col12,col13,col14,col15,col16,col17

$ cat forever2.txt
col1,col2,col03,col4,col5,col06,col7
col11,col12,col13,col14,col015,col16,col17

$ paste -d, forever1.txt forever2.txt | \
awk -F, ' { nf=NF/2; for(i=1;i<=nf;i++) { if($i!=$(nf+i)) print "On line number " NR ",field " i," is different(" $i ")" ; } } '
On line number 1,field 3  is different(col3)
On line number 1,field 6  is different(col6)
On line number 2,field 5  is different(col15)

$

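Use the paste command to join the two files with the same delimiter; as long as each pair of lines has the same number of fields, you always get an even number of fields. Pipe the output to awk and loop, comparing the first half of the fields against the second half.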
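The command above prints the value from the first file. To get the message format requested in the question (the value from the second file, "field#N"), a variant could look like the sketch below. The function name compare_with_paste and the "field count is odd" message are illustrative assumptions; the sketch still assumes both files have the same number of lines and fields.

```shell
#!/bin/sh
# compare_with_paste FILE1 FILE2
# Hypothetical helper: joins the files side by side and compares the two halves.
compare_with_paste() {
    paste -d, "$1" "$2" |
    awk -F, '{
        if (NF % 2) {                    # halves cannot line up
            print "On line number " NR ", field count is odd, cannot compare"
            next
        }
        nf = NF / 2                      # fields per original line
        for (i = 1; i <= nf; i++)
            if ($i != $(nf + i))         # file1 field vs. file2 field
                print "On line number " NR ", field#" i " is different(" $(nf + i) ")"
    }'
}
```

Usage would be compare_with_paste forever1.txt forever2.txt.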
