How to compare two files and print the values that differ between them
There are two files. I need to sort them first, then compare the two files, and then print the values from file 1 and file 2 that differ.
File 1:
pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000
File 2:
pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000
The output should be:
pair,inputbid,inputask,outputbid,outtputask
AED/MYR,3.918000,3.918000,4.918000,4.918000
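The only difference between the two files is that AED/MYR has different bid/ask rates. How can I print the differing values from file 1 and file 2?
I tried the following command: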
nawk -F, 'NR==FNR{a[]=;a[]=;next} !( in a) || !( in a) {print FS a[] FS a[] FS FS }' file1 file2
The resulting output is:
pair,bid,ask,bid,ask
AUD/CAD,3.918000,3.918000,3.918000,3.918000
AUD/CHF,3.918000,3.918000,3.918000,3.918000
AUD/CNH,3.918000,3.918000,3.918000,3.918000
AUD/CNY,3.918000,3.918000,3.918000,3.918000
AED/MYR,3.918000,3.918000,4.918000,4.918000
This still does not give only the differences.
Could you please try the following, written and tested with the shown samples in GNU awk.
awk -v header="pair,inputbid,inputask,outputbid,outtputask" '
BEGIN{
  FS=OFS=","
}
FNR==NR{
  arr[$1]=$0
  next
}
($1 in arr) && arr[$1]!=$0{
  val=$1
  $1=""
  sub(/^,/,"")
  if(!found){
    print header
    found=1
  }
  print arr[val],$0
}' Input_file1 Input_file2
Explanation: adding a detailed explanation of the above.
awk -v header="pair,inputbid,inputask,outputbid,outtputask" ' ##Starting awk program from here and setting the header variable here.
BEGIN{ ##Starting BEGIN section of this program from here.
  FS=OFS="," ##Setting field separator and output field separator as comma here.
}
FNR==NR{ ##Checking condition FNR==NR which will be TRUE when Input_file1 is being read.
  arr[$1]=$0 ##Creating arr with index $1 and keeping the current line as its value.
  next ##next will skip all further statements from here.
}
($1 in arr) && arr[$1]!=$0{ ##Checking condition if first field is present in arr and its value is NOT equal to $0.
  val=$1 ##Creating val which has the first field (the pair) in it.
  $1="" ##Nullifying first field here, which rebuilds $0 with a leading comma.
  sub(/^,/,"") ##Substituting the leading , with NULL here.
  if(!found){ ##Checking if found is NULL then do following.
    print header ##Printing header here only once.
    found=1 ##Setting found here.
  }
  print arr[val],$0 ##Printing arr with index of val and the rest of the current line here.
}' Input_file1 Input_file2 ##Mentioning Input_files here.
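To check the awk approach against the question's sample data, something like the following could be run. This is only a sketch: the scratch directory, the file names `file1`/`file2`, the helper name `compare_rates`, and the variable `result` are assumptions made for illustration, and the awk body is slightly condensed (it prints `$2` and `$3` directly instead of blanking `$1`).

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000
EOF
cat > "$tmpdir/file2" <<'EOF'
pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000
EOF

# compare_rates is a hypothetical wrapper around the awk program.
# $1 = first file, $2 = second file.
compare_rates() {
  awk -v header="pair,inputbid,inputask,outputbid,outtputask" '
  BEGIN { FS = OFS = "," }
  FNR == NR { arr[$1] = $0; next }   # first file: store each line under its pair
  ($1 in arr) && arr[$1] != $0 {     # second file: same pair, but the line differs
    if (!found++) print header       # print the header once, before the first hit
    print arr[$1], $2, $3            # file 1 line, then bid and ask from file 2
  }' "$1" "$2"
}

result=$(compare_rates "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Neither input has to be sorted for this to work, because the lookup goes through the `arr` array; matching rows come out in the order they appear in the second file.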
Using bash process substitution, then join, then awk:
# print header
printf "%s\n" "pair,inputbid,inputask,outputbid,outtputask"
# remove first line from both files, then sort them on first field
# then join them on first field and output first 5 fields
join -t, -1 1 -2 1 -o 1.1,1.2,1.3,2.2,2.3 <(tail -n +2 file1 | sort -t, -k1,1) <(tail -n +2 file2 | sort -t, -k1,1) |
# output only those lines whose bid or ask columns differ
awk -F, '$2 != $4 || $3 != $5'
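Process substitution is a bash feature, so in a plain POSIX `sh` the same join pipeline might be written with temporary files instead. The sketch below assumes the question's sample data; the helper name `join_diff`, the scratch directories, and the variable `result` are hypothetical. Setting `LC_ALL=C` for both `sort` and `join` is an added precaution so that both tools agree on the collation order.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000
EOF
cat > "$tmpdir/file2" <<'EOF'
pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000
EOF

# join_diff is a hypothetical helper: $1 = first file, $2 = second file.
join_diff() {
  work=$(mktemp -d) || return 1
  # Drop the header line, then sort on the first field only.
  tail -n +2 "$1" | LC_ALL=C sort -t, -k1,1 > "$work/a"
  tail -n +2 "$2" | LC_ALL=C sort -t, -k1,1 > "$work/b"
  printf '%s\n' "pair,inputbid,inputask,outputbid,outtputask"
  # Join on the pair, then keep only rows whose bid or ask differs.
  LC_ALL=C join -t, -1 1 -2 1 -o 1.1,1.2,1.3,2.2,2.3 "$work/a" "$work/b" |
    awk -F, '$2 != $4 || $3 != $5'
  rm -rf "$work"
}

result=$(join_diff "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

As in the pipeline above, the header is printed unconditionally, so it appears even when no pair differs; pairs that exist in only one file are dropped by `join`.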