Comparing field data line by line in Python or shell
I have an input file containing the following data:
Mode|Date|Count|timestamp|status
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
....
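Now I am trying to compare each pair of lines to find out which fields have mismatched data. I have tried many approaches with a Python script, but without any luck. My output should look like this: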
Count Mismatch:
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
timestamp Mismatch:
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
Count and Status Mismatch:
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
....
Can someone help me with this? Thanks in advance.
You can use awk:
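awk 'BEGIN {
    FS = "|"                    # fields are pipe-delimited
}
NR == 1 {                       # header row: remember the column names
    split($0, h, /\|/)
    next
}
NR % 2 == 0 {                   # first row of each pair: save it and its fields
    pr = $0
    split($0, a, /\|/)
    next
}
{                               # second row of each pair: compare field by field
    s = ""
    for (i = 1; i <= NF; i++)
        if ($i != a[i])
            s = sprintf("%s%s", s, (s == "" ? "" : " and ") h[i])
    print s, "Mismatch:" ORS pr ORS $0
}' file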
Output:
Count Mismatch:
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
timestamp Mismatch:
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
Count and status Mismatch:
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
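Since the question also asks about Python, here is a rough sketch of the same pairing logic. It is not part of the awk answer: the function name `report_mismatches` and the `SAMPLE` string are made up for illustration. It assumes the first line is the header and that data rows come in consecutive pairs. Unlike the awk command, it skips pairs that are identical instead of printing an empty "Mismatch:" heading.

```python
def report_mismatches(lines, sep="|"):
    """Yield one report block per consecutive pair of data rows that differ.

    Hypothetical helper: assumes the first non-blank line is the header and
    that the remaining rows are meant to be compared two at a time.
    """
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    if not rows:
        return
    header = rows[0].split(sep)
    data = rows[1:]
    # Pair rows as (1st, 2nd), (3rd, 4th), ...; an unpaired last row is ignored.
    for first, second in zip(data[0::2], data[1::2]):
        a, b = first.split(sep), second.split(sep)
        # Collect the header names of the columns whose values differ.
        names = [name for name, x, y in zip(header, a, b) if x != y]
        if names:
            yield f"{' and '.join(names)} Mismatch:\n{first}\n{second}"


# Sample data copied from the question.
SAMPLE = """Mode|Date|Count|timestamp|status
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
"""

for block in report_mismatches(SAMPLE.splitlines()):
    print(block)
```

For a real file, something like `with open("file") as fh: ...` and passing `fh` to the function should work the same way, since it only iterates over lines. On the sample data this prints the same three blocks as the awk command, with the field names taken from the header row (so `status` is lower case).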