Delete line that matches specific string/value

I have a flat file named "Master_Data" that contains the following rows (Customer_Key is the primary key):

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"

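I receive a file named "Daily_Data" with the same structure. Each of its rows should be appended to "Master_Data" if its key is new; if the key already exists, the existing row should be updated (the old row deleted). For example, I receive a "Daily_Data" file like this: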

Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"

My code should then produce/modify the "Master_Data" file as follows:

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose" 

So far I have tried:

sed -n '2,$p' /users/files/Daily_Data.csv >> /users/files/Master_Data.csv

But this just copies the data rows from Daily_Data and appends them to Master_Data, like this:

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
"3","1003","Austin"
"4","1004","San Jose"

What should I use/try to best eliminate the row "3","1003","New York"?

Using awk, you can do this:

awk -F, 'NR==FNR{a[$1]=$0; next} $1 in a{$0=a[$1]; delete a[$1]} 1;
END{for (i in a) print a[i]}' Daily_Data Master_Data

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose"

Reference: Effective AWK Programming
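
For readers who find the one-liner in this answer dense, here is the same logic spread out with comments and run against the sample data from the question. It is only a sketch: the scratch directory and the names `daily`, `k`, and `merged` are chosen for illustration and are not part of the original answer.

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/Master_Data" <<'EOF'
Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
EOF
cat > "$dir/Daily_Data" <<'EOF'
Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
EOF

merged=$(awk -F, '
    # First file (Daily_Data): remember each whole line under its key.
    NR == FNR { daily[$1] = $0; next }
    # Second file (Master_Data): if the key also came in the daily file,
    # swap in the daily version and forget it so it is not printed twice.
    $1 in daily { $0 = daily[$1]; delete daily[$1] }
    # Print every master line, replaced or not.
    { print }
    # Anything still in daily[] is a new key. The for-in order is
    # unspecified, so several new rows may come out in any order.
    END { for (k in daily) print daily[k] }
' "$dir/Daily_Data" "$dir/Master_Data")

printf '%s\n' "$merged"
rm -rf "$dir"
```

The header line needs no special handling: it is stored under the key `Customer_Key` while reading Daily_Data, then matched and removed again when the Master_Data header is read.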

awk -F, 'NR == FNR {print; id[$1]; next} !($1 in id)' Daily_Data Master_Data
Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
"1","1001","Washington D.C"
"2","1002","Los Angeles"

To sort it, you can do this:

awk ... | { read -r header; echo "$header"; sort -t'"' -k2,2n; }
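
As a small illustration of why this pipeline keeps the header on top, the sketch below feeds it the unsorted output shown earlier. The variables `unsorted` and `sorted` are only there to make the example self-contained. With `"` as the field separator, field 1 is the empty text before the first quote and field 2 is the Customer_Key value, which is why `-k2,2n` sorts numerically on the key.

```shell
#!/bin/sh
# Unsorted merge output, copied from the awk result shown above.
unsorted='Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
"1","1001","Washington D.C"
"2","1002","Los Angeles"'

sorted=$(printf '%s\n' "$unsorted" | {
    IFS= read -r header      # consume the header so sort never sees it
    printf '%s\n' "$header"  # emit the header first, unchanged
    sort -t'"' -k2,2n        # sort the remaining rows by Customer_Key
})

printf '%s\n' "$sorted"
```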

To save it back to Master_Data, do one of the following:

awk ... > tmp && mv tmp Master_Data
awk ... | sponge Master_Data         # using `sponge` from `moreutils` package
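
The temp-file variant can be wrapped so that Master_Data is only replaced when awk succeeds. The following is a minimal sketch, not part of the original answers: `merge_into_master` is a hypothetical helper name, and it uses the update-in-place awk program from the first answer. The temp file is created beside the master file so that `mv` is a rename on the same filesystem.

```shell
#!/bin/sh
# Hypothetical helper: merge_into_master DAILY_FILE MASTER_FILE
merge_into_master() {
    daily=$1
    master=$2
    # Temp file in the same directory as the master file.
    tmp=$(mktemp "$(dirname "$master")/master.XXXXXX") || return 1
    if awk -F, 'NR==FNR{a[$1]=$0; next} $1 in a{$0=a[$1]; delete a[$1]} 1
                END{for (i in a) print a[i]}' "$daily" "$master" > "$tmp"
    then
        mv "$tmp" "$master"   # replace the master only after a clean run
    else
        rm -f "$tmp"          # leave the master untouched on failure
        return 1
    fi
}

# Demo with a shortened copy of the sample data, in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'Customer_Key,Customer_ID,Location' \
    '"1","1001","Washington D.C"' '"3","1003","New York"' > "$dir/Master_Data"
printf '%s\n' 'Customer_Key,Customer_ID,Location' \
    '"3","1003","Austin"' '"4","1004","San Jose"' > "$dir/Daily_Data"

merge_into_master "$dir/Daily_Data" "$dir/Master_Data"
result=$(cat "$dir/Master_Data")
printf '%s\n' "$result"
rm -rf "$dir"
```

Note that `mktemp` creates its file with restrictive permissions, so the replaced Master_Data may end up with a different mode than before; adjust with `chmod` if that matters.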