Delete line that matches specific string/value

Question

I have a flat file named "Master_Data" that contains the following rows (Customer_Key is the primary key):

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"

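I receive a file named "Daily_Data" with the same structure. I need to append each of its rows to the "Master_Data" file if the row is new, and otherwise update the existing row (that is, delete the old row with the same key and keep the new one). For example, suppose I receive the following "Daily_Data" file: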

Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"

Then my code should produce/modify the "Master_Data" file as follows:

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose"

So far I have tried:

sed -n '2,$p' /users/files/Daily_Data.csv >> /users/files/Master_Data.csv

But this only copies the data rows from Daily_Data and appends them to Master_Data, like this:

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
"3","1003","Austin"
"4","1004","San Jose"

What should I use/try to eliminate the row "3","1003","New York" in the best way?

Answer 1

Using awk, you can do this:

awk -F, 'NR==FNR{a[$1]=$0; next} $1 in a{$0=a[$1]; delete a[$1]} 1;
END{for (i in a) print a[i]}' Daily_Data Master_Data

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose"

Reference: Effective AWK Programming
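For readers who find the one-liner dense, here is the same logic spread out with comments. This is only a sketch: the function name `merge_master` and the array name `daily` are made up for illustration, and it assumes the key is always the first comma-separated field and that no field contains an embedded comma or newline.

```shell
# merge_master DAILY MASTER
# Prints MASTER with rows replaced by the matching DAILY rows,
# followed by DAILY rows whose key does not occur in MASTER.
# (merge_master is a hypothetical helper name, not part of the answer.)
merge_master() {
    awk -F, '
        # First file (DAILY): remember each line under its first field.
        # The header line is stored too, under the key Customer_Key.
        NR == FNR { daily[$1] = $0; next }

        # Second file (MASTER): if the key was seen in DAILY, take the newer
        # line and drop it from the array so END does not print it again.
        $1 in daily { $0 = daily[$1]; delete daily[$1] }

        # Print every MASTER line, replaced or not.
        { print }

        # Whatever is left in the array is new. The iteration order of
        # "for (key in array)" is not specified by awk.
        END { for (key in daily) print daily[key] }
    ' "$1" "$2"
}
```

Called as `merge_master Daily_Data Master_Data`, it should print the same five lines shown above. Two things to keep in mind: if several new keys arrive at once, their relative order at the end is not guaranteed, and the `NR == FNR` test misfires when the daily file is completely empty (no header at all).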

Answer 2

awk -F, 'NR == FNR {print; id[$1]; next} !($1 in id)' Daily_Data Master_Data

Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
"1","1001","Washington D.C"
"2","1002","Los Angeles"

To sort it, you can do:

awk ... | { read -r header; echo "$header"; sort -t'"' -k2,2n; }

To save it back to Master_Data, do one of the following:

awk ... > tmp && mv tmp Master_Data
awk ... | sponge Master_Data         # using `sponge` from `moreutils` package
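Putting the three steps of this answer together (merge, sort below the header, write back through a temporary file) might look like the sketch below. The function name `update_master` is hypothetical, and `mktemp` is assumed to be available (it is common but not required by POSIX). As before, it assumes the key is the first field and is wrapped in double quotes, as in the sample data.

```shell
# update_master DAILY MASTER
# Rewrites MASTER so that DAILY rows win over MASTER rows with the same key.
# (update_master is a hypothetical helper name, not part of the answer.)
update_master() {
    daily=$1
    master=$2

    # Create the temporary file next to MASTER so that the final mv is a
    # rename within one filesystem.
    tmp=$(mktemp "$master.XXXXXX") || return 1

    # 1. Print all DAILY lines, then only the MASTER lines whose key is new.
    # 2. Pass the header through unchanged and sort the remaining lines
    #    numerically on the text between the first pair of double quotes.
    if awk -F, 'NR == FNR { print; id[$1]; next } !($1 in id)' "$daily" "$master" |
        { IFS= read -r header; printf '%s\n' "$header"; sort -t'"' -k2,2n; } > "$tmp"
    then
        # 3. Replace MASTER only after the new content was written.
        mv "$tmp" "$master"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would be `update_master Daily_Data Master_Data`. Note that the exit status of a pipeline is that of its last command, so a failure inside `awk` itself is not detected here, and the replaced file ends up with the permissions of the temporary file rather than those of the original.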


Tags: unix, shell, ksh