Delete line that matches specific string/value
I have a flat file named "Master_Data" containing the following rows (Customer_Key is the primary key):
Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
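I receive a file named "Daily_Data" with the same structure. I need to append its rows to "Master_Data" when the key is new, and update (replace) the existing row when the key is already present. For example, suppose I receive the following "Daily_Data" file: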
Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
My code should then produce/modify the "Master_Data" file as follows:
Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose"
So far I have tried:
sed -n '2,$p' /users/files/Daily_Data.csv >> /users/files/Master_Data.csv
But this just copies the data from Daily_Data and appends it to Master_Data, like this:
Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
"3","1003","Austin"
"4","1004","San Jose"
What is the best way to eliminate the line "3","1003","New York"?
Using awk, you can do this:
awk -F, 'NR==FNR{a[$1]=$0; next} $1 in a{$0=a[$1]; delete a[$1]} 1;
END{for (i in a) print a[i]}' Daily_Data Master_Data
Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose"
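To make the one-liner above easier to follow, here is the same logic spread over several lines with comments. This is only a sketch: the scratch directory, the `workdir` and `merged` shell variables, and the array name `daily` are made up for illustration, and the sample files are simply the ones from the question. It assumes the key is the whole first comma-separated field (quotes included) and that Daily_Data is not empty; with an empty first file, `NR == FNR` would also match every line of Master_Data.

```shell
#!/bin/sh
# Hypothetical scratch directory so the real files are not touched.
workdir=$(mktemp -d)

cat > "$workdir/Master_Data" <<'EOF'
Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
EOF

cat > "$workdir/Daily_Data" <<'EOF'
Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
EOF

merged=$(awk -F, '
    # First file (Daily_Data): remember each full line, keyed by Customer_Key.
    NR == FNR { daily[$1] = $0; next }
    # Second file (Master_Data): if the key was seen, swap in the daily line
    # and forget it so that it is not printed again at the end.
    $1 in daily { $0 = daily[$1]; delete daily[$1] }
    # Print the (possibly replaced) Master_Data line.
    { print }
    # Whatever is left in daily[] is new; the order of for-in is unspecified.
    END { for (key in daily) print daily[key] }
' "$workdir/Daily_Data" "$workdir/Master_Data")

printf '%s\n' "$merged"
```

The header line is handled by the same rule as the data: it is stored from Daily_Data under the key `Customer_Key`, matched when the Master_Data header is read, and then deleted, so it is printed only once. If several new keys arrive on the same day, the rows appended in the `END` block may come out in any order, so sort the output if order matters.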
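Alternatively, print every line of Daily_Data first, then only those Master_Data lines whose key does not appear in Daily_Data:
awk -F, 'NR == FNR {print; id[$1]; next} !($1 in id)' Daily_Data Master_Data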
Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
"1","1001","Washington D.C"
"2","1002","Los Angeles"
To sort the output, you can do this:
awk ... | { read -r header; echo "$header"; sort -t'"' -k2,2n; }
To save it back to Master_Data, do one of the following:
awk ... > tmp && mv tmp Master_Data
awk ... | sponge Master_Data # using `sponge` from `moreutils` package
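Putting the pieces together, the merge, the sort, and the write-back through a temporary file might look like the sketch below. It runs against copies of the question's sample data in a scratch directory; the variable names (`workdir`, `master`, `daily`, `tmp`) are illustrative assumptions, not part of the original answer. The temporary file is created next to Master_Data so that `mv` is a rename on the same filesystem, and `mv` only runs if the pipeline succeeds.

```shell
#!/bin/sh
# Hypothetical scratch copies of the two files from the question.
workdir=$(mktemp -d)
master="$workdir/Master_Data"
daily="$workdir/Daily_Data"

cat > "$master" <<'EOF'
Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
EOF

cat > "$daily" <<'EOF'
Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
EOF

# Temporary file in the same directory as the target.
tmp=$(mktemp "$workdir/Master_Data.XXXXXX")

# 1. awk prints Daily_Data, then the Master_Data rows with unseen keys.
# 2. The braces pass the header through untouched and sort the rest
#    numerically on the text between the first pair of double quotes.
# 3. Master_Data is replaced only after the pipeline has finished.
awk -F, 'NR == FNR { print; id[$1]; next } !($1 in id)' "$daily" "$master" |
    { IFS= read -r header; printf '%s\n' "$header"; sort -t'"' -k2,2n; } > "$tmp" &&
    mv "$tmp" "$master"

cat "$master"
```

Redirecting straight to Master_Data (`awk ... > Master_Data`) would not work, because the shell truncates the file before awk gets to read it; that is the reason for the temporary file or `sponge`.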