Merge 2 files using command line tools
I have 2 csv files like this:
id, name, job
1, bob, fireman
3, alice, nurse
7, peter, policeman
...
and:
id, name, age
2, john, 26
4, craig, 32
5, mary, 45
6, lucy, 23
...
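As you can see, both are sorted by id, and the ids missing from the first csv are actually in the second csv.
Is it possible to merge these 2 csv files into one file like this, using command line tools such as awk or similar?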
id, name, job, age
1, bob, fireman,
2, john, , 26
3, alice, nurse,
4, craig, , 32
...
Thanks a lot for your help!
This should do:
awk -F, -v OFS=, 'FNR==NR && FNR>1 {a[$1]=$0;c++;next} FNR>1{$NF=" ,"$NF;a[$1]=$0;c++} END {print "id, name, job, age";for (i=1;i<=c;i++) print a[i]}' file1 file2
id, name, job, age
1, bob, fireman
2, john, , 26
3, alice, nurse
4, craig, , 32
5, mary, , 45
6, lucy, , 23
7, peter, policeman
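The `for (i=1;i<=c;i++)` loop in the command above only finds every record when the ids run from 1 to `c` without gaps, and the rows from the first file come out without the trailing comma shown in the question. A minimal sketch of a variant that tolerates gaps and appends that comma might look like the following. The names `merge_csv`, `merged`, `max` and `tmp` are made up for illustration, and the sample rows are the ones from the question with id 5 left out on purpose to create a gap.

```shell
#!/bin/sh
# Sample rows from the question; id 5 is deliberately left out to create a gap.
tmp=$(mktemp -d)
printf '%s\n' 'id, name, job' '1, bob, fireman' '3, alice, nurse' '7, peter, policeman' > "$tmp/file1"
printf '%s\n' 'id, name, age' '2, john, 26' '4, craig, 32' '6, lucy, 23' > "$tmp/file2"

# merge_csv FILE1 FILE2: hypothetical helper, not part of the original answer.
merge_csv() {
    awk -F, -v OFS=, '
    FNR == 1  { next }                        # skip the header line of each file
    { if ($1 + 0 > max) max = $1 + 0 }        # remember the highest id seen
    FNR == NR { a[$1 + 0] = $0 ","; next }    # first file: append an empty age column
    { $NF = " ," $NF; a[$1 + 0] = $0 }        # second file: insert an empty job column
    END {
        print "id, name, job, age"
        for (i = 1; i <= max; i++)            # walk the ids in numeric order
            if (i in a) print a[i]            # skip ids found in neither file
    }' "$1" "$2"
}

merged=$(merge_csv "$tmp/file1" "$tmp/file2")
printf '%s\n' "$merged"
rm -rf "$tmp"
```

Looping up to the highest id is simple but wasteful when ids are large and sparse; in that case printing the records unordered and piping them through `sort -t, -k1,1n` is probably the better trade-off.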
How it works:
awk -F, -v OFS=, '              # Set input and output field separator to ","
FNR==NR && FNR>1 {              # For the first file, except its header line, do:
  a[$1]=$0                      # Store the record in array "a", keyed by id
  c++                           # Increment "c" for every record
  next}                         # Skip to the next record
FNR>1 {                         # For the second file, except its header line, do:
  $NF=" ,"$NF                   # Prefix the last field with " ," to add an empty job column
  a[$1]=$0                      # Store the record in array "a", keyed by id
  c++}                          # Increment "c" for every record
END {                           # When all files have been read, do:
  print "id, name, job, age"    # Print the header
  for (i=1;i<=c;i++)            # Loop over ids 1 to "c"
    print a[i]}                 # Print the stored records in id order
' file1 file2                   # Read the files
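The `$NF=" ,"$NF` step relies on a general awk behaviour: assigning to any field makes awk rebuild the whole record, joining the fields with `OFS`. That is why the command sets `OFS=,` alongside `-F,`. A small sketch on one row from the question shows the difference; the variable names `line`, `rebuilt` and `default_ofs` are only illustrative.

```shell
#!/bin/sh
line='2, john, 26'

# With OFS set to ",", assigning to $NF makes awk rebuild $0 using commas.
rebuilt=$(printf '%s\n' "$line" | awk -F, -v OFS=, '{ $NF = " ," $NF; print }')

# Without OFS, the rebuilt record is joined with the default single space.
default_ofs=$(printf '%s\n' "$line" | awk -F, '{ $NF = " ," $NF; print }')

printf '%s\n' "$rebuilt"       # 2, john, , 26
printf '%s\n' "$default_ofs"   # 2  john  , 26
```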
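FNR==NR is often used when reading multiple files to tell which file is being processed: it is true only while awk is reading the first file.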
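To see why the test works, it may help to print both counters: `NR` counts records across all input files, while `FNR` restarts at 1 for each file, so the two are equal only during the first file. The sketch below uses two throwaway files whose names and contents are invented for the demonstration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' 'a' 'b' > "$tmp/first"
printf '%s\n' 'c' 'd' 'e' > "$tmp/second"

# NR keeps counting across files; FNR restarts at 1 for each file.
labelled=$(awk '
FNR == NR { print "file1 NR=" NR " FNR=" FNR ": " $0; next }
          { print "file2 NR=" NR " FNR=" FNR ": " $0 }
' "$tmp/first" "$tmp/second")

printf '%s\n' "$labelled"
rm -rf "$tmp"
```

One caveat: if the first file is empty, `FNR==NR` is also true for every line of the second file, so the idiom silently misclassifies the input in that case.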