Merge 2 files using command line tools

I have 2 csv files like this:

id, name, job
1, bob, fireman
3, alice, nurse
7, peter, policeman
...

and:

id, name, age
2, john, 26
4, craig, 32
5, mary, 45
6, lucy, 23
...

As you can see, both are sorted by id, and the ids missing from the first csv are actually in the second csv.

Is it possible to merge these 2 csv files into one file like this using command line tools (e.g. awk or similar)?

id, name, job, age
1, bob, fireman,
2, john, , 26
3, alice, nurse,
4, craig, , 32
...

Thanks a lot for your help!

This should do:

awk -F, -v OFS=, 'FNR==NR && FNR>1 {a[$1]=$0;c++;next} FNR>1{$NF=" ,"$NF;a[$1]=$0;c++} END {print "id, name, job, age";for (i=1;i<=c;i++) print a[i]}' file1 file2
id, name, job, age
1, bob, fireman
2, john, , 26
3, alice, nurse
4, craig, , 32
5, mary, , 45
6, lucy, , 23
7, peter, policeman

How it works:

awk -F, -v OFS=, '              # Set input and output field separator to ","
FNR==NR && FNR>1 {              # For the first file, except its header line, do:
    a[$1]=$0                    # Store the record in array "a", keyed by id
    c++                         # Increment "c" for every record
    next}                       # Skip to the next record
FNR>1 {                         # For the second file, except its header line, do:
    $NF=" ,"$NF                 # Prefix the last field with " ," (empty job field)
    a[$1]=$0                    # Store the record in array "a", keyed by id
    c++}                        # Increment "c" for every record
END {                           # When all files have been read, do:
    print "id, name, job, age"  # Print the header
    for (i=1;i<=c;i++)          # Loop over ids 1 to "c"
        print a[i]}             # Print the stored records
' file1 file2                   # Read the files
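
Note that the END loop prints a[1] through a[c], so it assumes the ids across both files are exactly 1, 2, ..., c with no gaps, which holds for the sample rows shown. If that is not guaranteed, one possible variation (a sketch, not the method above) is to print each padded row right away and let sort order the rows by id. The function name merge_by_id and the temporary file names are made up for illustration; the sketch also appends a trailing comma to the job-only rows, as in the layout asked for in the question.

```shell
# Sample inputs copied from the question (file names are placeholders).
dir=$(mktemp -d)
printf '%s\n' 'id, name, job' '1, bob, fireman' '3, alice, nurse' '7, peter, policeman' > "$dir/file1"
printf '%s\n' 'id, name, age' '2, john, 26' '4, craig, 32' '5, mary, 45' '6, lucy, 23' > "$dir/file2"

# Hypothetical helper: $1 = "id, name, job" file, $2 = "id, name, age" file.
merge_by_id() {
    echo 'id, name, job, age'
    awk -F, -v OFS=, '
        FNR == 1  { next }                  # skip the header line of each file
        FNR == NR { print $0 OFS; next }    # first file: append an empty age field
        { $NF = " ," $NF; print }           # second file: insert an empty job field
    ' "$1" "$2" | sort -t, -k1,1n           # order the rows numerically by id
}

out=$(merge_by_id "$dir/file1" "$dir/file2")
printf '%s\n' "$out"
rm -r "$dir"
```

Because the ordering is done by sort instead of by array index, gaps in the ids (such as a missing id 8) do not produce empty lines or dropped rows.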

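FNR==NR is often used when reading multiple files to tell which file is being processed: it is true only while awk is reading the first file.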
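
To see why this test works, here is a small demonstration, assuming two throwaway files named first and second (the names and contents are invented for this example). NR counts records across all input files, while FNR restarts at 1 for every file, so the two are equal only during the first file.

```shell
dir=$(mktemp -d)
printf '%s\n' 'a1' 'a2' > "$dir/first"
printf '%s\n' 'b1' 'b2' > "$dir/second"

# Print each record with both counters and the result of the FNR==NR test.
demo=$(awk '{ print $0, "NR=" NR, "FNR=" FNR, (FNR == NR ? "first" : "later") }' \
    "$dir/first" "$dir/second")
printf '%s\n' "$demo"
rm -r "$dir"

# Output:
# a1 NR=1 FNR=1 first
# a2 NR=2 FNR=2 first
# b1 NR=3 FNR=1 later
# b2 NR=4 FNR=2 later
```

One caveat: if the first file is empty, FNR==NR is also true while the second file is read, so the idiom is only reliable when the first file has at least one line.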