Dynamically print a column by column name using awk or sed

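I have a file, and I am trying to print the column named "grant (actual)" from it dynamically, using the column name. I can get the column by giving its column number in the command below; it is currently the 6th column.

$ awk '/--/,/Datacenter/ ' cas.txt  | awk '{print $6}'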
(actual)
49.9%
55.4%
53.5%
48.7%

(actual)
53.1%
50.0%
47.6%
48.3%

(actual)
50.0%
51.1%
48.9%
51.3%

But I want to determine the column number dynamically, so that my script keeps working if the position of the column changes.

$ cat cas.txt
Datacenter: DC01
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       USER       grant (actual)      Host ID    Vol
DN  10.0.0.138  221.03 MiB  256          49.9%             dd09f7aa  STG1
DN  10.0.0.139  173.47 MiB  256          55.4%             53179492  STG1
DN  10.0.0.136  200.08 MiB  256          53.5%             89a28140  STG1
DN  10.0.0.137  318.69 MiB  256          48.7%             8cc9dfac  STG1
Datacenter: DC02
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       USER       grant (actual)       Host ID    Vol
DN  10.0.0.142  270.01 MiB  256          53.1%             04210b53  STG1
DN  10.0.0.143  166.65 MiB  256          50.0%             d5469c9b STG1
DN  10.0.0.140  199.51 MiB  256          47.6%             fcc38a17  STG1
DN  10.0.0.141  170.52 MiB  256          48.3%             3d7b4e59  STG1
Datacenter: DC03
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       USER       grant (actual)       Host ID    Vol
DN  10.0.0.150  229.2 MiB  256           50.0%             0fa51a1a  STG1
DN  10.0.0.151  195.88 MiB  256          51.1%             e329ac17  STG1
DN  10.0.0.148  147.01 MiB  256          48.9%             c14bd7ae  STG1
DN  10.0.0.149  298.34 MiB  256          51.3%             6c73d2b5  STG1

Consider the following example. Let file.txt content be

-- Able Baker Charlie
DN 1    2     3
DN 4    5     6
DN 7    8     9
-- Charlie
DN 10
DN 11
DN 12

then

awk 'BEGIN{colname="Charlie"}/--/{delete names;for(i=1;i<=NF;i+=1){names[$i]=i};next}{print $(names[colname])}' file.txt

gives output

3
6
9
10
11
12

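Explanation: I use the colname variable to store the desired column name. When a line containing -- is encountered, it is treated as a header with column names. The names array is cleared, to prevent leftovers from a previous block, and then filled so that each column name (key) maps to its position (value). After doing that, I instruct GNU AWK to go to the next line, i.e. nothing is printed for the header. For the other lines, I instruct GNU AWK to look up the number corresponding to the selected name and print that column.

(tested in gawk 4.2.1)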
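One caveat with the lookup above: in a block whose header lacks the requested name, names[colname] is empty, so the whole line gets printed. The following is a minimal sketch of a guarded variant, not part of the answer itself. The function name print_named_col is a hypothetical helper, and split("", names) is used in place of delete names only as a more portable way to empty the array.

```shell
# print_named_col NAME FILE  (hypothetical helper, for illustration only)
print_named_col() {
  awk -v colname="$1" '
    /^--/ {                                  # header line of a block
      split("", names)                       # empty the map left by the previous block
      for (i = 1; i <= NF; i++) names[$i] = i
      next
    }
    colname in names { print $(names[colname]) }   # skip blocks without that column
  ' "$2"
}
```

With the file.txt shown earlier, print_named_col Able file.txt should print 1, 4 and 7 and skip the second block, which has no Able column.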

Combining the ideas of @Dan and @Daweo:

awk -F' {2,}' -v col='grant (actual)' '
  /^Datacenter/ {i=0}
  $1 == "--" {for (i=1; i<=NF; i++) if ($i == col) break; next}
  i {print $i}
' cas.txt
49.9%
55.4%
53.5%
48.7%
53.1%
50.0%
47.6%
48.3%
50.0%
51.1%
48.9%
51.3%

If you want to see the column header in the output, just remove next.

Looking at your data, we'll use split() to split the records on 2 or more spaces (/  +/):

$ awk '$1~/^--$/ {                # -- starts the header record
    n=split($0,h,/  +/)             # get field count n of header record
    for(i=1;i<=n;i++)               # iterate fields 
        if(h[i]=="grant (actual)")  # looking for desired header
            break                   # break once found, i is the field number
}
split($0,a,/  +/)==n {              # process records with equal amount of fields
    print a[i]                      # and output ith field
}' file

Output:

grant (actual)
49.9%
55.4%
53.5%
48.7%
grant (actual)
53.1%
47.6%
48.3%
grant (actual)
50.0%
51.1%
48.9%
51.3%

The above fails for the record whose last field is separated by only 1 space:

DN  10.0.0.143  166.65 MiB  256          50.0%             d5469c9b STG1
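One possible way around that record is to locate the header by character offset with index() and cut the data rows at the same offset. This is a sketch under the assumption that every value starts at or to the right of its header and contains no spaces (true for the percentages in cas.txt, not for Load or Host ID). The function name print_col_at_offset is hypothetical.

```shell
# print_col_at_offset NAME FILE  (hypothetical helper, for illustration only)
print_col_at_offset() {
  awk -v col="$1" '
    $1 == "--"    { start = index($0, col); next }  # 1-based offset of the header text
    /^Datacenter/ { start = 0; next }               # a new block begins, wait for its header
    start {
      s = substr($0, start)    # text from the header offset onwards
      sub(/^ +/, "", s)        # values may be indented relative to the header
      sub(/ .*/, "", s)        # keep only the first space-delimited token
      print s
    }
  ' "$2"
}
```

Because the cut is made by position rather than by separator count, a single space between Host ID and Vol does not affect the result.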

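With GNU awk, using FIELDWIDTHS and the 4th argument to split(), you can create an array (f[] below) that maps the column names to their numbers. You can then print, compare, reorder, or do whatever you like with the columns, just by indexing that array with the column names: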

$ cat tst.awk
/^--/ {
    if ( FIELDWIDTHS == "" ) {
        wids = ""
        numFlds = split($0,flds,/  +/,seps)
        for ( fldNr=1; fldNr<=numFlds; fldNr++ ) {
            f[flds[fldNr]] = fldNr
            wids = (fldNr>1 ? wids " " : "") length(flds[fldNr] seps[fldNr])
        }
        FIELDWIDTHS = wids
        $0 = $0
    }
    inBlock = 1
}
inBlock {
    if ( /^Datacenter:/ ) {
        print ""
        inBlock = 0
        next
    }
    for ( i=1; i<=NF; i++ ) {
        gsub(/^\s+|\s+$/,"",$i)
    }
    print $(f["grant (actual)"])
}

$ awk -f tst.awk cas.txt
grant (actual)
49.9%
55.4%
53.5%
48.7%

grant (actual)
53.1%
50.0%
47.6%
48.3%

grant (actual)
50.0%
51.1%
48.9%
51.3%

Synopsis

An awk-based solution that:

- doesn't require GNU awk for FIELDWIDTHS / fixed-width fields
- doesn't require fudging with FS/OFS/RS/FPAT
- doesn't require a specialized regex engine, e.g. one with back-reference support
- doesn't require array splitting or dealing with the painfully slow match() function
- doesn't *even* require a single call to any function

Input

Datacenter: DC01
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       USER       grant (actual)      Host ID    Vol
DN  10.0.0.138  221.03 MiB  256          49.9%             dd09f7aa  STG1
DN  10.0.0.139  173.47 MiB  256          55.4%             53179492  STG1
DN  10.0.0.136  200.08 MiB  256          53.5%             89a28140  STG1
DN  10.0.0.137  318.69 MiB  256          48.7%             8cc9dfac  STG1
Datacenter: DC02
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       USER       grant (actual)       Host ID    Vol
DN  10.0.0.142  270.01 MiB  256          53.1%             04210b53  STG1
DN  10.0.0.143  166.65 MiB  256          50.0%             d5469c9b STG1
DN  10.0.0.140  199.51 MiB  256          47.6%             fcc38a17  STG1
DN  10.0.0.141  170.52 MiB  256          48.3%             3d7b4e59  STG1
Datacenter: DC03
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       USER       grant (actual)       Host ID    Vol
DN  10.0.0.150  229.2 MiB  256           50.0%             0fa51a1a  STG1
DN  10.0.0.151  195.88 MiB  256          51.1%             e329ac17  STG1
DN  10.0.0.148  147.01 MiB  256          48.9%             c14bd7ae  STG1
DN  10.0.0.149  298.34 MiB  256          51.3%             6c73d2b5  STG1
 

Code

< cas.txt |

{m,g}awk '   !NF   ? !_ : /^[=]+/ ? ($!_=!__ ? "" : " ") \
         : --NF<+_ ? !_ : __+=($!_=(/%/?"":$(_-_^!_)" ")($_))^!_' \_=6

Output

 1  grant (actual)
 2  49.9%
 3  55.4%
 4  53.5%
 5  48.7%
 6   
 7  grant (actual)
 8  53.1%
 9  50.0%
10  47.6%
11  48.3%
12   
13  grant (actual)
14  50.0%
15  51.1%
16  48.9%
17  51.3%