Bash/GitBash AWK: get column values (from command output) by column name instead of column number

Question

I installed the latest Git Bash version, and $BASH_VERSION is 4.4.23(1).

Running ps aux now gives me output like this:

PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
<4-DIGITS>  <1-DIGIT>  <4-DIGITS>   <4-DIGITS>  ?   <5-DIGITS> <CURR_TIME> <COMMAND>
<4-DIGITS>  <1-DIGIT>  <4-DIGITS>   <4-DIGITS>  ?   <5-DIGITS> <CURR_TIME> <COMMAND>
....
....
<4-DIGITS>  <1-DIGIT>  <4-DIGITS>   <4-DIGITS>  ?   <5-DIGITS> <CURR_TIME> <COMMAND>
<4-DIGITS>  <1-DIGIT>  <4-DIGITS>   <4-DIGITS>  ?   <5-DIGITS> <CURR_TIME> <COMMAND>

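From this output I want to extract specific column values by giving the column name (possibly several columns), instead of counting column numbers from left to right every time.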

I have this command, but it only works on files, and I want it to work on the output of another command as well:

awk -vcol=<COL_NAME> '(NR==1){colnum=-1;for(i=1;i<=NF;i++)if($(i)==col)colnum=i;}{print $(colnum)}'

How can I filter the output of a preceding command, along the lines of ps aux | awk <COLUMN_NAME=WINPID>?

Answer 1

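Assumptions:

ps output fields do not contain white space (e.g., entries under STIME do not look like Sep 27)
column name matching is case-sensitive (this can be changed by adding tolower() calls)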
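
The second assumption can be relaxed. A minimal sketch of the tolower() variant, using the same approach as the script further down in this answer; the sample input and the variable names sample and ci_result are made up for illustration:

```shell
# Hypothetical sample input; real ps output will differ.
sample='PID PPID WINPID
101 1 5001
102 1 5002'

# Lower-case both the requested names and the header fields before comparing,
# so cols=winpid matches a header spelled WINPID.
ci_result=$(printf '%s\n' "$sample" | awk -v cols='winpid' '
BEGIN  { n = split(tolower(cols), arr, ",")    # requested names, lower-cased
         for (i = 1; i <= n; i++)
             headers[arr[i]]
       }
FNR==1 { for (i = 1; i <= NF; i++)             # compare lower-cased header fields
             if (tolower($i) in headers)
                 fields[i]
       }
       { pfx = ""
         for (i = 1; i <= NF; i++)
             if (i in fields) {
                 printf "%s%s", pfx, $i
                 pfx = OFS
             }
         printf "\n"
       }')
printf '%s\n' "$ci_result"
```

The header line is printed with its original spelling; only the comparison is case-insensitive.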

Sample input file:

$ cat ps.out
PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
<4-DIGITS1>  <1-DIGIT1>  <4-DIGITS1>   <4-DIGITS1>  ?   <5-DIGITS1> <CURR_TIME1> <COMMAND1>
<4-DIGITS2>  <1-DIGIT2>  <4-DIGITS2>   <4-DIGITS2>  ?   <5-DIGITS2> <CURR_TIME2> <COMMAND2>
<4-DIGITS3>  <1-DIGIT3>  <4-DIGITS3>   <4-DIGITS3>  ?   <5-DIGITS3> <CURR_TIME3> <COMMAND3>
<4-DIGITS4>  <1-DIGIT4>  <4-DIGITS4>   <4-DIGITS4>  ?   <5-DIGITS4> <CURR_TIME4> <COMMAND4>

One idea using awk:

$ columns='WINPID'
$ awk -v cols="${columns}" '
BEGIN  { n=split(cols,arr,",")            # parse list of column names
         for (i=1;i<=n;i++) 
             headers[arr[i]]              # convert to associative array
       }
FNR==1 { for (i=1;i<=NF;i++)              # for each field (aka column) header ...
             if ($i in headers)           # if it is in headers[] then ...
                fields[i]                 # keep track of the associated field #
       }
       { pfx=""
         for (i=1;i<=NF;i++) {            # for each input field # ...
             if (i in fields) {           # if it is in fields[] then ...
                printf "%s%s", pfx, $i    # print the field (aka column)
                pfx=OFS
             }
         }
         printf "\n"                      # terminate the line
       }
' ps.out

This generates:

WINPID
<4-DIGITS1>
<4-DIGITS2>
<4-DIGITS3>
<4-DIGITS4>

With columns='WINPID,UID' we get:

WINPID UID
<4-DIGITS1> <5-DIGITS1>
<4-DIGITS2> <5-DIGITS2>
<4-DIGITS3> <5-DIGITS3>
<4-DIGITS4> <5-DIGITS4>
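
Note that the script above prints the selected columns in the order they appear in the input, so columns='UID,WINPID' would produce the same output. If the requested order matters, one possible variation (a sketch only, not part of the original script; the sample data and the names want, pos and ordered_result are assumptions) is to record each header's position and then loop over the requested names:

```shell
# Hypothetical sample input standing in for ps output.
sample='PID PPID WINPID UID
101 1 5001 19001
102 1 5002 19002'

ordered_result=$(printf '%s\n' "$sample" | awk -v cols='UID,WINPID' '
BEGIN  { n = split(cols, want, ",") }          # requested names, in requested order
FNR==1 { for (i = 1; i <= NF; i++)             # remember where each header sits
             pos[$i] = i
       }
       { pfx = ""
         for (j = 1; j <= n; j++)              # walk the requested names in order
             if (want[j] in pos) {
                 printf "%s%s", pfx, $(pos[want[j]])
                 pfx = OFS
             }
         printf "\n"
       }')
printf '%s\n' "$ordered_result"
```

Names that do not match any header are silently skipped, as in the original script.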

NOTE: OP can modify the printf formats to adjust the output as needed
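
One adjustment that needs no change to the printf calls: the script joins the selected fields with OFS, so the separator can be set on the command line. A minimal sketch, assuming comma-separated output is wanted; the sample data and the variable name csv_result are made up for illustration:

```shell
# Hypothetical sample input standing in for ps output.
sample='PID PPID WINPID UID
101 1 5001 19001
102 1 5002 19002'

# -v OFS="," changes the output separator, because the script uses pfx=OFS.
csv_result=$(printf '%s\n' "$sample" | awk -v cols='WINPID,UID' -v OFS=',' '
BEGIN  { n = split(cols, arr, ",")
         for (i = 1; i <= n; i++)
             headers[arr[i]]
       }
FNR==1 { for (i = 1; i <= NF; i++)
             if ($i in headers)
                 fields[i]
       }
       { pfx = ""
         for (i = 1; i <= NF; i++)
             if (i in fields) {
                 printf "%s%s", pfx, $i
                 pfx = OFS
             }
         printf "\n"
       }')
printf '%s\n' "$csv_result"
```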

To apply the awk script directly to the output of ps (simulated here with cat ps.out):

$ columns='PID,STIME,COMMAND'
$ cat ps.out | awk -v cols="${columns}" '
BEGIN  { n=split(cols,arr,",")
         for (i=1;i<=n;i++) 
             headers[arr[i]]
       }
FNR==1 { for (i=1;i<=NF;i++)
             if ($i in headers)
                fields[i]
       }
       { pfx=""
         for (i=1;i<=NF;i++) {
             if (i in fields) {
                printf "%s%s", pfx, $i
                pfx=OFS
             }
         }
         printf "\n"
       }
'

This generates:

PID STIME COMMAND
<4-DIGITS1> <CURR_TIME1> <COMMAND1>
<4-DIGITS2> <CURR_TIME2> <COMMAND2>
<4-DIGITS3> <CURR_TIME3> <COMMAND3>
<4-DIGITS4> <CURR_TIME4> <COMMAND4>
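
awk reads standard input when no file operand is given, which is why the same script works at the end of a pipeline. To avoid retyping it, the script could be wrapped in a shell function. This is only a sketch: bycol is a hypothetical name, and the canned input below stands in for real ps output.

```shell
# bycol is a hypothetical helper name, not part of the original answer.
# Usage: some_command | bycol COL1,COL2      (reads stdin)
#        bycol COL1,COL2 file                (reads the named file)
bycol() {
    local cols=$1
    shift
    awk -v cols="$cols" '
    BEGIN  { n = split(cols, arr, ",")         # parse list of column names
             for (i = 1; i <= n; i++)
                 headers[arr[i]]
           }
    FNR==1 { for (i = 1; i <= NF; i++)         # find the matching header fields
                 if ($i in headers)
                     fields[i]
           }
           { pfx = ""
             for (i = 1; i <= NF; i++)
                 if (i in fields) {
                     printf "%s%s", pfx, $i
                     pfx = OFS
                 }
             printf "\n"
           }
    ' "$@"
}

# Demo with canned input; on Git Bash the intended call would be: ps aux | bycol WINPID
fn_result=$(printf '%s\n' 'PID PPID WINPID' '101 1 5001' '102 1 5002' | bycol PID,WINPID)
printf '%s\n' "$fn_result"
```

The same white-space assumption applies: a COMMAND entry containing spaces would be split into several fields.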

awk

filtering

multiple-columns

git-bash