Modify awk script to add looping logic

Question

I have an awk script that prints the processes whose pids appear in my file. myfilename contains a list of pids, one per line...

ps -eaf | awk -f script.awk myfilename -

Here are the contents of script.awk...

# process the first file on the command line (aka myfilename)
# this is the list of pids
ARGIND == 1 {
    pids[$0] = 1
}

# second and subsequent files ("-"/stdin in the example)
ARGIND > 1 {
    # is column 2 of the ps -eaf output [i.e.] the pid in the list of desired
    # pids? -- if so, print the entire line
    if ($2 in pids)
        printf("%s\n",$0)
}

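Currently the command prints the matching lines in the order of the ps -eaf output, but I would like it to print them in the order the pids appear in myfilename.

I tried to modify the script to loop through the pids array and repeat the same logic, but I couldn't get it quite right.

I'd be grateful if anyone could help me with this.

Thanks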

Answer 1

Forgive my rusty AWK. Maybe this is usable?

ARGIND == 1 {
    pids[$0] = NR # capture the order
}

ARGIND > 1 {
    if ($2 in pids) {
        idx = pids[$2];
        matches[idx] = $0; # capture the line and associate it with the pid-list order
        if (idx > max)
            max = idx;
    }
}

END {
    for(i = 1; i <= max; i++)
        if (i in matches)
            print matches[i];
}

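I don't know what the output of ps -eaf looks like, or which assumptions about its output can be relied on. When I first read the question, I thought the OP had more than two inputs to the script. If there really are only two, reversing the inputs probably makes more sense; if not, this may be the more general approach.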

Answer 2

I would use the time-honoured NR==FNR construct instead. It goes a bit like this (one-liner).

ps -eaf | awk 'NR==FNR{p[$1]++;next} $2 in p' mypidlist -

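The idea of NR==FNR is that we look at the overall record number (NR) and compare it with the record number within the current file (FNR). If they are the same, we are still reading the first file, so we store the record and move on to the next line of input.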

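If NR==FNR is not true, then we simply check whether $2 is in the array.

So the first expression populates the array p[] with the contents of mypidlist, and the second construct is just a condition, which defaults to {print} as its statement.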
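To see how NR and FNR diverge, here is a minimal sketch that prints both counters for two small files. The file names and contents are made up for illustration. One caveat worth keeping in mind: if the first file is empty, NR==FNR is also true for the second file.

```shell
# Hypothetical sample data: two tiny files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# Print the file name, NR and FNR for every record.
# NR keeps counting across files; FNR restarts at 1 for each new file.
nr_fnr=$(cd "$tmpdir" && awk '{ print FILENAME, NR, FNR }' first.txt second.txt)
printf '%s\n' "$nr_fnr"

rm -rf "$tmpdir"
```

The two counters agree only on the lines of first.txt, which is why NR==FNR works as a test for "still in the first file".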

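Of course, the one-liner above doesn't address your requirement to print the results in the order of the pid input file. For that, you need to keep an index and record the data in an array for some kind of sorting. It doesn't have to be a real sort, of course; just keeping the index itself is sufficient. The following is a bit long for a one-liner: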

ps -eaf | awk 'NR==FNR{p[$1]++;o[++n]=$1;next} $2 in p {c[$2]=$0} END {for(n=1;n<=length(o);n++){print n,o[n],c[o[n]]}}' mypidlist -

Broken out for easier reading, the awk script looks like this:

# Record the pid list...
NR==FNR {
  p[$1]++       # Each pid is an element in this array.
  o[++n]=$1     # This array records the order of the pids.
  next
}

# If the second+ input source has a matching pid...
$2 in p {
  c[$2]=$0      # record the line in a third array, pid as key.
}

END {
  # At the end of our input, step through the ordered pid list...
  for (n=1;n<=length(o);n++) {
    print c[o[n]]   # and print the collected line, using our pid index as key.
  }
}

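Note that if a pid from the list is missing from the ps output, the result will include a blank line, because awk doesn't complain about references to non-existent array indices.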
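If the blank lines are unwanted, one option is to test membership before printing. The sketch below is a variation, not the answer's original script: it uses made-up sample files in place of the real pid list and ps -eaf output, and it loops on the counter n instead of calling length().

```shell
tmpdir=$(mktemp -d)

# Hypothetical pid list; 999 has no matching process below.
printf '30\n999\n10\n' > "$tmpdir/mypidlist"

# Hypothetical stand-in for ps -eaf output (pid in column 2).
cat > "$tmpdir/ps.txt" <<'EOF'
UID PID PPID CMD
root 10 1 /usr/sbin/cron
root 20 1 /usr/sbin/sshd
user 30 20 -ksh
EOF

ordered=$(awk '
  NR==FNR { p[$1]++; o[++n]=$1; next }   # remember each pid and its position
  $2 in p { c[$2]=$0 }                   # keep the matching ps line, keyed by pid
  END {
    for (i=1; i<=n; i++)
      if (o[i] in c)                     # skip pids that never appeared in the ps data
        print c[o[i]]
  }
' "$tmpdir/mypidlist" "$tmpdir/ps.txt")
printf '%s\n' "$ordered"

rm -rf "$tmpdir"
```

With this sample data the lines for pids 30 and 10 are printed in that order, and pid 999 is silently skipped.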

Also note that the length(arrayname) notation works in GAWK and OneTrueAwk, but may not be universal. If it doesn't work for you, you could add something like this to your awk script:

function alength(arrayname,   i, n) {
  for(i in arrayname)
    n++
  return n
}
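As a quick check of such a helper, the sketch below runs it on a small hand-built array (the sample values are made up). It adds + 0 to the return value, an adjustment not in the original, so that an empty array yields 0 rather than an empty string.

```shell
count=$(awk '
  # Count elements by iterating over the keys.
  # i and n are extra parameters, which awk treats as local variables.
  function alength(arrayname,   i, n) {
    for (i in arrayname)
      n++
    return n + 0   # force a number so an empty array yields 0
  }
  BEGIN {
    o[1] = "30"; o[2] = "999"; o[3] = "10"
    print alength(o)
  }
')
printf '%s\n' "$count"
```

In the script above, the loop condition would then read n<=alength(o). Since the first block already counts the pids as it reads them, saving that count in its own variable and looping up to it is another way to avoid length() altogether.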

Answer 3

If there is only one file, you can flip the input order and use idiomatic awk, as follows

$ awk 'NR==1; NR==FNR{a[$2]=$0;next} $0 in a{print a[$0]}' <(ps -eaf) <(seq 10)

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 02:36 ?        00:00:03 /sbin/init
root         2     0  0 02:36 ?        00:00:00 [kthreadd]
root         3     2  0 02:36 ?        00:00:00 [ksoftirqd/0]
root         4     2  0 02:36 ?        00:00:00 [kworker/0:0]
root         5     2  0 02:36 ?        00:00:00 [kworker/0:0H]
root         6     2  0 02:36 ?        00:00:00 [kworker/u30:0]
root         7     2  0 02:36 ?        00:00:00 [rcu_sched]
root         8     2  0 02:36 ?        00:00:00 [rcuos/0]
root         9     2  0 02:36 ?        00:00:00 [rcuos/1]
root        10     2  0 02:36 ?        00:00:00 [rcuos/2]

Here, seq provides the list of IDs; replace it with your file.
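The <(...) process substitution needs a shell that supports it, such as bash. Where it is unavailable, the same awk program can read ordinary files instead. A minimal sketch, using made-up sample files in place of the real ps -eaf output and pid list:

```shell
tmpdir=$(mktemp -d)

# Hypothetical pid list, in the order the output should follow.
printf '30\n999\n10\n' > "$tmpdir/mypidlist"

# Hypothetical stand-in for the ps output; in real use this could be
# produced with: ps -eaf > "$tmpdir/ps.txt"
cat > "$tmpdir/ps.txt" <<'EOF'
UID PID PPID CMD
root 10 1 /usr/sbin/cron
root 20 1 /usr/sbin/sshd
user 30 20 -ksh
EOF

# First file: print the header, then index every ps line by pid.
# Second file: print the stored line for each listed pid, in list order.
flipped=$(awk 'NR==1; NR==FNR{a[$2]=$0;next} $0 in a{print a[$0]}' \
  "$tmpdir/ps.txt" "$tmpdir/mypidlist")
printf '%s\n' "$flipped"

rm -rf "$tmpdir"
```

Because the pid list is read last, the output follows its order without any extra bookkeeping, and a pid with no matching process (999 here) produces no line at all.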


unix

aix

awk