Awk Script to process data from a trace file

I have a table (a .tr file) with different rows (events).

Event            Time           PacketLength  PacketId
sent             1              100           1
dropped          2              100           1
sent             3              100           2
sent             4.5            100           3
dropped          5              100           2
sent             6              100           4
sent             7              100           5
sent             8              100           6
sent             10             100           7

I want to create a new table like the one below, but I don't know how to do it in AWK.

SentTime                  PacketLength     Dropped
1                         100              Yes
3                         100              Yes     
4.5                       100
6                         100
7                         100
8                         100
10                        100

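I have some simple code that finds dropped or sent packets along with their time and ID, but I don't know how to add a column to the table that holds the dropped-packet result.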

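BEGIN{}
{
    # NOTE: the field references ($N) were lost from this copy of the post.
    # $1..$4 follow the sample table above; $5 for Node is only a guess,
    # since the sample table has no node column.
    Event = $1;
    Time = $2;
    Packet = $3;
    Node = $5;
    id = $4;
        if (Event=="s" && Node=="1.0.1.2"){
                printf ("%f\t %d\n", Time, Packet);
        }
} 
END {}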

I would say...

awk '/sent/{pack[$4]=$2; len[$4]=$3}
     /dropped/{drop[$4]}
     END {print "Sent time", "PacketLength", "Dropped";
         for (p in pack) 
               print pack[p], len[p], ((p in drop)?"yes":"")
     }' file

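This stores the sent packets in pack[], their lengths in len[], and the dropped ones in drop[], all indexed by packet ID, so that they can be looked up later in the END block.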

Test

$ awk '/sent/{pack[$4]=$2; len[$4]=$3} /dropped/{drop[$4]} END {print "Sent time", "PacketLength", "Dropped"; for (p in pack) print pack[p], len[p], ((p in drop)?"yes":"")}' a
Sent time PacketLength Dropped
1 100 yes
3 100 yes
4.5 100 
6 100 
7 100 
8 100 
10 100 
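
One caveat with the command above: awk does not define the order in which for (p in pack) visits the keys, so the rows are not guaranteed to come out in sent order on every awk implementation. Below is a minimal sketch that records the order in which packets were first sent. It assumes the four-column layout from the question (event, time, length, ID); the file name trace.tr and the function name sent_table are made up for illustration.

```shell
# Sample data from the question, written to a hypothetical file name.
cat > trace.tr <<'EOF'
Event Time PacketLength PacketId
sent 1 100 1
dropped 2 100 1
sent 3 100 2
sent 4.5 100 3
dropped 5 100 2
sent 6 100 4
sent 7 100 5
sent 8 100 6
sent 10 100 7
EOF

# sent_table FILE: print one row per sent packet, in the order it was sent.
sent_table() {
    awk '
    # Remember each sent packet ID in arrival order, plus its time and length.
    $1 == "sent"    { order[++n] = $4; time[$4] = $2; len[$4] = $3 }
    # Merely referencing drop[id] creates the key, which is all we need.
    $1 == "dropped" { drop[$4] }
    END {
        print "SentTime", "PacketLength", "Dropped"
        # Walk the IDs by numeric index instead of using for (id in ...).
        for (i = 1; i <= n; i++) {
            id = order[i]
            print time[id], len[id], ((id in drop) ? "yes" : "")
        }
    }' "$1"
}

sent_table trace.tr
```

Matching on $1 rather than on /sent/ anywhere in the line also avoids false matches if another column ever contains the word.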

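You have to save all the information in an array so that you can post-process it once the end of the file is reached. Obviously, if the file is large, this can cause memory problems.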

    BEGIN  {
            template="#sentTime\t#packetLength\t#dropped";
            }
            {
            print $0;    # debug: echo the input line
            event = $1; 
            time = $2; 
            packet_length = $3;
            packet_id = $4; 
            # save all the info in an array, keyed by packet id
            packet_info[packet_id] = packet_info[packet_id] "#" packet_length "#" time "#" event;
            }
    END     {
            # traverse the information of the array; each key is a packet id
            for( id in packet_info ) 
            {
                print "the packet id is: " id " = " packet_info[id];
                # for every element in the array (= packet), 
                # the data has this format "#100#1#sent#100#2#dropped"
                split( packet_info[id], info, "#" );
                # info[1] is empty because the string starts with "#"
                # info[2] <-- 100
                # info[3] <-- 1
                # info[4] <-- sent
                # info[5] <-- 100
                # info[6] <-- 2
                # info[7] <-- dropped
                line = template; 
                # note: gensub() is a gawk extension, so this script needs gawk
                line = gensub( "#sentTime", info[3], "g", line );
                line = gensub( "#packetLength", info[2], "g", line ); 
                if( info[4] == "dropped" ) 
                    line = gensub( "#dropped", "yes", "g", line );
                if( info[7] == "dropped" ) 
                    line = gensub( "#dropped", "yes", "g", line );
                # clear the placeholder if the packet was never dropped
                line = gensub( "#dropped", "", "g", line );
                print line; 
            } # for 
            }
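
If memory is a concern, one possible alternative (not taken from the answers above) is to read the file twice: the first pass only remembers the IDs of dropped packets, and the second pass prints each sent line as soon as it is read. This is only a sketch. It assumes the same four-column layout (event, time, length, ID) and that the input is a regular file that can be read twice; trace.tr and sent_table_two_pass are hypothetical names.

```shell
# Sample data from the question, written to a hypothetical file name.
cat > trace.tr <<'EOF'
Event Time PacketLength PacketId
sent 1 100 1
dropped 2 100 1
sent 3 100 2
sent 4.5 100 3
dropped 5 100 2
sent 6 100 4
sent 7 100 5
sent 8 100 6
sent 10 100 7
EOF

# sent_table_two_pass FILE: same table, but only dropped IDs are kept in memory.
sent_table_two_pass() {
    awk '
    BEGIN { print "SentTime", "PacketLength", "Dropped" }
    # Pass 1 (NR == FNR holds only while reading the first file argument):
    # collect the IDs of dropped packets and skip everything else.
    NR == FNR { if ($1 == "dropped") drop[$4]; next }
    # Pass 2: emit each sent packet immediately, in file order.
    $1 == "sent" { print $2, $3, (($4 in drop) ? "yes" : "") }
    ' "$1" "$1"    # the same file is passed twice on purpose
}

sent_table_two_pass trace.tr
```

The trade-off is reading the file twice in exchange for an array that holds only the dropped IDs, and the rows come out in file order without any extra bookkeeping.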