Read a log file to extract various fields and count occurrences

I have a logger file like this:

2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 13794017 with status : 201
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 13794017 with status : 201
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 13794017 with status : 201
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute() 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute() 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute() 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute()  Scene7 update for 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute()  Scene7 update for 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute()  Scene7 update for 13794017 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute EXIT
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute EXIT
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute EXIT
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 500
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 500
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.commerce.common.utils.APIUtils - Enrichment data updated successful for partnumber : 17696532 with status : 500
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute() 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute() 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute() 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=1.808 sec
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute()  Scene7 update for 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute()  Scene7 update for 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=
2016-06-11 07:34:01.542 98459 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute()  Scene7 update for 17696532 itemStatus itemsProcessed=1, itemsUpdated=1, timeTaken=
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute EXIT
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute EXIT
2016-06-11 07:34:01.543 98460 [Thread-23-Job-parser-bolt] INFO  JobLoader c.t.c.w.b.JobParserBolt - execute EXIT

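Hundreds of log entries like these repeat with different part numbers and status codes. I want to store the distinct part numbers that have a non-201 status code in a separate file so that we can monitor them easily. For status 201, I only want a count of the successful posts. So my desired sample output should look like this: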

No. of partnumbers with Status 201: 1
Partnumbers with Status 500: 17696532, ... , ...
Partnumbers with Status 401: ... ,...

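I started with awk, but the parsing soon became difficult. Also note that the same part number appears several times. How can I add a check so that I don't count a part number more than once?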

My code so far:

awk -F'Enrichment data updated successful for partnumber :' '{print $2}' file.log | rev | cut -c 4- | rev

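My idea was to extract the part numbers first like this, but I can't work out how to add a check that avoids counting a part number more than once, or how to associate each part number with its status code.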

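Here is a solution using awk. It relies on arrays of arrays, so it needs GNU awk (gawk) 4.0 or later. See the inline comments for an explanation.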

awk '/Enrichment data updated successful for partnumber/ {
    # store the results as a multidimensional array with the first key
    # being the status and the key of the second array being the product
    # number. This removes duplicates because array keys must be unique
    arr[$NF][$(NF-4)]++
}
END {
    # iterate over the 201 status items and count them
    for (item in arr[201]) {
        count++
    }
    print "No. of partnumbers with Status 201: " count

    # iterate over the status array
    for (status in arr) {
        # skip 201 status
        if (status == 201)
            continue
        # join the array by "," for printing
        # taken from 
        joined = sep = ""
        for (product in arr[status]) {
            joined = joined sep product
            sep = ", "
        }

        print "Partnumbers with Status " status ": " joined
    }
}
' foo.log

This produces the following output for your sample log file, to which I added a few extra lines:

No. of partnumbers with Status 201: 1
Partnumbers with Status 401: 17623039
Partnumbers with Status 500: 17696532, 17696539
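
If gawk is not available, the same deduplication can probably be done in POSIX awk by using a composite `(status, part)` key in place of an array of arrays. The following is only a sketch under that assumption; the function name `summarize_status` is hypothetical, and it assumes the part number is always the fifth field from the end of the line, as in the sample log.

```shell
# Hypothetical helper: summarize_status LOGFILE
# POSIX awk only (no arrays of arrays); duplicates are dropped
# by remembering each (status, part) pair that has been seen.
summarize_status() {
    awk '/Enrichment data updated successful for partnumber/ {
        status = $NF        # last field, e.g. 201
        part = $(NF-4)      # fifth field from the end, e.g. 13794017

        # skip (status, part) pairs that were already recorded
        if ((status, part) in seen)
            next
        seen[status, part] = 1

        if (status + 0 == 201) {
            count++
        } else if (status in list) {
            list[status] = list[status] ", " part
        } else {
            list[status] = part
        }
    }
    END {
        print "No. of partnumbers with Status 201: " (count + 0)
        # note: the order of "for (x in array)" is unspecified in awk
        for (status in list)
            print "Partnumbers with Status " status ": " list[status]
    }' "$1"
}

# Usage: summarize_status file.log
```

Within one status, part numbers are listed in the order they first appear in the log. The status lines themselves may come out in any order, so pipe the output through `sort` if a stable order matters.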

An alternative without awk, using datamash and pee (from moreutils):

echo -n "No. of partnumbers with Status 201: " ; \
grep "status : " file.log | pee \
    'grep    ": 201" | datamash -W -s countunique 16'  \
    'grep -v ": 201" | datamash -W -s -g20 unique 16 | \
        sed "s/^[0-9]*/Partnumbers with Status &:/;s/,/, /g"'

Output (using the sample data from the OP):

No. of partnumbers with Status 201: 1
Partnumbers with Status 500:    17696532
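
The question also asks for the non-201 part numbers to be stored in a separate file. One way this might be done is to have awk write each distinct part number to a file named after its status. This is a minimal sketch, not a tested production script: the function name `split_by_status` and the `status_<code>.txt` file naming are assumptions made for illustration.

```shell
# Hypothetical helper: split_by_status LOGFILE OUTDIR
# Writes the distinct non-201 part numbers to OUTDIR/status_<code>.txt,
# one part number per line. Status 201 entries are ignored.
split_by_status() {
    mkdir -p "$2" || return 1
    awk -v outdir="$2" '/Enrichment data updated successful for partnumber/ {
        status = $NF        # last field, e.g. 500
        part = $(NF-4)      # fifth field from the end, e.g. 17696532

        # ignore successes and (status, part) pairs already written
        if (status + 0 == 201 || ((status, part) in seen))
            next
        seen[status, part] = 1

        # awk keeps each output file open, so later prints append to it
        print part > (outdir "/status_" status ".txt")
    }' "$1"
}

# Usage: split_by_status file.log ./failed
```

Each run truncates the per-status files it writes to, so the files always reflect the most recent log that was processed. Note that `-v` interprets backslash escapes, so an output directory containing backslashes would need different handling.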