Need a logstash-conf file to extract the count of different strings in a log file

How do I write a Logstash configuration file that separates two different strings (S:Info & S:Warn) out of a log file and displays their respective counts in Kibana?

I tried the 'grep' filter in Logstash, but I'm not sure how to get the counts of the two different strings (Info and Warn) in Kibana.

Here's a snippet of the log file:

Apr 23 21:34:07 LogPortSysLog: T:2015-04-23T21:34:07.276 N:933086 S:Info P:WorkerThread0#783 F:USBStrategyBaseAbs.cpp:724 D:T1T: Power request disabled for this cable. Defaulting to 1000mA
Apr 23 21:34:10 LogPortSysLog: T:2015-04-23T21:34:10.570 N:933087 S:Warn P:DasInterfaceThread#791 F:USBStrategyBaseAbs.cpp:1696 D:CP_CONTROL:Unexpected DasChildTag: 27 B:{}

You need the grok filter. I didn't necessarily get the full format right, but these are my guesses:

Apr 23 21:34:07 LogPortSysLog: T:2015-04-23T21:34:07.276 N:933086 S:Info P:WorkerThread0#783 F:USBStrategyBaseAbs.cpp:724 D:T1T: Power request disabled for this cable. Defaulting to 1000mA

This translates into:

LOG_TIMESTAMP LOG_NAME: T:ACTUAL_TIMESTAMP N:LOGGED_EVENT_NUMBER S:SEVERITY P:THREAD_NAME F:FILENAME:LINE_NUMBER D:MESSAGE

I seem to pick up some extra information in MESSAGE, but this should get you started.

Files:

  1. data.log contains the two lines from your input.
  2. portlogs.conf contains the Logstash configuration used to parse the logs.

    input {
      # You can change this to the file/other inputs
      stdin { }
    }
    
    filter {
      grok {
        # "message" is the field name filled in by most inputs with the
        #  current line to parse
        match => [
          "message",
          "%{SYSLOGTIMESTAMP} %{DATA:name}: T:%{TIMESTAMP_ISO8601:timestamp} N:%{INT:log_number:int} S:%{DATA:severity} P:%{DATA:thread} F:%{DATA:filename}:%{INT:line_number:int} D:%{GREEDYDATA:log_message}"
        ]
      }
      date {
        # Note: I throw away the log's timestamp and use the message timestamp,
        #  which may not be true for all of your logs!
        match => [ "timestamp", "ISO8601" ]
        remove_field => [ "timestamp" ]
      }
    }
    
    output {
      # Change this to go to your Elasticsearch cluster
      stdout {
        codec => rubydebug
      }
    }
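
The stdin input and rubydebug output above are just for testing. Here is a minimal sketch of the swapped-in plugins for real use (the path is hypothetical, and the elasticsearch options match Logstash 1.5; later versions rename host to hosts):

    input {
      file {
        # Hypothetical path; point this at your actual log file
        path => "/var/log/data.log"
        # Read existing content from the top, not just newly appended lines
        start_position => "beginning"
      }
    }
    
    output {
      elasticsearch {
        # Assumes a local cluster reachable over HTTP
        host => "localhost"
        protocol => "http"
      }
    }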
    

Combining the two with Logstash, I got the output below (this is running Logstash 1.5 RC3, but RC4 comes out this week):
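
A run along these lines produces it (a sketch; it assumes both files sit in the current directory and you're running from the Logstash install directory):

    bin/logstash -f portlogs.conf < data.log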

{
        "message" => "Apr 23 21:34:07 LogPortSysLog: T:2015-04-23T21:34:07.276 N:933086 S:Info P:WorkerThread0#783 F:USBStrategyBaseAbs.cpp:724 D:T1T: Power request disabled for this cable. Defaulting to 1000mA",
       "@version" => "1",
     "@timestamp" => "2015-04-24T01:34:07.276Z",
           "host" => "Chriss-MBP-2",
           "name" => "LogPortSysLog",
     "log_number" => 933086,
       "severity" => "Info",
         "thread" => "WorkerThread0#783",
       "filename" => "USBStrategyBaseAbs.cpp",
    "line_number" => 724,
    "log_message" => "T1T: Power request disabled for this cable. Defaulting to 1000mA"
}
{
        "message" => "Apr 23 21:34:10 LogPortSysLog: T:2015-04-23T21:34:10.570 N:933087 S:Warn P:DasInterfaceThread#791 F:USBStrategyBaseAbs.cpp:1696 D:CP_CONTROL:Unexpected DasChildTag: 27 B:{}",
       "@version" => "1",
     "@timestamp" => "2015-04-24T01:34:10.570Z",
           "host" => "Chriss-MBP-2",
           "name" => "LogPortSysLog",
     "log_number" => 933087,
       "severity" => "Warn",
         "thread" => "DasInterfaceThread#791",
       "filename" => "USBStrategyBaseAbs.cpp",
    "line_number" => 1696,
    "log_message" => "CP_CONTROL:Unexpected DasChildTag: 27 B:{}"
}

If you configure the output correctly, those are the two documents that get sent to Elasticsearch. Grok patterns are just regular expressions, so you can absolutely create a pattern that specifically parses (or doesn't parse!) the inner parts of log_message, including ignoring things like the B:{} above. To ignore a part, simply don't provide a field name (e.g., :log_message gives the matched pattern the name log_message, so if you don't name it, the match is discarded).
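
For instance, here's a minimal sketch of a second grok pass over log_message (the das_message field name and the pattern itself are illustrative guesses based on the Warn line, not a definitive format):

    filter {
      grok {
        # Keep everything before " B:" as das_message and match the rest
        #  without a field name so it is discarded; lines without a " B:"
        #  section will be tagged _grokparsefailure by this pass
        match => [
          "log_message",
          "%{DATA:das_message} B:%{GREEDYDATA}"
        ]
      }
    }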

From there, just load up Kibana and create a Visualization. It will automatically pick up the fields above and make them searchable. For example, you could search for severity:warn to see only the log lines whose severity is "Warn" (case insensitive). To find exact matches, you can search against the automatically added severity.raw field, as in severity.raw:Warn, but users generally won't do that.
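
To get the actual counts, a terms aggregation on the severity field is what you want; this is also what a Kibana Visualization runs for you behind the scenes. A minimal sketch against the Elasticsearch HTTP API (assuming the default logstash-* index pattern):

    curl -XGET 'localhost:9200/logstash-*/_search?pretty' -d '{
      "size": 0,
      "aggs": {
        "severity_counts": {
          "terms": { "field": "severity.raw" }
        }
      }
    }'

In Kibana itself, the equivalent is, for example, a pie chart whose slices are split by a terms bucket on severity.raw, giving you one slice, and one count, per severity.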