Bringing in single and multi-line App log records to ELK (some contain JSON objects)

I'm trying to bring in log records from a custom (Node.js) application, getting the data into Elasticsearch to then be worked on by Kibana. My environment is Ubuntu with ELK (Elasticsearch, Logstash, and Kibana); the log-generating application runs on Node.js.

I'm already handling standard system log files such as syslog and nginx. The ELK environment and the app are on different servers.

Since this is a log file created by our application, it contains entries with a variety of patterns. But every entry does start with a common header [example - 2015-03-17T11:26:27.285Z (INFO|dev3) Creating interaction document...], which is [date+timestamp (msg-level|system-ID) some message text].

Often that is the entire log entry. But sometimes it can be followed by a JSON object, and depending on the message generated it may be a different JSON object. If a JSON object is included (starting on the next line), the line ends in "..." (minus the quotes), but not all lines ending this way have a JSON object following them.

As a first step, I'll settle for bringing the entire multi-line JSON object in as part of the message. Right now I'm using a syslog filter, and every line comes in as a separate message. My end goal, though, is to parse the JSON object and store its fields individually, so Kibana can filter cleanly on their respective values.

From what I've seen so far, there are two ways to do this.

My first question: which approach is the most flexible in the long run? Creating a multi-line filter and importing the JSON object as a single message is probably the fastest. But if writing directly to Elasticsearch makes it easier to bring in the different JSON objects and expose their individual fields for filtering, that may better serve my long-term goal.

I've included some dummy sample log data below to show what I'm dealing with.

Thanks.

2015-03-17T11:26:27.285Z (INFO|dev3) Creating interaction document...
{ req_url: '/nL4sWsw',
  company_cookie: '68d1dc4a32ed3bfd22c96a6e60a132924e5d8fa8',
  browsing_cookie: '68d1dc4a32ed3bfd22c96a6e60a132924e5d8fa8',
  visit_count: 1,
  campaign_id: 52d6ab20bbc1e6ac0500032f,
  switchboard_id: 54888c6ffc4ac2cb18a3b8c6,
  content_key: '2d0515120561b7be80c936027f6dce71b41a0391',
  http_header: 
   { 'x-host': 'subdomain.company.org',
     'x-ip': '111.222.333.444',
     host: 'logic9',
     connection: 'close',
     'user-agent': 'Mozilla/5.0 (compatible; ext-monitor - premium monitoring service; http://www.ext-monitor.com)' },
  timestamp: Tue Mar 17 2015 06:26:27 GMT-0500 (CDT),
  url: 'https://cdn.company.org/2d0515120561b7be80c936027f6dce71b41a0391/',
  type7id: 'nL4sWsw',
  pid: undefined,
  onramp_type: 'type7',
  http_user_agent: 'Other',
  http_browser: 'Other' }
2015-03-17T11:26:27.285Z (INFO|dev3) Inserting interactions data...
{ 'statistics.total_interactions': 1,
  'statistics.day_of_week.tuesday': 1,
  'statistics.onramp_type.type7': 1,
  'statistics.hour_of_day.11': 1,
  'statistics.operating_systems.other': 1,
  'statistics.browser_types.other': 1 }
2015-03-17T11:26:27.286Z (INFO|dev3) Updating campaign 52d6ab20bbc1e6ac0500032f with stats {"statistics.total_interactions":1,"statistics.day_of_week.tuesday":1,"statistics.onramp_type.type7":1,"statistics.hour_of_day.11":1,"statistics.operating_systems.other":1,"statistics.browser_types.other":1} ...
2015-03-17T11:26:27.286Z (INFO|dev3) Redirecting to https://cdn.company.org/2d0515120561b7be80c936027f6dce71b41a0391/ ...
2015-03-17T11:26:27.286Z (INFO|dev3) Campaign statistics recorded successfully
2015-03-17T11:26:27.287Z (INFO|dev3) GET /zVoxiPV
2015-03-17T11:26:27.287Z (INFO|dev3) GET /vumkm3A
2015-03-17T11:26:27.287Z (INFO|dev3) Starting response for type7v1 ...
2015-03-17T11:26:27.287Z (INFO|dev3) Header: {"x-host":"subdomain.company.org","x-ip":"111.222.333.444","host":"logic9","connection":"close","user-agent":"Mozilla/5.0 (compatible; ext-monitor - premium monitoring service; http://www.ext-monitor.com)"}
2015-03-17T11:26:27.287Z (INFO|dev3) Params: {"tid":"zVoxiPV"}
2015-03-17T11:26:27.287Z (INFO|dev3) Sending taIdentity cookie: f79b8ceca66f99608fb1291ab51d65b08fa3138f ...
2015-03-17T11:26:27.287Z (INFO|dev3) Sending taBrowse cookie: f79b8ceca66f99608fb1291ab51d65b08fa3138f ...
2015-03-17T11:26:27.287Z (INFO|dev3) Sending new cookie: 96ec5414d0b847790f58a1feee2399d282cf7907 with visit count 1 ...
2015-03-17T11:26:27.288Z (INFO|dev3) Finding in switchboard {"active":true,"campaign.start_at":{"$lte":"2015-03-17T11:26:27.287Z"},"campaign.end_at":{"$gte":"2015-03-17T11:26:27.287Z"},"type7id":"zVoxiPV"}
2015-03-17T11:26:27.288Z (INFO|dev3) Starting response for type7v1 ...
2015-03-17T11:26:27.288Z (INFO|dev3) Header: {"x-host":"subdomain.company.org","x-ip":"111.222.333.444","host":"logic9","connection":"close","user-agent":"Mozilla/5.0 (compatible; ext-monitor - premium monitoring service; http://www.ext-monitor.com)"}
2015-03-17T11:26:27.288Z (INFO|dev3) Params: {"tid":"vumkm3A"}
2015-03-17T11:26:27.288Z (INFO|dev3) Sending taIdentity cookie: adec72a656ef7999d101edc7e1e9cf901e1e56c9 ...
2015-03-17T11:26:27.288Z (INFO|dev3) Sending taBrowse cookie: adec72a656ef7999d101edc7e1e9cf901e1e56c9 ...
2015-03-17T11:26:27.288Z (INFO|dev3) Sending new cookie: 0c1354b30bf261595bf24a14c2e90ecac64545ed with visit count 1 ...
2015-03-17T11:26:27.288Z (INFO|dev3) Finding in switchboard {"active":true,"campaign.start_at":{"$lte":"2015-03-17T11:26:27.288Z"},"campaign.end_at":{"$gte":"2015-03-17T11:26:27.288Z"},"type7id":"vumkm3A"}
2015-03-17T11:26:27.289Z (INFO|dev3) Finding in matching set [object Object]
2015-03-17T11:26:27.289Z (INFO|dev3) Switchboard item {"_id":"5488a7ea60c5508693bebba7","content_provider":"redirect","content":{"_id":"54b8954eca0ca5eb87cb4fef","name":"Content for Switchboard 5488a7ea60c5508693bebba7","key":"ad354806eadd0f90ef55b1ab96a8c84272401186"},"type":"redirect","campaign":{"end_at":"2018-12-11T00:00:00.000Z","start_at":"2008-12-11T00:00:00.000Z","_id":"52a9dd9bfb9c94150600032f"}}
2015-03-17T11:26:27.289Z (INFO|dev3) No url for redirect, going local...
2015-03-17T11:26:27.289Z (INFO|dev3) url: https://cdn.company.org/ad354806eadd0f90ef55b1ab96a8c84272401186/
2015-03-17T11:26:27.289Z (INFO|dev3) Sending redirect to https://cdn.company.org/ad354806eadd0f90ef55b1ab96a8c84272401186/ ...
2015-03-17T11:26:27.289Z (INFO|dev3) Creating interaction document...
{ req_url: '/zVoxiPV',
  company_cookie: 'f79b8ceca66f99608fb1291ab51d65b08fa3138f',
  browsing_cookie: 'f79b8ceca66f99608fb1291ab51d65b08fa3138f',
  visit_count: 1,
  campaign_id: 52a9dd9bfb9c94150600032f,
  switchboard_id: 5488a7ea60c5508693bebba7,
  content_key: 'ad354806eadd0f90ef55b1ab96a8c84272401186',
  http_header: 
   { 'x-host': 'subdomain.company.org',
     'x-ip': '111.222.333.444',
     host: 'logic9',
     connection: 'close',
     'user-agent': 'Mozilla/5.0 (compatible; ext-monitor - premium monitoring service; http://www.ext-monitor.com)' },
  timestamp: Tue Mar 17 2015 06:26:27 GMT-0500 (CDT),
  url: 'https://cdn.company.org/ad354806eadd0f90ef55b1ab96a8c84272401186/',
  type7id: 'zVoxiPV',
  pid: undefined,
  onramp_type: 'type7',
  http_user_agent: 'Other',
  http_browser: 'Other' }

Forget writing an application to push the logs into Elasticsearch; you're just reinventing the wheel. Logstash can do this, you just need to do a little reading on how to make it do what you want. When you pass a JSON-encoded message through Logstash's json filter, it pulls out the key/value pairs, and once the event is sent to Elasticsearch that data is indexed and searchable.
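For instance, in the simplest case where the whole message is JSON, the filter is just (a minimal sketch):

filter {
  json {
    # parse the message field; every key/value pair in the JSON becomes
    # a field on the event, indexed and searchable once it reaches Elasticsearch
    source => "message"
  }
}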

I'd suggest there are just a couple of things you need to do first. Put in a multiline filter to get the JSON-encoded data onto the same line. I've only ever used the multiline filter to rejoin lines that have a single identifying feature you can match at the start/end of a line. I don't see one in your case, but I think you can chain two multiline filters together:

filter {
  multiline {
    # this one will look for any line starting with whitespace and join it to the previous line
    what => "previous"
    pattern => "^\s"
  }
  multiline {
    # this one will look for any line starting with { and join it to the previous line
    what => "previous"
    pattern => "^\{"
  }
}

After the multiline filters I would use a grok filter. This can be used to pull out the date and any other parts of the message, and you should be able to use it to capture the JSON-encoded part of the message into a field of its own; once it's captured in a field, you can run that field through the json filter.
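Something along these lines might be a starting point (untested; the field names log_ts, log_level, system_id, log_msg, log_json and app_data are placeholders I picked, not anything your setup requires):

filter {
  grok {
    # header is [date+timestamp (msg-level|system-ID) some message text],
    # optionally followed by a JSON object on the next lines; (?m) lets the
    # pattern span the newlines the multiline join leaves in the message
    match => {
      "message" => "(?m)%{TIMESTAMP_ISO8601:log_ts} \(%{LOGLEVEL:log_level}\|%{WORD:system_id}\) %{DATA:log_msg}(?:\n(?<log_json>\{.*\}))?\s*\Z"
    }
  }
  if [log_json] {
    json {
      # parse the captured object into fields of its own, kept under app_data
      source => "log_json"
      target => "app_data"
    }
  }
}

One caveat: the objects in your samples look like Node's util.inspect output rather than strict JSON (single-quoted strings, unquoted keys, bare values like undefined), so the json filter will fail on them until the application emits real JSON.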

I have a lot of experience with Logstash and the multiline filter, and I can tell you it is very fragile and hard to debug when things go wrong.

Logstash can ingest JSON without any problems if you remove all the newlines and make sure it is proper JSON. So my advice is to make sure the application writes its JSON in a way that is easy to ingest with Logstash, especially since this is a custom application.
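For example, if the application wrote each record as a single line of valid JSON, a plain file input with the json codec would hand you the fields with no multiline or grok work at all (a sketch; the path is a placeholder):

input {
  file {
    # placeholder path; point this at the application's log file
    path => "/var/log/myapp/app.log"
    # each line is decoded as JSON and its keys become event fields
    codec => "json"
  }
}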