扩展 apache 日志的 logstash 过滤器定义
logstash filter definition for an extended apache log
我正在尝试为扩展的 apache 日志过滤器定义配置 logstash 过滤器。它基本上是带有一些附加字段的 'combined' LogFormat,这里是 apache 日志格式定义:
LogFormat "%h %{X-LB-Client-IP}i %l %u %m %t \"%{Host}i\" \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D" combinextended
这是一个示例日志文件内容:
12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729
我配置了 logstash-forward 来发送文件:
{
"paths": [
"/var/log/mysite/extended.log",
"/var/log/myothersite/extended.log"
],
"fields": { "type": "apache-extended" }
}
我在 /etc/logstash/conf.d
中的一个名为 13-apache-extended.conf
的文件中使用 grok 模式配置了 logstash 服务器:
filter {
if [type] == "apache-extended" {
grok {
match => { "message" => "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" }
}
}
}
我在 https://grokdebug.herokuapp.com/ 测试过它似乎没问题:
日志示例:
12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729
模式:
%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}
但是当我在我的主服务器上重新启动 logstash 时,我得到了一个错误:
{:timestamp=>"2015-05-06T18:36:28.846000+0200", :message=>"Exception in lumberjack input", :exception=>#<LogStash::ShutdownSignal: LogStash::ShutdownSignal>, :level=>:error}
{:timestamp=>"2015-05-06T18:36:44.342000+0200", :message=>"Error: Expected one of #, {, } at line 35, column 142 (byte 969) after filter {\n if [type] == \"apache-extended\" {\n grok {\n match => { \"message\" => \"%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \""}
{:timestamp=>"2015-05-06T18:36:44.349000+0200", :message=>"You may be interested in the '--configtest' flag which you can\nuse to validate logstash's configuration before you choose\nto restart a running system."}
非常感谢任何想法。
谢谢。
您想阅读 apache logs.I 认为您可以尝试下面的代码。
input {
file {
path => "/var/log/apache2/*.log"
type => "apache"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
if [path] =~ "access" {
mutate { replace => { "type" => "apache_access" } }
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z"]
}
}
output {
elasticsearch { hosts => localhost }
stdout { codec => rubydebug }
}
我已经测试了你的问题,并为你提供了 2 种可能的解决方案。
- 你确定你的伐木工人配置正确吗?您是否使用基本日志文件检查过它?
您确定配置中的模式与 post 相同吗?因为我注意到你的模式 post 和错误输出不一样。
%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}
!=
%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"}
您不能在语义模式中输入 enter,您需要 \ 每个特殊字符。
此错误(“错误:应为第 35 行第 142 列(字节 969)中的 #、{、} 之一)通常意味着您在那个地方有语法错误,例如当您忘记转义特殊字符。
我在没有 lumberjack 的情况下测试了您的配置,一切正常。
日志示例:
12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729
配置:
input {
file {
path => "d:/Git/LogstashELKElision/logstash/bin/log/test.log"
type => extendedapache
}}
filter {
if [type] == "extendedapache" {
grok {
match => [ "message", "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" ]
}
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}input {
file {
path => "d:/Git/LogstashELKElision/logstash/bin/log/test.log"
type => extendedapache
}}
filter {
if [type] == "extendedapache" {
grok {
match => [ "message", "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" ]
}
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}
我正在尝试为扩展的 apache 日志过滤器定义配置 logstash 过滤器。它基本上是带有一些附加字段的 'combined' LogFormat,这里是 apache 日志格式定义:
LogFormat "%h %{X-LB-Client-IP}i %l %u %m %t \"%{Host}i\" \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D" combinextended
这是一个示例日志文件内容:
12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729
我配置了 logstash-forward 来发送文件:
{
"paths": [
"/var/log/mysite/extended.log",
"/var/log/myothersite/extended.log"
],
"fields": { "type": "apache-extended" }
}
我在 /etc/logstash/conf.d
中的一个名为 13-apache-extended.conf
的文件中使用 grok 模式配置了 logstash 服务器:
filter {
if [type] == "apache-extended" {
grok {
match => { "message" => "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" }
}
}
}
我在 https://grokdebug.herokuapp.com/ 测试过它似乎没问题:
日志示例:
12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729
模式:
%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}
但是当我在我的主服务器上重新启动 logstash 时,我得到了一个错误:
{:timestamp=>"2015-05-06T18:36:28.846000+0200", :message=>"Exception in lumberjack input", :exception=>#<LogStash::ShutdownSignal: LogStash::ShutdownSignal>, :level=>:error}
{:timestamp=>"2015-05-06T18:36:44.342000+0200", :message=>"Error: Expected one of #, {, } at line 35, column 142 (byte 969) after filter {\n if [type] == \"apache-extended\" {\n grok {\n match => { \"message\" => \"%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \""}
{:timestamp=>"2015-05-06T18:36:44.349000+0200", :message=>"You may be interested in the '--configtest' flag which you can\nuse to validate logstash's configuration before you choose\nto restart a running system."}
非常感谢任何想法。
谢谢。
您想阅读 apache logs.I 认为您可以尝试下面的代码。
input {
file {
path => "/var/log/apache2/*.log"
type => "apache"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
if [path] =~ "access" {
mutate { replace => { "type" => "apache_access" } }
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z"]
}
}
output {
elasticsearch { hosts => localhost }
stdout { codec => rubydebug }
}
我已经测试了你的问题,并为你提供了 2 种可能的解决方案。
- 你确定你的伐木工人配置正确吗?您是否使用基本日志文件检查过它?
您确定配置中的模式与 post 相同吗?因为我注意到你的模式 post 和错误输出不一样。
%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}
!=
%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"}
您不能在语义模式中输入 enter,您需要 \ 每个特殊字符。
此错误(“错误:应为第 35 行第 142 列(字节 969)中的 #、{、} 之一)通常意味着您在那个地方有语法错误,例如当您忘记转义特殊字符。
我在没有 lumberjack 的情况下测试了您的配置,一切正常。
日志示例:
12.123.456.789 122.123.122.133 - - GET [06/May/2015:18:42:41 +0200] "www.example.com" "GET /fr-fr/test/content/ HTTP/1.1" 200 14023 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121718 Gentoo Firefox/3.0.5" 7729
配置:
input {
file {
path => "d:/Git/LogstashELKElision/logstash/bin/log/test.log"
type => extendedapache
}}
filter {
if [type] == "extendedapache" {
grok {
match => [ "message", "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" ]
}
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}input {
file {
path => "d:/Git/LogstashELKElision/logstash/bin/log/test.log"
type => extendedapache
}}
filter {
if [type] == "extendedapache" {
grok {
match => [ "message", "%{IPORHOST:proxyip} %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{WORD:method} \[%{HTTPDATE:timestamp}\] \"%{IPORHOST:host}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \"%{GREEDYDATA:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:responsetime}" ]
}
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}