logstash cloudfront codec plugin: Error: Object: #Version: 1.0 is not a legal argument to this wrapper, cause it doesn't respond to "read"

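Logstash version 1.5.0.1

I am trying to use the logstash s3 input plugin to download cloudfront logs and the cloudfront codec plugin to filter the stream.

I installed the cloudfront codec with bin/plugin install logstash-codec-cloudfront.

I get the following: Error: Object: #Version: 1.0 is not a legal argument to this wrapper, cause it doesn't respond to "read".

Here is the full error message from /var/logs/logstash/logstash.log: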
 {:timestamp=>"2015-08-05T13:35:20.809000-0400", :message=>"A plugin had an unrecoverable error. Will restart this plugin.\n  Plugin: <LogStash::Inputs::S3 bucket=>\"[BUCKETNAME]\", prefix=>\"cloudfront/\", region=>\"us-east-1\", type=>\"cloudfront\", secret_access_key=>\"[SECRETKEY]/1\", access_key_id=>\"[KEYID]\", sincedb_path=>\"/opt/logstash_input/s3/cloudfront/sincedb\", backup_to_dir=>\"/opt/logstash_input/s3/cloudfront/backup\", temporary_directory=>\"/var/lib/logstash/logstash\">\n  Error: Object: #Version: 1.0\n is not a legal argument to this wrapper, cause it doesn't respond to \"read\".", :level=>:error}

My logstash config file: /etc/logstash/conf.d/cloudfront.conf

input {
  s3 {
    bucket => "[BUCKETNAME]"
    delete => false
    interval => 60 # seconds
    prefix => "cloudfront/"
    region => "us-east-1"
    type => "cloudfront"
    codec => "cloudfront"
    secret_access_key => "[SECRETKEY]"
    access_key_id => "[KEYID]"
    sincedb_path => "/opt/logstash_input/s3/cloudfront/sincedb"
    backup_to_dir => "/opt/logstash_input/s3/cloudfront/backup"
    use_ssl => true
  }
}

I am successfully using a similar s3 input stream to get my cloudtrail logs into logstash, based on the answer from a Stack Overflow post.

CloudFront log file from s3 (I only included the header from the file):

 #Version: 1.0
 #Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type

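The header looks like it is basically in the correct format, based on lines 26-29 of cloudfront_spec.rb in the cloudfront plugin github repo and on the official AWS CloudFront Access Logs documentation.

Any ideas? Thanks!

[UPDATE 9/23/2015]

Based on this post I tried using the gzip_lines codec plugin, installed with bin/plugin install logstash-codec-gzip_lines, and parsing the file with a filter. Unfortunately I get the exact same error. It looks like the first character of the log file, #, is the problem.

For the record, here is the new attempt, including an updated pattern for parsing the cloudfront log file due to four new fields: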

/etc/logstash/conf.d/cloudfront.conf

input {
  s3 {
    bucket => "[BUCKETNAME]"
    delete => false
    interval => 60 # seconds
    prefix => "cloudfront/"
    region => "us-east-1"
    type => "cloudfront"
    codec => "gzip_lines"
    secret_access_key => "[SECRETKEY]"
    access_key_id => "[KEYID]"
    sincedb_path => "/opt/logstash_input/s3/cloudfront/sincedb"
    backup_to_dir => "/opt/logstash_input/s3/cloudfront/backup"
    use_ssl => true
  }
}
filter {
  grok {
    type => "cloudfront"
    pattern => "%{DATE_EU:date}\t%{TIME:time}\t%{WORD:x_edge_location}\t(?:%{NUMBER:sc_bytes}|-)\t%{IPORHOST:c_ip}\t%{WORD:cs_method}\t%{HOSTNAME:cs_host}\t%{NOTSPACE:cs_uri_stem}\t%{NUMBER:sc_status}\t%{GREEDYDATA:referrer}\t%{GREEDYDATA:User_Agent}\t%{GREEDYDATA:cs_uri_stem}\t%{GREEDYDATA:cookies}\t%{WORD:x_edge_result_type}\t%{NOTSPACE:x_edge_request_id}\t%{HOSTNAME:x_host_header}\t%{URIPROTO:cs_protocol}\t%{INT:cs_bytes}\t%{GREEDYDATA:time_taken}\t%{GREEDYDATA:x_forwarded_for}\t%{GREEDYDATA:ssl_protocol}\t%{GREEDYDATA:ssl_cipher}\t%{GREEDYDATA:x_edge_response_result_type}"
  }

  mutate {
    type => "cloudfront"
    add_field => [ "listener_timestamp", "%{date} %{time}" ]
  }

  date {
    type => "cloudfront"
    match => [ "listener_timestamp", "yy-MM-dd HH:mm:ss" ]
  }
}

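(This question should probably be marked as a duplicate, but until then I am copying my answer to the same question on ServerFault.)

I had the same problem. Changing

codec => "gzip_lines"

to

codec => "plain"

in the input fixed it for me. It looks like the S3 input automatically decompresses gzip files: https://github.com/logstash-plugins/logstash-input-s3/blob/master/lib/logstash/inputs/s3.rb#L13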
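One way to check this, as a minimal sketch rather than a tested setup: run the input with the plain codec and print every event to stdout with the rubydebug codec. If the s3 input is already decompressing the objects, the message field should contain readable log lines, beginning with the #Version and #Fields header lines. The bracketed values are placeholders, and the remaining options are assumed to be fine at their defaults.

```
input {
  s3 {
    bucket => "[BUCKET NAME]"
    prefix => "CloudFront/"
    region => "us-east-1"
    codec => "plain"
    secret_access_key => "[SECRETKEY]"
    access_key_id => "[KEYID]"
  }
}

output {
  # Print each event in full so the raw "message" field can be inspected.
  stdout { codec => rubydebug }
}
```

If the messages come out as readable text, adding a gzip-aware codec on top is unnecessary.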

FTR, here is the full config that worked for me:

input {
  s3 {
    bucket => "[BUCKET NAME]"
    delete => false
    interval => 60 # seconds
    prefix => "CloudFront/"
    region => "us-east-1"
    type => "cloudfront"
    codec => "plain"
    secret_access_key => "[SECRETKEY]"
    access_key_id => "[KEYID]"
    sincedb_path => "/opt/logstash_input/s3/cloudfront/sincedb"
    backup_to_dir => "/opt/logstash_input/s3/cloudfront/backup"
    use_ssl => true
  }
}

filter {
        if [type] == "cloudfront" {
                if ( ("#Version: 1.0" in [message]) or ("#Fields: date" in [message])) {
                        drop {}
                }

                grok {
                        match => { "message" => "%{DATE_EU:date}\t%{TIME:time}\t%{WORD:x_edge_location}\t(?:%{NUMBER:sc_bytes}|-)\t%{IPORHOST:c_ip}\t%{WORD:cs_method}\t%{HOSTNAME:cs_host}\t%{NOTSPACE:cs_uri_stem}\t%{NUMBER:sc_status}\t%{GREEDYDATA:referrer}\t%{GREEDYDATA:User_Agent}\t%{GREEDYDATA:cs_uri_query}\t%{GREEDYDATA:cookies}\t%{WORD:x_edge_result_type}\t%{NOTSPACE:x_edge_request_id}\t%{HOSTNAME:x_host_header}\t%{URIPROTO:cs_protocol}\t%{INT:cs_bytes}\t%{GREEDYDATA:time_taken}\t%{GREEDYDATA:x_forwarded_for}\t%{GREEDYDATA:ssl_protocol}\t%{GREEDYDATA:ssl_cipher}\t%{GREEDYDATA:x_edge_response_result_type}" }
                }

                mutate {
                        add_field => [ "received_at", "%{@timestamp}" ]
                        add_field => [ "listener_timestamp", "%{date} %{time}" ]
                }

                date {
                        match => [ "listener_timestamp", "yy-MM-dd HH:mm:ss" ]
                }

                date {
                        locale => "en"
                        timezone => "UCT"
                        match => [ "listener_timestamp", "yy-MM-dd HH:mm:ss" ]
                        target => "@timestamp"
                        add_field => { "debug" => "timestampMatched"}
                }
        }
}
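
The header check in the filter above relies on two substring tests. A slightly tighter variant, offered as a sketch and not something from my running setup, drops every event whose message starts with #, using a regex conditional. It assumes that the header lines (#Version and #Fields) are the only lines in a CloudFront log file that begin with #, which matches the sample header in the question.

```
filter {
  if [type] == "cloudfront" {
    # Assumption: only the "#Version" and "#Fields" header lines start with "#".
    if [message] =~ /^#/ {
      drop { }
    }
  }
}
```

Placed before the grok block, this keeps the header lines from reaching grok at all, so they cannot end up tagged with _grokparsefailure.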