无法连接到退避(异步(tcp://ip:5044)):拨号 tcp ip:5044:i/o 超时
Failed to connect to backoff(async(tcp://ip:5044)): dial tcp ip:5044: i/o timeout
Filebeat 在机器 B 上 运行ning,它读取日志并推送到机器 A 上的 ELK logstash。
但是在机器B的filebeat日志中,显示错误i/o timeout
2019-08-24T12:13:10.065+0800 ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://example.com:5044)): dial tcp xx.xx.xx.xx:5044: i/o timeout
2019-08-24T12:13:10.065+0800 INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://example.com:5044)) with 1 reconnect attempt(s)
我检查了 运行 机器 A 上的 logstash,可以在 0 0.0.0.0:5044
上收听
这是logstash日志
[INFO ] 2019-08-24 12:09:35.217 [[main]-pipeline-manager] beats - Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
这里是 netstat
输出,
$ sudo netstat -tlnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:5044 0.0.0.0:* LISTEN 20668/java
我还检查了机器 A 上的防火墙是否关闭。
$ firewall-cmd --list-all
FirewallD is not running
$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy DROP)
target prot opt source destination
我也使用 telnet 连接机器 A,但是我明白了,
$ telnet example.com 5044
Trying xx.xx.xx.xx...
telnet: connect to address xx.xx.xx.xx: Connection timed out
我运行在机器A(本地)上配置相同的filebeat来检查它在机器B(远程)上的filebeat配置是错误的,它工作正常。
2019-08-24T14:17:35.195+0800 INFO pipeline/output.go:95 Connecting to backoff(async(tcp://localhost:5044))
2019-08-24T14:17:35.198+0800 INFO pipeline/output.go:105 Connection to backoff(async(tcp://localhost:5044)) established
最后发现是VPS阿里云提供商造成的,只开放了22、80443等常用端口
我需要登录阿里云VPS管理页面,打开5044让VPSProvider绕过5044端口
*注:*附:ELK配置filebeat时遇到的一些其他问题
**问题 1:**无法连接到退避(异步(tcp://ip:5044)):拨打 tcp ip:5044:连接:连接被拒绝
2019-08-26T10:25:41.955+0800 ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://example.com:5044)): dial tcp xx.xx.xx.xx:5044: connect: connection refused
2019-08-26T10:25:41.955+0800 INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://example:5044)) with 2 reconnect attempt(s)
问题 2: 发布事件失败,原因是:写入 tcp ip:46890->ip:5044:写入:对等方重置连接
2019-08-26T10:28:32.274+0800 ERROR logstash/async.go:256 Failed to publish events caused by: write tcp xx.xx.xx.xx:46890->xx.xx.xx.xx:5044: write: connection reset by peer
2019-08-26T10:28:33.311+0800 ERROR pipeline/output.go:121 Failed to publish events: write tcp xx.xx.xx.xx:46890->xx.xx.xx.xx:5044: write: connection reset by peer
问题 3: Filebeat 错误:lumberjack 协议错误和 Logstash error: OPENSSL_internal:WRONG_VERSION_NUMBER
Filebeat 日志错误,
2019-08-26T08:49:09.505+0800 INFO pipeline/output.go:95 Connecting to backoff(async(tcp://example.com:5044))
2019-08-26T08:49:09.588+0800 INFO pipeline/output.go:105 Connection to backoff(async(tcp://example.com:5044)) established
2019-08-26T08:49:09.605+0800 ERROR logstash/async.go:256 Failed to publish events caused by: lumberjack protocol error
2019-08-26T08:49:09.606+0800 ERROR logstash/async.go:256 Failed to publish events caused by: client is not connected
Logstash 日志,
[INFO ] 2019-08-26 08:49:29.444 [defaultEventExecutorGroup-4-2] BeatsHandler - [local: 0.0.0.0:5044, remote: undefined] Handling exception: javax.net.ssl.SSLHandshakeException: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[WARN ] 2019-08-26 08:49:29.445 [nioEventLoopGroup-2-7] DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
...
这三个问题都是配置不当引起的,这里是可行的配置,
logstash 版本,
/usr/share/logstash/bin/logstash -V
logstash 7.3.1
filebeat 版本,
/usr/share/filebeat/bin/filebeat version
filebeat version 7.3.1 (amd64), libbeat 7.3.1 [a4be71b90ce3e3b8213b616adfcd9e455513da45 built 2019-08-19 19:30:50 +0000 UTC]
logstash 配置文件 /etc/logstash/conf.d/beat.conf
input {
beats {
port => 5044
ssl => true
ssl_certificate_authorities => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
ssl_verify_mode => "peer"
}
}
output {
elasticsearch {
hosts => "http://127.0.0.1:9200"
manage_template => false
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
document_type => "%{[@metadata][type]}"
}
}
filebeat 配置文件 /etc/filebeat/filebeat.yml
#=========================== Filebeat inputs =============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /data/error_logs/Log_error_201908
#----------------------------- Logstash output --------------------------------
output.logstash:
# The Logstash hosts
hosts: ["example.com:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
ssl.certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
# Certificate for SSL client authentication
ssl.certificate: "/etc/pki/tls/certs/logstash-forwarder.crt"
# Client Certificate Key
ssl.key: "/etc/pki/tls/private/logstash-forwarder.key"
Filebeat 在机器 B 上 运行ning,它读取日志并推送到机器 A 上的 ELK logstash。
但是在机器B的filebeat日志中,显示错误i/o timeout
2019-08-24T12:13:10.065+0800 ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://example.com:5044)): dial tcp xx.xx.xx.xx:5044: i/o timeout
2019-08-24T12:13:10.065+0800 INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://example.com:5044)) with 1 reconnect attempt(s)
我检查了 运行 机器 A 上的 logstash,可以在 0 0.0.0.0:5044
这是logstash日志
[INFO ] 2019-08-24 12:09:35.217 [[main]-pipeline-manager] beats - Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
这里是 netstat
输出,
$ sudo netstat -tlnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:5044 0.0.0.0:* LISTEN 20668/java
我还检查了机器 A 上的防火墙是否关闭。
$ firewall-cmd --list-all
FirewallD is not running
$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy DROP)
target prot opt source destination
我也使用 telnet 连接机器 A,但是我明白了,
$ telnet example.com 5044
Trying xx.xx.xx.xx...
telnet: connect to address xx.xx.xx.xx: Connection timed out
我运行在机器A(本地)上配置相同的filebeat来检查它在机器B(远程)上的filebeat配置是错误的,它工作正常。
2019-08-24T14:17:35.195+0800 INFO pipeline/output.go:95 Connecting to backoff(async(tcp://localhost:5044))
2019-08-24T14:17:35.198+0800 INFO pipeline/output.go:105 Connection to backoff(async(tcp://localhost:5044)) established
最后发现是VPS阿里云提供商造成的,只开放了22、80443等常用端口
我需要登录阿里云VPS管理页面,打开5044让VPSProvider绕过5044端口
*注:*附:ELK配置filebeat时遇到的一些其他问题
**问题 1:**无法连接到退避(异步(tcp://ip:5044)):拨打 tcp ip:5044:连接:连接被拒绝
2019-08-26T10:25:41.955+0800 ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://example.com:5044)): dial tcp xx.xx.xx.xx:5044: connect: connection refused
2019-08-26T10:25:41.955+0800 INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://example:5044)) with 2 reconnect attempt(s)
问题 2: 发布事件失败,原因是:写入 tcp ip:46890->ip:5044:写入:对等方重置连接
2019-08-26T10:28:32.274+0800 ERROR logstash/async.go:256 Failed to publish events caused by: write tcp xx.xx.xx.xx:46890->xx.xx.xx.xx:5044: write: connection reset by peer
2019-08-26T10:28:33.311+0800 ERROR pipeline/output.go:121 Failed to publish events: write tcp xx.xx.xx.xx:46890->xx.xx.xx.xx:5044: write: connection reset by peer
问题 3: Filebeat 错误:lumberjack 协议错误和 Logstash error: OPENSSL_internal:WRONG_VERSION_NUMBER
Filebeat 日志错误,
2019-08-26T08:49:09.505+0800 INFO pipeline/output.go:95 Connecting to backoff(async(tcp://example.com:5044))
2019-08-26T08:49:09.588+0800 INFO pipeline/output.go:105 Connection to backoff(async(tcp://example.com:5044)) established
2019-08-26T08:49:09.605+0800 ERROR logstash/async.go:256 Failed to publish events caused by: lumberjack protocol error
2019-08-26T08:49:09.606+0800 ERROR logstash/async.go:256 Failed to publish events caused by: client is not connected
Logstash 日志,
[INFO ] 2019-08-26 08:49:29.444 [defaultEventExecutorGroup-4-2] BeatsHandler - [local: 0.0.0.0:5044, remote: undefined] Handling exception: javax.net.ssl.SSLHandshakeException: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[WARN ] 2019-08-26 08:49:29.445 [nioEventLoopGroup-2-7] DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
...
这三个问题都是配置不当引起的,这里是可行的配置,
logstash 版本,
/usr/share/logstash/bin/logstash -V
logstash 7.3.1
filebeat 版本,
/usr/share/filebeat/bin/filebeat version
filebeat version 7.3.1 (amd64), libbeat 7.3.1 [a4be71b90ce3e3b8213b616adfcd9e455513da45 built 2019-08-19 19:30:50 +0000 UTC]
logstash 配置文件 /etc/logstash/conf.d/beat.conf
input {
beats {
port => 5044
ssl => true
ssl_certificate_authorities => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
ssl_verify_mode => "peer"
}
}
output {
elasticsearch {
hosts => "http://127.0.0.1:9200"
manage_template => false
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
document_type => "%{[@metadata][type]}"
}
}
filebeat 配置文件 /etc/filebeat/filebeat.yml
#=========================== Filebeat inputs =============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /data/error_logs/Log_error_201908
#----------------------------- Logstash output --------------------------------
output.logstash:
# The Logstash hosts
hosts: ["example.com:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
ssl.certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
# Certificate for SSL client authentication
ssl.certificate: "/etc/pki/tls/certs/logstash-forwarder.crt"
# Client Certificate Key
ssl.key: "/etc/pki/tls/private/logstash-forwarder.key"