How can I configure Envoy proxy to fail over when some upstreams are unreachable?
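I'm new to Envoy proxy. What I need is to use Envoy as a sidecar proxy between a gRPC client and its servers.
So far I have connected one gRPC client to two servers, with lb_policy set to ROUND_ROBIN. But when I shut down one of the servers, calls from the gRPC client fail.
So how can I configure Envoy to handle this situation?
Here is my Envoy config: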
admin:
  access_log_path: "/tmp/admin_access.log"
  address:
    socket_address:
      address: "10.19.17.188"
      port_value: 12000
static_resources:
  listeners:
    -
      name: "grpc-listener"
      address:
        socket_address:
          address: "10.19.17.188"
          port_value: 12001
      filter_chains:
        -
          filters:
            -
              name: "envoy.http_connection_manager"
              config:
                stat_prefix: "ingress"
                codec_type: "AUTO"
                route_config:
                  name: "grpc-route"
                  virtual_hosts:
                    -
                      name: "grpc-route"
                      domains:
                        - "*"
                      routes:
                        -
                          match:
                            prefix: "/"
                          route:
                            cluster: "grpc-service"
                http_filters:
                  -
                    name: "envoy.router"
  clusters:
    -
      name: "grpc-service"
      connect_timeout: "0.25s"
      type: "static"
      lb_policy: "ROUND_ROBIN"
      http2_protocol_options: {}
      hosts:
        -
          socket_address:
            address: "10.19.17.188"
            port_value: 12011
        -
          socket_address:
            address: "10.19.17.188"
            port_value: 12012
Error message from the Python gRPC client:
Traceback (most recent call last):
  File "greeter_client.py", line 40, in <module>
    run()
  File "greeter_client.py", line 32, in run
    response = stub.SayHello(helloworld_pb2.HelloRequest(name='%03d'%i))
  File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 565, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "upstream connect error or disconnect/reset before headers. reset reason: connection failure"
    debug_error_string = "{"created":"@1568810074.217216860","description":"Error received from peer ipv4:10.19.17.188:12001","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"upstream connect error or disconnect/reset before headers. reset reason: connection failure","grpc_status":14}"
Envoy log:
[2019-09-18 12:29:02.380][798][debug][pool] [source/common/http/conn_pool_base.cc:20] queueing request due to no available connections
[2019-09-18 12:29:02.380][798][debug][http] [source/common/http/conn_manager_impl.cc:1111] [C285][S679184262732628339] request end stream
[2019-09-18 12:29:02.380][798][debug][connection] [source/common/network/connection_impl.cc:561] [C286] delayed connection error: 111
[2019-09-18 12:29:02.380][798][debug][connection] [source/common/network/connection_impl.cc:190] [C286] closing socket: 0
[2019-09-18 12:29:02.380][798][debug][client] [source/common/http/codec_client.cc:82] [C286] disconnect. resetting 0 pending requests
[2019-09-18 12:29:02.380][798][debug][pool] [source/common/http/http2/conn_pool.cc:149] [C286] client disconnected
[2019-09-18 12:29:02.380][798][debug][router] [source/common/router/router.cc:868] [C285][S679184262732628339] upstream reset: reset reason connection failure
[2019-09-18 12:29:02.380][798][debug][http] [source/common/http/conn_manager_impl.cc:1186] [C285][S679184262732628339] Sending local reply with details upstream_reset_before_response_started{connection failure}
[2019-09-18 12:29:02.380][798][debug][http] [source/common/http/conn_manager_impl.cc:1378] [C285][S679184262732628339] encoding headers via codec (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '14'
'grpc-message', 'upstream connect error or disconnect/reset before headers. reset reason: connection failure'
'date', 'Wed, 18 Sep 2019 12:29:02 GMT'
'server', 'envoy'
What helped was adding outlier_detection to the cluster:
clusters:
  -
    name: "grpc-service"
    connect_timeout: "0.25s"
    type: "static"
    lb_policy: "ROUND_ROBIN"
    http2_protocol_options: {}
    hosts:
      -
        socket_address:
          address: "10.19.17.188"
          port_value: 12011
      -
        socket_address:
          address: "10.19.17.188"
          port_value: 12012
    outlier_detection: # this is what made the difference
      consecutive_5xx: 1
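As a rough sketch of how this could be tuned further, the outlier_detection block accepts a few more fields that control how often ejection is evaluated and how long a host stays ejected. The values below are illustrative assumptions and are not taken from the original setup. Note that outlier detection is passive: the request that first hits the dead server still fails, and only later requests are steered away from it.

```yaml
# Fragment to place under the "grpc-service" cluster entry.
outlier_detection:
  consecutive_5xx: 1          # eject a host after a single 5xx (connect failures count as 5xx by default)
  interval: "5s"              # assumed value: how often the ejection analysis runs
  base_ejection_time: "30s"   # assumed value: base ejection duration, multiplied by how many times the host has been ejected
  max_ejection_percent: 50    # assumed value: upper bound on the share of hosts that may be ejected
```

With only two hosts, max_ejection_percent mostly matters as a guard against ejecting both servers at once.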
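You can also use one of the following approaches:
Enable active health checking for your cluster in Envoy. For this you also have to expose a health check endpoint from your services, which should be fairly easy. See https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/core/health_check.proto
Switch to a dynamic Envoy configuration that uses additional components: a) a control plane such as go-control-plane, and b) a service discovery service such as consul. This will certainly add complexity to your service setup, but it also enables a more dynamic and robust solution.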
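For the active health checking option mentioned above, a minimal sketch of the cluster fragment might look like the following. It assumes the v2 API used elsewhere in this thread and assumes that both gRPC servers implement the standard grpc.health.v1.Health service; the timing and threshold values are placeholders, not recommendations.

```yaml
# Fragment to place under the "grpc-service" cluster entry.
health_checks:
  -
    timeout: "1s"              # assumed value: how long to wait for a health check response
    interval: "5s"             # assumed value: time between health checks
    unhealthy_threshold: 2     # assumed value: failed checks before a host is marked unhealthy
    healthy_threshold: 1       # assumed value: passing checks before a host is marked healthy again
    grpc_health_check: {}      # assumes the servers expose grpc.health.v1.Health
```

If the servers do not expose a health service, replacing grpc_health_check with tcp_health_check: {} should give a connect-only check, at the cost of only detecting that the port is closed rather than that the service is unhealthy.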