Envoy Proxy with GRPC Server Streaming throwing UNAVAILABLE: upstream request timeout

We have a gRPC client and a gRPC server that support server-side streaming.

rpc LotsOfReplies(HelloRequest) returns (stream HelloResponse);

The gRPC server runs behind an Envoy proxy configured for gRPC.

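The problem is that when the gRPC client connects through Envoy Proxy -> gRPC server, we get the exception below. When the client connects directly to the gRPC server, without the Envoy proxy, the code works flawlessly.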

io.grpc.StatusRuntimeException: UNAVAILABLE: upstream request timeout
    at io.grpc.Status.asRuntimeException(Status.java:535)
    at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)
    at io.grpc.internal.ClientCallImpl.access0(ClientCallImpl.java:68)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImplStreamClosed.runInternal(ClientCallImpl.java:739)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImplStreamClosed.runInContext(ClientCallImpl.java:718)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

A sample envoy.yaml configuration is shown below for reference. Any help with this would be greatly appreciated.

static_resources:

  listeners:
    - name: listener_0
      address:
        socket_address:
          address: X.X.X.X
          port_value: 443
          ipv4Compat: true
      filter_chains:
        - filter_chain_match: {}
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_params:
                  cipher_suites:
                    - ECDHE-ECDSA-AES128-GCM-SHA256
                    - ECDHE-RSA-AES128-GCM-SHA256
                    - ECDHE-ECDSA-AES128-SHA
                    - ECDHE-RSA-AES128-SHA
                    - AES128-GCM-SHA256
                    - AES128-SHA
                    - ECDHE-ECDSA-AES256-GCM-SHA384
                    - ECDHE-RSA-AES256-GCM-SHA384
                    - ECDHE-ECDSA-AES256-SHA
                    - ECDHE-RSA-AES256-SHA
                    - AES256-GCM-SHA384
                    - AES256-SHA
                  ecdh_curves:
                    - P-256
                tls_certificates:
                  - certificate_chain:
                      filename: "/home/.tomcat_cert.pem"
                    private_key:
                      filename: "/home/.tomcat_key.pem"
                validation_context:
                  trust_chain_verification: ACCEPT_UNTRUSTED
                alpn_protocols:
                  - h2
              require_client_certificate: false
          filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                http_filters:
                  - name: envoy.filters.http.router
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/api.ApiService"
                          route:
                            cluster: grpc-server
                            idle_timeout: 0s
                            max_stream_duration:
                              grpc_timeout_header_max: 35s
                        - match:
                            prefix: "/site"
                          route:
                            cluster: site_router
  clusters:
    - name: site_router
      type: static
      # Comment out the following line to test on v6 networks
      lb_policy: round_robin
      connect_timeout: 25s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: site_router
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 7880
    - name: grpc-server
      type: static
      # Comment out the following line to test on v6 networks
      lb_policy: round_robin
      connect_timeout: 25s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: grpc-server
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 7879

Note that this works for unary gRPC calls such as the following:

rpc SayHello(HelloRequest) returns (HelloResponse);

But it does not work for server streaming:

rpc LotsOfReplies(HelloRequest) returns (stream HelloResponse);

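Setting timeout: 0s on the route in envoy.yaml solved my problem. Envoy's route timeout defaults to 15s and covers the entire response, so a long-lived server stream is cut off with "upstream request timeout" while a short unary call finishes in time; 0s disables that timeout. I also set stream_idle_timeout: 0s on the HTTP connection manager: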

static_resources:

  listeners:
    - name: listener_0
      address:
        socket_address:
          address: X.X.X.X
          port_value: 443
          ipv4Compat: true
      filter_chains:
        - filter_chain_match: {}
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_params:
                  cipher_suites:
                    - ECDHE-ECDSA-AES128-GCM-SHA256
                    - ECDHE-RSA-AES128-GCM-SHA256
                    - ECDHE-ECDSA-AES128-SHA
                    - ECDHE-RSA-AES128-SHA
                    - AES128-GCM-SHA256
                    - AES128-SHA
                    - ECDHE-ECDSA-AES256-GCM-SHA384
                    - ECDHE-RSA-AES256-GCM-SHA384
                    - ECDHE-ECDSA-AES256-SHA
                    - ECDHE-RSA-AES256-SHA
                    - AES256-GCM-SHA384
                    - AES256-SHA
                  ecdh_curves:
                    - P-256
                tls_certificates:
                  - certificate_chain:
                      filename: "/home/.tomcat_cert.pem"
                    private_key:
                      filename: "/home/.tomcat_key.pem"
                validation_context:
                  trust_chain_verification: ACCEPT_UNTRUSTED
                alpn_protocols:
                  - h2
              require_client_certificate: false
          filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                stream_idle_timeout: 0s
                http_filters:
                  - name: envoy.filters.http.router
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/api.ApiService"
                          route:
                            cluster: grpc-server
                            timeout: 0s
                        - match:
                            prefix: "/policy"
                          route:
                            cluster: site_router
                            timeout: 30s
  clusters:
    - name: site_router
      type: static
      # Comment out the following line to test on v6 networks
      lb_policy: round_robin
      connect_timeout: 25s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: site_router
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 7880
    - name: grpc-server
      type: static
      # Comment out the following line to test on v6 networks
      lb_policy: round_robin
      connect_timeout: 25s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: grpc-server
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 7879
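
For readers who only want the delta, the change relative to the question's config can be sketched as the fragment below. It is a trimmed excerpt of the HttpConnectionManager section, not a complete envoy.yaml, and the comments reflect my reading of Envoy's documented defaults rather than anything measured on this deployment.

```yaml
# Excerpt of the HttpConnectionManager typed_config only.
# The listener, TLS and cluster sections stay as in the full config.
stat_prefix: ingress_http
stream_idle_timeout: 0s        # turns off the per-stream idle timeout (default: 5 minutes)
route_config:
  name: local_route
  virtual_hosts:
    - name: local_service
      domains: ["*"]
      routes:
        - match:
            prefix: "/api.ApiService"
          route:
            cluster: grpc-server
            timeout: 0s        # turns off the route timeout (default: 15s) that was ending the stream
```

Disabling both timeouts means a stalled stream is never reaped by the proxy. If the stream should still be bounded, the route-level max_stream_duration setting that the question's config already experimented with may be a better fit than leaving everything unlimited.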