How to configure GKE Autopilot w/Envoy & gRPC-Web
I have an application running on my local machine that uses React -> gRPC-Web -> Envoy -> a Go application, and everything works without issue. I'm trying to deploy it with GKE Autopilot, but I haven't been able to get the configuration right. I'm new to all things GCP/GKE, so I'm looking for help figuring out where I've gone wrong.
I was originally following this document, although I have only a single gRPC service:
https://cloud.google.com/architecture/exposing-grpc-services-on-gke-using-envoy-proxy
As I understand it, GKE Autopilot mode requires external HTTP(S) load balancing rather than the network load balancing described in that solution, so that's what I've been trying to get working. After various attempts, my current strategy uses an Ingress, a BackendConfig, a Service, and a Deployment. The Deployment contains three containers: my application, an Envoy sidecar that translates the gRPC-Web requests and responses, and a Cloud SQL Proxy sidecar. I eventually want to use TLS, but for now I'm leaving it out so as not to complicate things further.
When I apply all of this configuration, the backend service shows one backend in one zone, and the health check fails. The health check is set up for port 8080 and path /healthz, which is what I believe I specified in the Deployment config, but I have doubts, because when I look at the details of the envoy-sidecar container it shows the readiness probe as: http-get HTTP://:0/healthz headers=x-envoy-livenessprobe:healthz. Does the ":0" just mean it's using the container's default address and port, or does it indicate a configuration problem?
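(One way to rule out a named-port resolution issue would be to reference the probe port by number rather than by the name "http"; this is just a sketch of that variation of the envoy-sidecar probe below, not a confirmed fix:

readinessProbe:
  httpGet:
    port: 8080        # numeric port instead of the named port "http"
    path: /healthz
    httpHeaders:
    - name: x-envoy-livenessprobe
      value: healthz
    scheme: HTTP

If the probe then displays as http-get HTTP://:8080/healthz, the ":0" was only a display artifact of the named port.)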
I've been reading all sorts of documentation but haven't been able to piece it together. Is there an example somewhere of how to do this? I've searched but haven't found one.
My current configuration is:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: grammar-games-ingress
#annotations:
# If the class annotation is not specified it defaults to "gce".
# kubernetes.io/ingress.class: "gce"
# kubernetes.io/ingress.global-static-ip-name: <IP addr>
spec:
defaultBackend:
service:
name: grammar-games-core
port:
number: 80
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
name: grammar-games-bec
annotations:
cloud.google.com/neg: '{"ingress": true}'
spec:
sessionAffinity:
affinityType: "CLIENT_IP"
healthCheck:
checkIntervalSec: 15
port: 8080
type: HTTP
requestPath: /healthz
timeoutSec: 60
---
apiVersion: v1
kind: Service
metadata:
name: grammar-games-core
annotations:
cloud.google.com/neg: '{"ingress": true}'
cloud.google.com/app-protocols: '{"http":"HTTP"}'
cloud.google.com/backend-config: '{"default": "grammar-games-bec"}'
spec:
type: ClusterIP
selector:
app: grammar-games-core
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: grammar-games-core
spec:
# Two replicas for right now, just so I can see how RPC calls get directed.
# replicas: 2
selector:
matchLabels:
app: grammar-games-core
template:
metadata:
labels:
app: grammar-games-core
spec:
serviceAccountName: grammar-games-core-k8sa
containers:
- name: grammar-games-core
image: gcr.io/grammar-games/grammar-games-core:1.1.2
command:
- "/bin/grammar-games-core"
ports:
- containerPort: 52001
env:
- name: GAMESDB_USER
valueFrom:
secretKeyRef:
name: gamesdb-config
key: username
- name: GAMESDB_PASSWORD
valueFrom:
secretKeyRef:
name: gamesdb-config
key: password
- name: GAMESDB_DB_NAME
valueFrom:
secretKeyRef:
name: gamesdb-config
key: db-name
- name: GRPC_SERVER_PORT
value: '52001'
- name: GAMES_LOG_FILE_PATH
value: ''
- name: GAMESDB_LOG_LEVEL
value: 'debug'
resources:
requests:
# The proxy's memory use scales linearly with the number of active
# connections. Fewer open connections will use less memory. Adjust
# this value based on your application's requirements.
memory: "2Gi"
# The proxy's CPU use scales linearly with the amount of IO between
# the database and the application. Adjust this value based on your
# application's requirements.
cpu: "1"
readinessProbe:
exec:
command: ["/bin/grpc_health_probe", "-addr=:52001"]
initialDelaySeconds: 5
- name: cloud-sql-proxy
# It is recommended to use the latest version of the Cloud SQL proxy
# Make sure to update on a regular schedule!
image: gcr.io/cloudsql-docker/gce-proxy:1.24.0
command:
- "/cloud_sql_proxy"
# If connecting from a VPC-native GKE cluster, you can use the
# following flag to have the proxy connect over private IP
# - "-ip_address_types=PRIVATE"
# Replace DB_PORT with the port the proxy should listen on
# Defaults: MySQL: 3306, Postgres: 5432, SQLServer: 1433
- "-instances=grammar-games:us-east1:grammar-games-db=tcp:3306"
securityContext:
# The default Cloud SQL proxy image runs as the
# "nonroot" user and group (uid: 65532) by default.
runAsNonRoot: true
# Resource configuration depends on an application's requirements. You
# should adjust the following values based on what your application
# needs. For details, see https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
resources:
requests:
# The proxy's memory use scales linearly with the number of active
# connections. Fewer open connections will use less memory. Adjust
# this value based on your application's requirements.
memory: "2Gi"
# The proxy's CPU use scales linearly with the amount of IO between
# the database and the application. Adjust this value based on your
# application's requirements.
cpu: "1"
- name: envoy-sidecar
image: envoyproxy/envoy:v1.20-latest
ports:
- name: http
containerPort: 8080
resources:
requests:
cpu: 10m
ephemeral-storage: 256Mi
memory: 256Mi
volumeMounts:
- name: config
mountPath: /etc/envoy
readinessProbe:
httpGet:
port: http
httpHeaders:
- name: x-envoy-livenessprobe
value: healthz
path: /healthz
scheme: HTTP
volumes:
- name: config
configMap:
name: envoy-sidecar-conf
---
apiVersion: v1
kind: ConfigMap
metadata:
name: envoy-sidecar-conf
data:
envoy.yaml: |
static_resources:
listeners:
- name: listener_0
address:
socket_address:
address: 0.0.0.0
port_value: 8080
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
access_log:
- name: envoy.access_loggers.stdout
typed_config:
"@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
codec_type: AUTO
stat_prefix: ingress_http
route_config:
name: local_route
virtual_hosts:
- name: http
domains:
- "*"
routes:
- match:
prefix: "/grammar_games_protos.GrammarGames/"
route:
cluster: grammar-games-core-grpc
cors:
allow_origin_string_match:
- prefix: "*"
allow_methods: GET, PUT, DELETE, POST, OPTIONS
allow_headers: keep-alive,user-agent,cache-control,content-type,content-transfer-encoding,custom-header-1,x-accept-content-transfer-encoding,x-accept-response-streaming,x-user-agent,x-grpc-web,grpc-timeout
max_age: "1728000"
expose_headers: custom-header-1,grpc-status,grpc-message
http_filters:
- name: envoy.filters.http.health_check
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.http.health_check.v3.HealthCheck
pass_through_mode: false
headers:
- name: ":path"
exact_match: "/healthz"
- name: "x-envoy-livenessprobe"
exact_match: "healthz"
- name: envoy.filters.http.grpc_web
- name: envoy.filters.http.cors
- name: envoy.filters.http.router
typed_config: {}
clusters:
- name: grammar-games-core-grpc
connect_timeout: 0.5s
type: logical_dns
lb_policy: ROUND_ROBIN
http2_protocol_options: {}
load_assignment:
cluster_name: grammar-games-core-grpc
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: 0.0.0.0
port_value: 52001
health_checks:
timeout: 1s
interval: 10s
unhealthy_threshold: 2
healthy_threshold: 2
grpc_health_check: {}
admin:
access_log_path: /dev/stdout
address:
socket_address:
address: 127.0.0.1
port_value: 8090
Here is some documentation on Setting up HTTP(S) Load Balancing with Ingress. That tutorial shows how to run a web application behind an external HTTP(S) load balancer by configuring an Ingress resource.
Regarding creating an HTTP load balancer on GKE with Ingress, I found two threads in which the created instances were marked as unhealthy.
In the first one, they mention needing to manually add a firewall rule that allows the HTTP load balancer IP ranges through to the health check.
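Assuming the default network and the health-check port 8080 used above, such a rule might look like the following (130.211.0.0/22 and 35.191.0.0/16 are Google's documented health-check source ranges; the rule name is a placeholder):

gcloud compute firewall-rules create allow-lb-health-checks \
    --network=default \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --allow=tcp:8080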
In the second one, they mention that the Pod spec must also include the containerPort.
Example:
spec:
containers:
- name: nginx
image: nginx:1.7.9
ports:
- containerPort: 80
Beyond that, there is also some documentation here on the following:
I finally got this resolved, so I wanted to post my answer for reference.
It turns out that the solution in this document works:
https://cloud.google.com/architecture/exposing-grpc-services-on-gke-using-envoy-proxy#introduction
From one of the documents on GKE Autopilot mode, I was under the impression that you couldn't use a network load balancer and instead needed to use an Ingress for HTTP(S) load balancing. That's why I went down the other path, but even after several weeks working with Google Support, with configuration that looked correct, the load balancer's health checks never worked. That's when we discovered that this solution with a network load balancer does in fact work.
I also ran into some problems configuring HTTPS/TLS, which turned out to be an issue in my Envoy configuration file.
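For reference, the envoy-certs Secret that the Envoy Deployment below mounts can be created from an existing certificate/key pair; assuming the files are named tls.crt and tls.key, something like:

kubectl create secret tls envoy-certs --cert=tls.crt --key=tls.key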
I still have an issue with pod stability, but that's a separate problem that I'll address in another post/question. As long as I request only 1 replica, the solution is stable and works well, and Autopilot should scale the pods as needed.
I know the configuration for all of this is quite tricky, so I'm including it all here for reference (with my-app as a placeholder). Hopefully it helps someone else get there faster than I did! I think it's a good solution once gRPC-Web is working. You'll also notice that I'm using a cloud-sql-proxy sidecar to talk to a Cloud SQL database, authenticating with a GKE service account.
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
replicas: 1
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
spec:
serviceAccountName: my-app-k8sa
terminationGracePeriodSeconds: 30
containers:
- name: my-app
image: gcr.io/my-project/my-app:1.1.0
command:
- "/bin/my-app"
ports:
- containerPort: 52001
env:
- name: DB_USER
valueFrom:
secretKeyRef:
name: db-config
key: username
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: db-config
key: password
- name: DB_NAME
valueFrom:
secretKeyRef:
name: db-config
key: db-name
- name: GRPC_SERVER_PORT
value: '52001'
readinessProbe:
exec:
command: ["/bin/grpc_health_probe", "-addr=:52001"]
initialDelaySeconds: 10
livenessProbe:
exec:
command: ["/bin/grpc_health_probe", "-addr=:52001"]
initialDelaySeconds: 15
- name: cloud-sql-proxy
# It is recommended to use the latest version of the Cloud SQL proxy
# Make sure to update on a regular schedule!
image: gcr.io/cloudsql-docker/gce-proxy:1.27.1
command:
- "/cloud_sql_proxy"
# If connecting from a VPC-native GKE cluster, you can use the
# following flag to have the proxy connect over private IP
# - "-ip_address_types=PRIVATE"
# Replace DB_PORT with the port the proxy should listen on
# Defaults: MySQL: 3306, Postgres: 5432, SQLServer: 1433
- "-instances=my-project:us-east1:my-app-db=tcp:3306"
securityContext:
# The default Cloud SQL proxy image runs as the
# "nonroot" user and group (uid: 65532) by default.
runAsNonRoot: true
---
apiVersion: v1
kind: Service
metadata:
name: my-app
spec:
type: ClusterIP
selector:
app: my-app
ports:
- name: my-app-port
protocol: TCP
port: 52001
clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
name: envoy
spec:
type: LoadBalancer
selector:
app: envoy
ports:
- name: https
protocol: TCP
port: 443
targetPort: 8443
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: envoy
spec:
replicas: 1
selector:
matchLabels:
app: envoy
template:
metadata:
labels:
app: envoy
spec:
containers:
- name: envoy
image: envoyproxy/envoy:v1.20-latest
ports:
- name: https
containerPort: 8443
resources:
requests:
cpu: 10m
ephemeral-storage: 256Mi
memory: 256Mi
volumeMounts:
- name: config
mountPath: /etc/envoy
- name: certs
mountPath: /etc/ssl/envoy
readinessProbe:
httpGet:
port: https
httpHeaders:
- name: x-envoy-livenessprobe
value: healthz
path: /healthz
scheme: HTTPS
volumes:
- name: config
configMap:
name: envoy-conf
- name: certs
secret:
secretName: envoy-certs
---
apiVersion: v1
kind: ConfigMap
metadata:
name: envoy-conf
data:
envoy.yaml: |
static_resources:
listeners:
- name: listener_0
address:
socket_address:
address: 0.0.0.0
port_value: 8443
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
access_log:
- name: envoy.access_loggers.stdout
typed_config:
"@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
codec_type: AUTO
stat_prefix: ingress_https
route_config:
name: local_route
virtual_hosts:
- name: https
domains:
- "*"
routes:
- match:
prefix: "/my_app_protos.MyService/"
route:
cluster: my-app-cluster
cors:
allow_origin_string_match:
- prefix: "*"
allow_methods: GET, PUT, DELETE, POST, OPTIONS
allow_headers: keep-alive,user-agent,cache-control,content-type,content-transfer-encoding,custom-header-1,x-accept-content-transfer-encoding,x-accept-response-streaming,x-user-agent,x-grpc-web,grpc-timeout
max_age: "1728000"
expose_headers: custom-header-1,grpc-status,grpc-message
http_filters:
- name: envoy.filters.http.health_check
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.http.health_check.v3.HealthCheck
pass_through_mode: false
headers:
- name: ":path"
exact_match: "/healthz"
- name: "x-envoy-livenessprobe"
exact_match: "healthz"
- name: envoy.filters.http.grpc_web
- name: envoy.filters.http.cors
- name: envoy.filters.http.router
typed_config: {}
transport_socket:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
require_client_certificate: false
common_tls_context:
tls_certificates:
- certificate_chain:
filename: /etc/ssl/envoy/tls.crt
private_key:
filename: /etc/ssl/envoy/tls.key
clusters:
- name: my-app-cluster
connect_timeout: 0.5s
type: STRICT_DNS
dns_lookup_family: V4_ONLY
lb_policy: ROUND_ROBIN
http2_protocol_options: {}
load_assignment:
cluster_name: my-app-cluster
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: my-app.default.svc.cluster.local
port_value: 52001
health_checks:
timeout: 1s
interval: 10s
unhealthy_threshold: 2
healthy_threshold: 2
grpc_health_check: {}
admin:
access_log_path: /dev/stdout
address:
socket_address:
address: 127.0.0.1
port_value: 8090
I'm still not sure about specifying the resource requirements for the containers and the number of replicas in the Deployments, but the solution works.
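Once everything is deployed, the Envoy health_check filter above can be exercised directly through the load balancer; assuming <LB_IP> is the external IP assigned to the envoy Service, something like:

curl -vk -H "x-envoy-livenessprobe: healthz" https://<LB_IP>/healthz

(-k because of the self-signed certificate; since pass_through_mode is false, Envoy matches the /healthz path plus that header and answers the check itself without forwarding to the backend.)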