Why does GCE Load Balancer behave differently through the domain name and the IP address?

Question

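A backend service happens to be returning Status 404 on the health check path of the Load Balancer. When I browse to the Load Balancer's domain name, I get "Error: Server Error/ The server encountered a temporary error", and the logs show

"type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry" statusDetails: "failed_to_pick_backend", which makes sense.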

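When I browse to the Load Balancer's Static IP, my browser shows the 404 Error Message which the underlying Kubernetes Pod returned. In other words, the Load Balancer passed on the request despite the failed health check.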

Why are there these two different behaviors?

[Edit]

Here is the yaml for the Ingress that created the Load Balancer:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress1
spec:
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          serviceName: myservice
          servicePort: 80

Answer 1

I did a "deep dive" into this and managed to reproduce the situation on my GKE cluster, so now I can tell that a few things are combined here.

A backend service happens to be returning Status 404 on the health check path of the Load Balancer.

There could be 2 options (it is not clear from the description you have provided).

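The first option is something like: "Error: Server Error The server encountered a temporary error and could not complete your request. Please try again in 30 seconds."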

This is what you get from the LoadBalancer when the HealthCheck for the pod fails. The official documentation on the GKE Ingress object says that

a Service exposed through an Ingress must respond to health checks from the load balancer.

Any container that is the final destination of load-balanced traffic must do one of the following to indicate that it is healthy:

Serve a response with an HTTP 200 status to GET requests on the / path.

Configure an HTTP readiness probe. Serve a response with an HTTP 200 status to GET requests on the path specified by the readiness probe. The Service exposed through an Ingress must point to the same container port on which the readiness probe is enabled.
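
One way to satisfy the first requirement quoted above is to have the container answer GET / with a 200. The following is only a minimal sketch, assuming the workload is a small Python HTTP app; the handler name, the make_server helper and port 8080 are hypothetical and not taken from the question.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthyRootHandler(BaseHTTPRequestHandler):
    """Answers GET / with 200 so an HTTP health check on "/" can pass."""

    def do_GET(self):
        if self.path == "/":
            body = b"ok\n"
            self.send_response(200)
        else:
            body = b"not found\n"
            self.send_response(404)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the sketch quiet; remove this override to see access logs.
        pass


def make_server(port=8080):
    # Port 8080 is an assumption; use the containerPort the Service targets.
    return HTTPServer(("0.0.0.0", port), HealthyRootHandler)
```

Calling make_server().serve_forever() starts it. If the application cannot return 200 on /, the other quoted option (a readiness probe on a different path) is the alternative.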

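The HealthCheck handling needs to be fixed. You can open GCP Console - Network Services - Load Balancing to check the details of the Load Balancer.

The second option is "404 Not Found -- nginx/1.17.6".

This one is clear. That is the response returned by the endpoint that myservice is sending requests to. It looks like something is misconfigured there. My guess is that the pod simply can't serve that request properly. It could be an nginx web-server issue, etc. Please check the configuration to find out why the pod can't serve the request.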

While playing with the setup I found an image that lets you check whether a request has reached the pod, and shows the request headers.

So it is possible to create a pod like this:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    run: fake-web
  name: fake-default-knp
#  namespace: kube-system
spec:
  containers:
  - image: mendhak/http-https-echo
    imagePullPolicy: IfNotPresent
    name: fake-web
    ports:
    - containerPort: 8080
      protocol: TCP

This lets you see all the headers of the incoming requests (kubectl logs -f fake-default-knp).

When I browse to the Load Balancer's Static IP, my browser shows the 404 Error Message which the underlying Kubernetes Pod returned.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress1
spec:
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          serviceName: myservice
          servicePort: 80

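After creating such an Ingress object, there will be at least 2 backends in the GKE cluster: the backend you specified when creating the Ingress (the myservice one), and the default one (created upon cluster creation).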

kubectl get pods -n kube-system -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP       
l7-default-backend-xyz     1/1     Running   0          20d   10.52.0.7

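Please note that myservice serves only requests that have the Host header set to example.com. The rest of the requests are sent to the "default backend". That is why you receive the "default backend - 404" error message when browsing to the LoadBalancer's IP address.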

Technically, there is a default-http-backend service that has l7-default-backend-xyz as an endpoint.

kubectl get svc -n kube-system -o wide 
NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE   SELECTOR
default-http-backend   NodePort    10.0.6.134    <none>        80:31806/TCP    20d   k8s-app=glbc

kubectl get ep -n kube-system
NAME                   ENDPOINTS       AGE
default-http-backend   10.52.0.7:8080  20d

Again, that is the "object" that returns the "default backend - 404" error for requests whose "Host" header does not match the one you specified in the Ingress.
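
To compare the two cases without touching DNS, you can send a request to the load balancer's IP while setting the Host header yourself. This is a rough sketch using only the Python standard library; build_request and probe are hypothetical helper names, and 203.0.113.10 in the note below is a placeholder address, not a value from the question.

```python
import urllib.error
import urllib.request


def build_request(lb_address, host_header):
    """Build a GET for http://<lb_address>/ that carries an explicit Host header."""
    return urllib.request.Request(
        f"http://{lb_address}/",
        headers={"Host": host_header},
        method="GET",
    )


def probe(lb_address, host_header, timeout=10):
    """Send the request and return (status, first 200 bytes of the body)."""
    req = build_request(lb_address, host_header)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read(200)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError but still carry status and body.
        return err.code, err.read(200)
```

For example, probe("203.0.113.10", "example.com") should take the same path as browsing to the domain name, while probe("203.0.113.10", "203.0.113.10") mimics browsing to the bare IP and, per the explanation above, would be expected to reach the default backend.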

Hope this sheds some light on the issue :)

Edit:

"myservice serves only requests that have Host header set to example.com." So you are saying that requests go to the LB only when there is a host header?

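Not exactly. The LB receives all the requests and passes them on according to the value of the "Host" header. Requests with the example.com Host header are served by the myservice backend.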

To put it simply, the logic is as follows:

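the request arrives;
the system checks the Host header (to determine the user's backend);
if there is a suitable user backend (according to the Ingress config) and that backend is healthy, the request is served; if the backend is in a non-healthy state, "Error: Server Error The server encountered a temporary error and could not complete your request. Please try again in 30 seconds." is returned instead;
if the request's Host header doesn't match any host in the Ingress spec, the request is sent to the l7-default-backend-xyz backend (not the one mentioned in the Ingress config). That backend replies with the "default backend - 404" error.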
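
The steps above can be written down as a tiny decision function. This is an illustrative model only, not how the load balancer is implemented: it assumes exact host matching and a default backend that is always healthy, and the route function and its outcome strings are made up for the sketch.

```python
DEFAULT_BACKEND = "default-http-backend"


def route(host, rules, healthy):
    """Return (backend, outcome) for a request with the given Host header.

    rules:   dict mapping an Ingress host to a Service name
    healthy: set of Service names whose health checks currently pass
    """
    backend = rules.get(host)
    if backend is None:
        # No Ingress rule matches this Host: fall through to the default backend.
        return DEFAULT_BACKEND, "default backend - 404"
    if backend not in healthy:
        # A rule matched, but there is no healthy backend to pick.
        return backend, "Server Error (failed_to_pick_backend)"
    return backend, "served by backend"


rules = {"example.com": "myservice"}
healthy = {DEFAULT_BACKEND}  # myservice is failing its health check

print(route("example.com", rules, healthy))   # domain name case
print(route("203.0.113.10", rules, healthy))  # bare IP case (placeholder address)
```

With myservice unhealthy, the first call models the Server Error seen through the domain name and the second models the 404 seen through the IP address.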

Hope that makes it clear.

