Kubernetes health checks fail with custom Nginx webserver configuration

My health checks fail with the following setup.

nginx.conf

user                            root;
worker_processes                auto;

error_log                       /var/log/nginx/error.log warn;

events {
    worker_connections          1024;
}

http {
    server {
        listen                  80;
        server_name             subdomain.domain.com;
        auth_basic              "Restricted";
        auth_basic_user_file    /etc/nginx/.htpasswd;
    }
    server {
        listen                  80;
        auth_basic              off;
    }
    server {
        listen                  2222;
        auth_basic              off;
        location /healthz {
            return 200;
        }
    }
}

DOCKERFILE

FROM nginx:alpine
COPY index.html /usr/share/nginx/html/index.html
VOLUME /usr/share/nginx/html
COPY /server/nginx.conf /etc/nginx/
COPY /server/htpasswd /etc/nginx/.htpasswd
CMD ["nginx", "-g", "daemon off;"]
EXPOSE 80
EXPOSE 2222

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-namespace
  labels:
    app: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/GOOGLE_CLOUD_PROJECT/my-app
          ports:
            - containerPort: 80
            - containerPort: 2222
          livenessProbe:
            httpGet:
              path: /healthz
              port: 2222
          readinessProbe:
            httpGet:
              path: /healthz
              port: 2222

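It does work when I remove the "server_name" line from nginx.conf and delete the second server block. Could this be a problem with the ingress/load balancer, since I don't know how long it takes to update? (Yesterday I saw a healthy pod become unhealthy after a few minutes.) I'm running it on Google Kubernetes Engine (GKE) with Google's own ingress controller (not the NGINX ingress!).

What am I doing wrong?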

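The problem was that GKE's load balancer performs its own health checks. By default these request / and expect a 200 in return. Only when the health checks in the deployment/pod declare another path will the load balancer's health checks pick that path up.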
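The question does not show the Service and Ingress that put the load balancer in front of the pod. A minimal sketch of what they might look like follows; the resource names, the `NodePort` type and the single default backend are assumptions for illustration, not the original manifests.

```yaml
# Hypothetical Service exposing the pod's port 80 (name and type are assumed).
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: my-namespace
spec:
  type: NodePort
  selector:
    app: my-app          # matches the pod label from deployment.yaml
  ports:
    - port: 80
      targetPort: 80     # the container port that nginx listens on
---
# Hypothetical Ingress; on GKE, applying it is what creates the load balancer.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  namespace: my-namespace
spec:
  defaultBackend:
    service:
      name: my-app
      port:
        number: 80
```

With a setup like this, the load balancer checks the backend behind port 80, so a / that answers 401 because of auth_basic would be reported as unhealthy even though the pod's own probes on another port pass.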

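The load balancer is configured after the ingress YAML is applied. As long as the load balancer is running, changes to the deployment or ingress that affect it are not picked up. That meant I had to delete the load balancer first and then apply the deployment, service and ingress YAMLs (the ingress then sets up the load balancer automatically). Instead of deleting the load balancer, you can also enter the correct path manually (and wait a few minutes).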
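A declarative alternative to editing the path by hand, which this answer does not use, is GKE's BackendConfig resource. The sketch below assumes a cluster version that supports the `cloud.google.com/v1` BackendConfig API; the name `my-app-healthcheck` is hypothetical.

```yaml
# Hypothetical BackendConfig that pins the load balancer health check path.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-app-healthcheck
  namespace: my-namespace
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz   # must return 200 without authentication
---
# Fragment only: annotation to add to the metadata of the Service
# that the Ingress points at, so it uses the BackendConfig above.
metadata:
  annotations:
    cloud.google.com/backend-config: '{"default": "my-app-healthcheck"}'
```

Whether this removes the need to recreate the load balancer in every case is not something the answer verifies, so treat it as a starting point.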

Since the load balancer seems to health-check every open port, I removed my port 2222 and added location /healthz, with auth_basic off, to each server block listening on port 80 in nginx.

See: https://cloud.google.com/load-balancing/docs/health-check-concepts

New nginx.conf

user                            root;
worker_processes                auto;

error_log                       /var/log/nginx/error.log warn;

events {
    worker_connections          1024;
}

http {
    server {
        listen                  80;
        server_name             subdomain1.domain.com;
        root                    /usr/share/nginx/html;
        index                   index.html;
        auth_basic              "Restricted";
        auth_basic_user_file    /etc/nginx/.htpasswd_subdomain1;
        location /healthz {
            auth_basic          off;
            allow               all;
            return              200;
        }
    }
    server {
        listen                  80;
        server_name             subdomain2.domain.com;
        root                    /usr/share/nginx/html;
        index                   index.html;
        auth_basic              "Restricted";
        auth_basic_user_file    /etc/nginx/.htpasswd_subdomain2;
        location /healthz {
            auth_basic          off;
            allow               all;
            return              200;
        }
    }
    server {
        listen                  80;
        server_name             domain.com www.domain.com;
        root                    /usr/share/nginx/html;
        index                   index.html;
        auth_basic              "Restricted";
        auth_basic_user_file    /etc/nginx/.htpasswd_domain;
        location /healthz {
            auth_basic          off;
            allow               all;
            return              200;
        }
    }
    ## next block probably not necessary
    server {
        listen                  80;
        auth_basic              off;
        location /healthz {
            return              200;
        }
    }
}

My new deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-namespace
  labels:
    app: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/GOOGLE_CLOUD_PROJECT/my-app
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:
              path: /healthz
              port: 80
          readinessProbe:
            httpGet:
              path: /healthz
              port: 80