Limit first-time deployment of pods in Kubernetes using kind: Deployment

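I have a few pods that need to be rolled out one by one, both on the initial deployment and on later updates. (Strictly speaking, only the first one has to be ready; after that the rest may start or upgrade.) I used a StatefulSet for this because it ensures that only one pod is created or updated at a time, but we use Telepresence for development and it does not support replacing StatefulSets. So I thought I could use a Deployment with the RollingUpdate strategy instead of a StatefulSet, and limit maxUnavailable or maxSurge (or whatever applies) to "throttle" the number of pods being deployed. This does not work for the initial deployment, though, because K8s creates the 2 desired pods at once instead of one by one. Is there any way to achieve this with a Deployment, or do I need to use a StatefulSet? (Alternatively: is there a trick for using Telepresence with a StatefulSet?)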

Clarification based on questions in the comments:

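As I see it, there are two possible solutions, but both require some extra effort.
I will describe both, and you can choose the one that suits you best.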


Solution 1: Deploy the application with a script

You can deploy your application with the following script:

$ cat deploy.sh
#!/bin/bash

# Usage: deploy.sh DEPLOYMENT_FILENAME NAMESPACE NUMBER_OF_REPLICAS

deploymentFileName=$1   # Deployment manifest file name
namespace=$2            # Namespace where the app should be deployed
replicas=$3             # Number of replicas that should be deployed

# First deploy ONLY one replica: sed rewrites the replica count to 1 in the stream, so the original manifest file remains intact
sed -E 's/replicas: [0-9]+$/replicas: 1/' "${deploymentFileName}" | kubectl apply -n "$namespace" -f -

# The "until" loop waits until the first replica is ready
# Check deployment rollout status every 10 seconds (max 10 minutes) until complete.
attempts=0
rollout_cmd="kubectl rollout status -f ${deploymentFileName} -n $namespace"
until $rollout_cmd || [ $attempts -eq 60 ]; do
    attempts=$((attempts + 1))
    sleep 10
done

if [ $attempts -eq 60 ]; then
    echo "ERROR: Timeout"
    exit 1
fi

# With the first replica running, deploy the rest unless we want to deploy only one replica
if [ $replicas -ne 1 ]; then
    kubectl scale -f ${deploymentFileName} -n $namespace --replicas=${replicas}
fi
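
The claim that the sed step leaves the manifest untouched can be checked without a cluster. The following is a minimal sketch, assuming a hypothetical six-line manifest written to a temporary file; it applies the same sed expression as deploy.sh and compares what would be piped to kubectl with what stays on disk.

```shell
#!/bin/bash
# Hypothetical sample manifest (not from the original post), written to a temp file.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
EOF

# Same sed expression as in deploy.sh: the replica count changes in the stream only.
rewritten=$(sed -E 's/replicas: [0-9]+$/replicas: 1/' "$manifest" | grep 'replicas:')
# The file itself is read again to show that it still holds the original count.
onDisk=$(grep 'replicas:' "$manifest")

echo "piped to kubectl: ${rewritten}"
echo "left on disk:     ${onDisk}"

rm -f "$manifest"
```

Because the file is never modified, the later kubectl scale call in the script can take the target replica count from the command line while still referring to the same manifest.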

I created a simple example to illustrate how it works.

First, I created the web-app.yml Deployment manifest file:

$ kubectl create deployment web-app --image=nginx --replicas=3 --dry-run=client -oyaml > web-app.yml

Then I deployed the web-app Deployment using the deploy.sh script:

$ ./deploy.sh web-app.yml default 3
deployment.apps/web-app created
Waiting for deployment "web-app" rollout to finish: 0 of 1 updated replicas are available...
deployment "web-app" successfully rolled out
deployment.apps/web-app scaled

From another terminal window, we can see that the remaining replicas only started once the first replica (web-app-5cd54cb75-krgtc) was in the "Running" state:

$ kubectl get pod -w
NAME                      READY   STATUS             RESTARTS   AGE
web-app-5cd54cb75-krgtc   0/1     Pending             0          0s
web-app-5cd54cb75-krgtc   0/1     Pending             0          0s
web-app-5cd54cb75-krgtc   0/1     ContainerCreating   0          0s
web-app-5cd54cb75-krgtc   1/1     Running             0          4s # First replica in the "Running state"
web-app-5cd54cb75-tmpcn   0/1     Pending             0          0s
web-app-5cd54cb75-tmpcn   0/1     Pending             0          0s
web-app-5cd54cb75-sstg6   0/1     Pending             0          0s
web-app-5cd54cb75-tmpcn   0/1     ContainerCreating   0          0s
web-app-5cd54cb75-sstg6   0/1     Pending             0          0s
web-app-5cd54cb75-sstg6   0/1     ContainerCreating   0          0s
web-app-5cd54cb75-tmpcn   1/1     Running             0          5s
web-app-5cd54cb75-sstg6   1/1     Running             0          7s

Solution 2: Use initContainers

You can use an init container that runs a script to determine which Pod should start first:

$ cat checker.sh
#!/bin/bash

labelName="app" # label key of your application
labelValue="web" # label value of your application
hostname=$(hostname) 
apt update && apt install -y jq 1>/dev/null 2>&1 # install the jq program - command-line JSON processor
startFirst=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/?labelSelector=${labelName}%3D${labelValue}&limit=500" | jq '.items[].metadata.name' | sort | head -n1 | tr -d '""') # determine which Pod should start first -> kubectl get pods -l app=web -o=name | sort | head -n1

firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first

attempts=0
if [ "${hostname}" != "${startFirst}" ]
then
    while [ "${firstPodStatusChecker}" != "Running" ] && [ $attempts -lt 60 ]; do
        attempts=$((attempts + 1))
        sleep 5
        firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first
    done
fi

if [ $attempts -eq 60 ]; then
    echo "ERROR: Timeout"
    exit 1
fi

The most important line in this script is:

startFirst=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/?labelSelector=${labelName}%3D${labelValue}&limit=500" | jq '.items[].metadata.name' | sort | head -n1 | tr -d '""')

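This line determines which Pod should start first; the remaining replicas wait until that first Pod is running. I am using the curl command to access the API from inside the Pod. We do not need to build complex curl commands by hand, because a kubectl command can easily be converted to a curl command: if you run kubectl with the -v=10 option, you can see the curl requests it makes.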
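
To make the mapping between the kubectl label selector and the request URL easier to see, here is a minimal sketch. The build_pods_url function is a hypothetical helper that is not part of the original script; it only assembles the same URL that checker.sh passes to curl, with the "=" of the selector percent-encoded as %3D.

```shell
#!/bin/bash
# build_pods_url is a hypothetical helper, not part of the original checker.sh.
# It prints the API URL that lists the Pods in a namespace matching a single label.
build_pods_url() {
    local namespace=$1 labelName=$2 labelValue=$3
    local apiServer="https://kubernetes.default.svc.cluster.local"
    # "app=web" becomes "app%3Dweb": the "=" is percent-encoded in the query string.
    printf '%s/api/v1/namespaces/%s/pods/?labelSelector=%s%%3D%s&limit=500\n' \
        "$apiServer" "$namespace" "$labelName" "$labelValue"
}

# Counterpart of: kubectl get pods -n default -l app=web
url=$(build_pods_url default app web)
echo "$url"
```

The namespace is a parameter here, whereas checker.sh hard-codes default; if the Deployment lives elsewhere, the URLs in the script would need the same adjustment.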

Note: With this approach you need to grant the ServiceAccount appropriate permissions to communicate with the API. For example, you can bind the view role to the ServiceAccount like this:

$ kubectl create clusterrolebinding --serviceaccount=default:default --clusterrole=view default-sa-view-access
clusterrolebinding.rbac.authorization.k8s.io/default-sa-view-access created

You can mount this checker.sh script into the Pod, for example as a ConfigMap:

$ cat check-script-configmap.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: check-script
data:
  checkScript.sh: |
    #!/bin/bash

    labelName="app" # label key of your application
    labelValue="web" # label value of your application
    hostname=$(hostname) 
    apt update && apt install -y jq 1>/dev/null 2>&1 # install the jq program - command-line JSON processor
    startFirst=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/?labelSelector=${labelName}%3D${labelValue}&limit=500" | jq '.items[].metadata.name' | sort | head -n1 | tr -d '""') # determine which Pod should start first -> kubectl get pods -l app=web -o=name | sort | head -n1

    firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first

    attempts=0  
    if [ "${hostname}" != "${startFirst}" ]
    then
        while [ "${firstPodStatusChecker}" != "Running" ] && [ $attempts -lt 60 ]; do
            attempts=$((attempts + 1))
            sleep 5
            firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first
        done
    fi

    if [ $attempts -eq 60 ]; then
        echo "ERROR: Timeout"
        exit 1
    fi

I also created a simple example to illustrate how it works.

First, I created the check-script ConfigMap shown above:

$ kubectl apply -f check-script-configmap.yml
configmap/check-script created

Then I mounted this ConfigMap into the initContainer and applied the Deployment:

$ cat web.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      volumes:
        - name: check-script
          configMap:
            name: check-script
      initContainers:
      - image: nginx
        name: init
        command: ["bash", "/mnt/checkScript.sh"]
        volumeMounts:
        - name: check-script
          mountPath: /mnt/

      containers:
      - image: nginx
        name: nginx

$ kubectl apply -f web.yml
deployment.apps/web created

From another terminal window, we can see that the remaining replicas only started once the first replica (web-98c4d45dd-6zcsd) was in the "Running" state:

$ kubectl get pod -w
NAME                  READY   STATUS    RESTARTS   AGE
web-98c4d45dd-ztjlf   0/1     Pending   0          0s
web-98c4d45dd-ztjlf   0/1     Pending   0          0s
web-98c4d45dd-6zcsd   0/1     Pending   0          0s
web-98c4d45dd-mc2ww   0/1     Pending   0          0s
web-98c4d45dd-6zcsd   0/1     Pending   0          0s
web-98c4d45dd-mc2ww   0/1     Pending   0          0s
web-98c4d45dd-ztjlf   0/1     Init:0/1   0          0s
web-98c4d45dd-6zcsd   0/1     Init:0/1   0          0s
web-98c4d45dd-mc2ww   0/1     Init:0/1   0          1s
web-98c4d45dd-6zcsd   0/1     Init:0/1   0          3s
web-98c4d45dd-ztjlf   0/1     Init:0/1   0          3s
web-98c4d45dd-mc2ww   0/1     Init:0/1   0          4s
web-98c4d45dd-6zcsd   0/1     PodInitializing   0          10s
web-98c4d45dd-6zcsd   1/1     Running           0          12s
web-98c4d45dd-mc2ww   0/1     PodInitializing   0          23s
web-98c4d45dd-ztjlf   0/1     PodInitializing   0          23s
web-98c4d45dd-mc2ww   1/1     Running           0          25s
web-98c4d45dd-ztjlf   1/1     Running           0          26s
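
Why web-98c4d45dd-6zcsd went first follows from the sort step in checker.sh. The sketch below replays that selection offline on the three Pod names from the watch output above. Setting LC_ALL=C is an addition made here so that the result does not depend on the locale; the original script calls sort without it.

```shell
#!/bin/bash
# Pod names copied from the watch output above.
pods="web-98c4d45dd-ztjlf
web-98c4d45dd-6zcsd
web-98c4d45dd-mc2ww"

# Same selection as in checker.sh: sort the names and keep the first one.
# LC_ALL=C (an addition in this sketch) pins the byte-wise order: digits before letters.
startFirst=$(printf '%s\n' "$pods" | LC_ALL=C sort | head -n1)
echo "$startFirst"   # prints web-98c4d45dd-6zcsd
```

Since the name suffixes are generated, which replica ends up first is effectively arbitrary; the approach relies on every init container seeing the same list of Pods and therefore agreeing on the same name.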