Limit first-time deployment of pods in Kubernetes using kind: Deployment

Question

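I have a few pods that need to be rolled out one by one, both initially and on later updates. (Actually only the first one needs to be ready; after that the rest can start or upgrade.) I used a StatefulSet for this, because it ensures that only one pod is updated or created at a time, but we use Telepresence for development, and it does not support replacing StatefulSets. So I thought I could use a Deployment with a rolling update strategy instead of a StatefulSet, and limit maxUnavailable or maxSurge or whatever else "throttles" the rollout. That does not work for the initial deployment, though, because K8s creates the 2 required pods at once rather than one by one. Is there a way to achieve this with a Deployment, or do I need to use a StatefulSet? (Or: is there a trick for using Telepresence with StatefulSets?)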

Clarification based on questions in the comments:

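The software in question is Flyway in cluster mode combined with MariaDB. In that setup table locking does not work, so pods that start in parallel may try to apply schema and data updates at the same time.
Init containers do not help, because they start in parallel for multiple instances of the pod; they only ensure that the main container of each instance starts after its own init container.
The problem only occurs on the first initialization, because afterwards I can configure the rolling update strategy to update only one container at a time. When scaling out I would have to go in increments of 1, but that would be a manual process anyway.
I could make sure that the deployment descriptor for a new deployment uses a scale of 1 and then update it to a scale of 2, but that leads to a very complicated automated deployment process: the scale depends on state, and the build chain has to check whether the deployment already exists in order to decide whether this is an update or a first-time deployment. That is error-prone and overly complicated.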

Answer 1

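In my opinion there are two possible solutions, but both require additional effort.
I will describe both solutions and you can choose the one that suits you best.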

Solution 1: Using a script to deploy the application

You can deploy your application with the following script:

$ cat deploy.sh
#!/bin/bash

# Usage: deploy.sh DEPLOYMENT_FILENAME NAMESPACE NUMBER_OF_REPLICAS

deploymentFileName=$1   # Deployment manifest file name
namespace=$2            # Namespace where app should be deployed
replicas=$3             # Number of replicas that should be deployed

# First deploy ONLY one replica - sed command changes actual number of replicas to 1 but the original manifest file remains intact
sed -E 's/replicas: [0-9]+$/replicas: 1/' "${deploymentFileName}" | kubectl apply -n "${namespace}" -f -

# The "until" loop waits until the first replica is ready
# Check deployment rollout status every 10 seconds (max 10 minutes) until complete.
attempts=0
rollout_cmd="kubectl rollout status -f ${deploymentFileName} -n $namespace"
until $rollout_cmd || [ $attempts -eq 60 ]; do
    attempts=$((attempts + 1))
    sleep 10
done

if [ $attempts -eq 60 ]; then
    echo "ERROR: Timeout"
    exit 1
fi

# With the first replica running, deploy the rest unless we want to deploy only one replica
if [ $replicas -ne 1 ]; then
    kubectl scale -f ${deploymentFileName} -n $namespace --replicas=${replicas}
fi

I created a simple example to illustrate how it works.

First, I created the web-app.yml Deployment manifest file:

$ kubectl create deployment web-app --image=nginx --replicas=3 --dry-run=client -oyaml > web-app.yml

Then I deployed the web-app Deployment using the deploy.sh script:

$ ./deploy.sh web-app.yml default 3
deployment.apps/web-app created
Waiting for deployment "web-app" rollout to finish: 0 of 1 updated replicas are available...
deployment "web-app" successfully rolled out
deployment.apps/web-app scaled

From another terminal window you can see that the remaining replicas started only after the first replica (web-app-5cd54cb75-krgtc) was in the "Running" state:

$ kubectl get pod -w
NAME                      READY   STATUS             RESTARTS   AGE
web-app-5cd54cb75-krgtc   0/1     Pending             0          0s
web-app-5cd54cb75-krgtc   0/1     Pending             0          0s
web-app-5cd54cb75-krgtc   0/1     ContainerCreating   0          0s
web-app-5cd54cb75-krgtc   1/1     Running             0          4s # First replica in the "Running state"
web-app-5cd54cb75-tmpcn   0/1     Pending             0          0s
web-app-5cd54cb75-tmpcn   0/1     Pending             0          0s
web-app-5cd54cb75-sstg6   0/1     Pending             0          0s
web-app-5cd54cb75-tmpcn   0/1     ContainerCreating   0          0s
web-app-5cd54cb75-sstg6   0/1     Pending             0          0s
web-app-5cd54cb75-sstg6   0/1     ContainerCreating   0          0s
web-app-5cd54cb75-tmpcn   1/1     Running             0          5s
web-app-5cd54cb75-sstg6   1/1     Running             0          7s
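
As a quick check of the substitution step, the sed expression from deploy.sh can be run on its own, without kubectl. This is a minimal sketch: the manifest content and the temporary file are made up for illustration and are not part of the original answer.

```shell
#!/bin/bash
# Hypothetical sample manifest, written to a temporary file for illustration only.
manifest=$(mktemp)
trap 'rm -f "$manifest"' EXIT
cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
EOF

# Same substitution as in deploy.sh, but captured instead of piped to kubectl.
rendered=$(sed -E 's/replicas: [0-9]+$/replicas: 1/' "$manifest")

echo "$rendered" | grep 'replicas:'    # the rendered copy asks for one replica
grep 'replicas:' "$manifest"           # the file on disk still asks for three
```

Note that the pattern is anchored to the end of the line, so a replicas line with trailing whitespace or a trailing comment would not be rewritten.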

Solution 2: Using initContainers

You can use an init container that runs a script to determine which Pod should start first:

$ cat checker.sh
#!/bin/bash

labelName="app" # label key of your application
labelValue="web" # label value of your application
hostname=$(hostname) 
apt update && apt install -y jq 1>/dev/null 2>&1 # install the jq program - command-line JSON processor
startFirst=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/?labelSelector=${labelName}%3D${labelValue}&limit=500" | jq '.items[].metadata.name' | sort | head -n1 | tr -d '""') # determine which Pod should start first -> kubectl get pods -l app=web -o=name | sort | head -n1

firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first

attempts=0
if [ "${hostname}" != "${startFirst}" ]
then
    while [ "${firstPodStatusChecker}" != "Running" ] && [ $attempts -lt 60 ]; do
        attempts=$((attempts + 1))
        sleep 5
        firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first
    done
fi

if [ $attempts -eq 60 ]; then
    echo "ERROR: Timeout"
    exit 1
fi

The most important line in this script is:

startFirst=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/?labelSelector=${labelName}%3D${labelValue}&limit=500" | jq '.items[].metadata.name' | sort | head -n1 | tr -d '""')

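This line determines which Pod should start first; the remaining replicas wait until that Pod is running. I am using the curl command to access the API from within a Pod. There is no need to build complex curl commands by hand, because a kubectl command is easy to convert into a curl command: if you run kubectl with the -v=10 option, you can see the curl requests.

NOTE: With this approach you need to give the ServiceAccount appropriate permissions to communicate with the API. For example, you can add the view role to the ServiceAccount like this: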

$ kubectl create clusterrolebinding --serviceaccount=default:default --clusterrole=view default-sa-view-access
clusterrolebinding.rbac.authorization.k8s.io/default-sa-view-access created

You can mount this checker.sh script, for example, as a ConfigMap:

$ cat check-script-configmap.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: check-script
data:
  checkScript.sh: |
    #!/bin/bash

    labelName="app" # label key of your application
    labelValue="web" # label value of your application
    hostname=$(hostname) 
    apt update && apt install -y jq 1>/dev/null 2>&1 # install the jq program - command-line JSON processor
    startFirst=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/?labelSelector=${labelName}%3D${labelValue}&limit=500" | jq '.items[].metadata.name' | sort | head -n1 | tr -d '""') # determine which Pod should start first -> kubectl get pods -l app=web -o=name | sort | head -n1

    firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first

    attempts=0  
    if [ "${hostname}" != "${startFirst}" ]
    then
        while [ "${firstPodStatusChecker}" != "Running" ] && [ $attempts -lt 60 ]; do
            attempts=$((attempts + 1))
            sleep 5
            firstPodStatusChecker=$(curl -sSk -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" "https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods/${startFirst}"| jq '.status.phase' | tr -d '""') # check status of the Pod that should start first
        done
    fi

    if [ $attempts -eq 60 ]; then
        echo "ERROR: Timeout"
        exit 1
    fi

I also created a simple example to illustrate how it works.

First, I created the check-script ConfigMap shown above:

$ kubectl apply -f check-script-configmap.yml
configmap/check-script created

Then I mounted this ConfigMap into the initContainer and deployed the Deployment:

$ cat web.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      volumes:
        - name: check-script
          configMap:
            name: check-script
      initContainers:
      - image: nginx
        name: init
        command: ["bash", "/mnt/checkScript.sh"]
        volumeMounts:
        - name: check-script
          mountPath: /mnt/

      containers:
      - image: nginx
        name: nginx

$ kubectl apply -f web.yml
deployment.apps/web created

From another terminal window you can see that the remaining replicas started only after the first replica (web-98c4d45dd-6zcsd) was in the "Running" state:

$ kubectl get pod -w
NAME                  READY   STATUS    RESTARTS   AGE
web-98c4d45dd-ztjlf   0/1     Pending   0          0s
web-98c4d45dd-ztjlf   0/1     Pending   0          0s
web-98c4d45dd-6zcsd   0/1     Pending   0          0s
web-98c4d45dd-mc2ww   0/1     Pending   0          0s
web-98c4d45dd-6zcsd   0/1     Pending   0          0s
web-98c4d45dd-mc2ww   0/1     Pending   0          0s
web-98c4d45dd-ztjlf   0/1     Init:0/1   0          0s
web-98c4d45dd-6zcsd   0/1     Init:0/1   0          0s
web-98c4d45dd-mc2ww   0/1     Init:0/1   0          1s
web-98c4d45dd-6zcsd   0/1     Init:0/1   0          3s
web-98c4d45dd-ztjlf   0/1     Init:0/1   0          3s
web-98c4d45dd-mc2ww   0/1     Init:0/1   0          4s
web-98c4d45dd-6zcsd   0/1     PodInitializing   0          10s
web-98c4d45dd-6zcsd   1/1     Running           0          12s
web-98c4d45dd-mc2ww   0/1     PodInitializing   0          23s
web-98c4d45dd-ztjlf   0/1     PodInitializing   0          23s
web-98c4d45dd-mc2ww   1/1     Running           0          25s
web-98c4d45dd-ztjlf   1/1     Running           0          26s
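
The reason web-98c4d45dd-6zcsd went first is the sort | head -n1 step in checker.sh: the Pod whose name sorts first is elected. The sketch below replays only that step offline, using the three Pod names from the output above in place of the API response. Setting LC_ALL=C is an addition made here so that the ordering does not depend on the locale; the original script does not set it.

```shell
#!/bin/bash
# Pod names copied from the example watch output above (stand-in for the API response).
pods="web-98c4d45dd-ztjlf
web-98c4d45dd-6zcsd
web-98c4d45dd-mc2ww"

# Elect the Pod whose name sorts first; LC_ALL=C is an assumption added for determinism.
startFirst=$(printf '%s\n' "$pods" | LC_ALL=C sort | head -n1)
echo "Pod elected to start first: ${startFirst}"

# Each replica compares its own name with the elected one, as checker.sh does with $(hostname).
for pod in $pods; do
    if [ "$pod" = "$startFirst" ]; then
        echo "$pod: starts immediately"
    else
        echo "$pod: waits for $startFirst"
    fi
done
```

Because every replica sorts the same list, they should all agree on the same Pod as long as they see the same set of names when they query the API.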
