Volume node affinity conflicts in Argo workflows

I have an Argo workflow with two steps: the first runs on Linux and the second runs on Windows.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: my-workflow-v1.13
spec:
  entrypoint: process
  volumeClaimTemplates:
    - metadata:
        name: workdir
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
  arguments:
    parameters:
      - name: jobId
        value: 0
  templates:
    - name: process
      steps:
        - - name: prepare
            template: prepare
        - - name: win-step
            template: win-step

    - name: win-step
      nodeSelector:
        kubernetes.io/os: windows
      container:
        image: mcr.microsoft.com/windows/nanoserver:1809
        command: ["cmd", "/c"]
        args: ["dir", "C:\\workdir\\source"]
        volumeMounts:
          - name: workdir
            mountPath: /workdir

    - name: prepare
      nodeSelector:
        kubernetes.io/os: linux
      inputs:
        artifacts:
          - name: src
            path: /opt/workdir/source.zip
            s3:
              endpoint: minio:9000
              insecure: true
              bucket: "{{workflow.parameters.jobId}}"
              key: "source.zip"
              accessKeySecret:
                name: my-minio-cred
                key: accesskey
              secretKeySecret:
                name: my-minio-cred
                key: secretkey
      script:
        image: garthk/unzip:latest
        imagePullPolicy: IfNotPresent
        command: [sh]
        source: |
          unzip /opt/workdir/source.zip -d /opt/workdir/source
        volumeMounts:
          - name: workdir
            mountPath: /opt/workdir

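The two steps share a volume.

To make this work in Azure Kubernetes Service, I had to create two node pools, one for Linux nodes and another for Windows nodes.

The problem is that when I queue the workflow, sometimes it completes, and sometimes win-step (the step that runs in the Windows container) hangs or fails with this message: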

1 node(s) had volume node affinity conflict

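I have read that this can happen because the volume is provisioned in a specific zone, while the Windows container (because it runs in a different pool) is scheduled in a different zone that has no access to that volume, but I could not find a solution.

Please help.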

"the first runs on Linux and the second runs on Windows"

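I doubt you can mount the same volume on both Linux nodes (which typically use the ext4 file system) and Windows nodes (Azure Windows containers use the NTFS file system).

So the volume you are trying to mount in the second step lives on a node pool that does not match your nodeSelector.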
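
One possible way around this, offered as an untested sketch: back the shared volume with a network file share that both operating systems can mount from any zone, instead of a zonal disk. The fragment below changes only the volumeClaimTemplates section under spec in your template. The storage class name azurefile is an assumption (check what your cluster offers with kubectl get storageclass), and so is the idea that this class supports ReadWriteMany and can be mounted by both your Linux and Windows pools.

```yaml
volumeClaimTemplates:
  - metadata:
      name: workdir
    spec:
      # Assumption: "azurefile" is a file-share-backed storage class in this cluster.
      storageClassName: azurefile
      # ReadWriteMany lets pods on different nodes mount the same claim.
      accessModes: [ "ReadWriteMany" ]
      resources:
        requests:
          storage: 1Gi
```

If no such storage class is available, passing the unzipped files from prepare to win-step as an Argo artifact instead of through a volume might avoid the volume affinity question altogether.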