Cannot apply inter-pod affinity to Airflow scheduler

Question

I'm seeing strange behavior when I try to attach podAffinity to the scheduler deployment from the official Airflow Helm chart, for example:

affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - postgresql
      topologyKey: "kubernetes.io/hostname"

The sample deployment that the podAffinity should "hook up to" looks like this:

metadata:
  name: {{ template "postgresql.fullname" . }}
  labels:
    app: postgresql
    chart: {{ template "postgresql.chart" . }}
    release: {{ .Release.Name | quote }}
    heritage: {{ .Release.Service | quote }}
spec:
  serviceName: {{ template "postgresql.fullname" . }}-headless
  replicas: 1
  selector:
    matchLabels:
      app: postgresql
      release: {{ .Release.Name | quote }}
  template:
    metadata:
      name: {{ template "postgresql.fullname" . }}
      labels:
        app: postgresql
        chart: {{ template "postgresql.chart" . }}

This results in:

NotTriggerScaleUp: pod didn't trigger scale-up: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't match pod affinity rules

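However, applying the same podAffinity config to the webserver deployment works just fine. Also, swapping the sample Deployment for a vanilla nginx one produces the same result.

This does not look like a resource limitation issue, since I have tried various configurations and got the same result every time. I don't use any custom configuration apart from node affinity.

Has anyone run into the same thing, or does anyone know what I might be doing wrong?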

Setup:

AKS cluster

Airflow Helm chart 1.1.0

Airflow 1.10.15 (but I don't think this matters)

kubectl client (1.22.1) and server (1.20.7)

Links to the Airflow chart:

Scheduler

Webserver

Answer 1

I recreated this scenario on my GKE cluster and decided to post a Community Wiki answer to show that podAffinity on the scheduler works as expected. Below I describe, step by step, how I tested it.

In the values.yaml file, I configured podAffinity as follows:

$ cat values.yaml
...
# Airflow scheduler settings
scheduler:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - postgresql
        topologyKey: "kubernetes.io/hostname"
...

I installed Airflow on the Kubernetes cluster using the Helm package manager, passing the values.yaml file above:

$ helm install airflow apache-airflow/airflow --values values.yaml

After a while, we can check the status of the scheduler:

$ kubectl get pods -owide | grep "scheduler"
airflow-scheduler-79bfb664cc-7n68f   0/2     Pending   0          8m6s   <none>      <none>                                 <none>           <none>
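
The scheduler stays Pending here presumably because the rule is a hard requirement (requiredDuringSchedulingIgnoredDuringExecution) and no pod carrying the app: postgresql label is running yet. If a soft preference is acceptable, the preferred variant lets the pod be scheduled even when no matching pod exists. The following is only a sketch that I did not test in this walkthrough; the weight value is an arbitrary choice:

```yaml
# values.yaml (sketch, untested here): soft pod affinity for the scheduler
scheduler:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100              # 1-100; arbitrary value chosen for illustration
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - postgresql
          topologyKey: "kubernetes.io/hostname"
```

With this form the Kubernetes scheduler only favors nodes that already run a matching pod, so co-location is not guaranteed.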

I created a sample Deployment with the app: postgresql label:

$ cat test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: postgresql
  name: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      containers:
      - image: nginx
        name: nginx
        
$ kubectl apply -f test.yaml
deployment.apps/test created

$ kubectl get pods --show-labels | grep test
test-7d4c9c654-7lqns                 1/1     Running   0          2m   app=postgresql,...

Finally, we can verify that the scheduler was created successfully:

$ kubectl get pods -o wide | grep "scheduler\|test"
airflow-scheduler-79bfb664cc-7n68f   2/2     Running   0          14m     10.X.1.6    nodeA     
test-7d4c9c654-7lqns                 1/1     Running   0          2m27s   10.X.1.5    nodeA
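
The question mentions that node affinity is also in use. Required node affinity and required pod affinity must both be satisfied by the same node, so if the pod labelled app: postgresql runs on a node that the scheduler's node affinity excludes, no node can qualify. I did not test this case; the sketch below only illustrates such a combination, and the agentpool key and airflow value are hypothetical placeholders:

```yaml
# values.yaml (sketch): both rules must hold for the same node
scheduler:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: agentpool       # hypothetical node label key
            operator: In
            values:
            - airflow            # hypothetical node label value
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - postgresql
        topologyKey: "kubernetes.io/hostname"
```

If the PostgreSQL pod sits on a node outside the selected pool, the scheduler pod would remain Pending with a message similar to the one in the question.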

Additionally, details on pod affinity and pod anti-affinity can be found in the Understanding pod affinity documentation:

Pod affinity and pod anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled on based on the key/value labels on other pods.

Pod affinity can tell the scheduler to locate a new pod on the same node as other pods if the label selector on the new pod matches the label on the current pod.

Pod anti-affinity can prevent the scheduler from locating a new pod on the same node as pods with the same labels if the label selector on the new pod matches the label on the current pod.
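
One detail worth checking: by default, the label selector of a pod affinity term is matched only against pods in the same namespace as the pod being scheduled. If the PostgreSQL pod lives in a different namespace than Airflow, the term can list that namespace explicitly. This is a sketch under that assumption; the database namespace name is hypothetical:

```yaml
# values.yaml (sketch): match pods in another namespace
scheduler:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - postgresql
        namespaces:
        - database               # hypothetical namespace of the PostgreSQL pod
        topologyKey: "kubernetes.io/hostname"
```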
