Node affinity ignored for a node

I have 3 nodes, each labeled as follows:

node-0: mongo-volume=volume-0
node-1: mongo-volume=volume-1
node-2: mongo-volume=volume-2

I'm looking for a way to schedule the replicas of a StatefulSet on specific nodes.

I first tried the hard way with requiredDuringSchedulingIgnoredDuringExecution, and everything worked fine.
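For reference, the hard variant would look roughly like this (a minimal sketch, assuming the same mongo-volume label key used in the preferred configuration below):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: mongo-volume
          operator: In
          values:
          - volume-0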

Then I wanted to test the soft way with preferredDuringSchedulingIgnoredDuringExecution.

I first told my StatefulSet to prefer the node labeled volume-0; no problem, the pods were all deployed on node-0.

Then I changed the preference to the node labeled volume-1. And here is my problem: the pods get deployed on node-0 and node-2, but never on node-1.

I did the same thing with the label volume-2, and it worked fine again: the pods were all deployed on node-2.

The node affinity configuration:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
        - key: mongo-volume
          operator: In
          values:
          - volume-1  

When I looked at the nodes' resource usage, I noticed that node-1 was slightly more loaded than the others. Could that explain why the scheduler refuses to deploy pods on it?

NAME    CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-0   63m          6%     795Mi           41%
node-1   116m         11%    978Mi           51%
node-2   78m          7%     752Mi           39%

I'd like to understand why it works for node-0 and node-2 but not for node-1, and whether there is a way to fix it.

A preferred affinity policy tells the scheduler to prefer running the pod on matching nodes; it does not select a node directly.

The weight of an affinity term sets the priority of that term relative to the others. For example:

podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100   # strong preference: avoid nodes already running pods labeled k1=v1
    podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: k1
          operator: In
          values:
          - "v1"
      topologyKey: kubernetes.io/hostname
  - weight: 30    # weaker preference: avoid nodes already running pods labeled k2=v2
    podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: k2
          operator: In
          values:
          - "v2"
      topologyKey: kubernetes.io/hostname
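With these two terms, a node already running pods that match k1=v1 is penalized with weight 100, a node matching k2=v2 with weight 30, and a node matching both with 130; the scheduler then normalizes these weighted sums and combines them with the scores from its other plugins. A weight therefore only biases the ranking; it never guarantees (or forbids) a placement.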

The K8s scheduler docs say:

kube-scheduler selects a node for the pod in a 2-step operation:

  1. Filtering

  2. Scoring

The filtering step finds the set of Nodes where it’s feasible to schedule the Pod. For example, the PodFitsResources filter checks whether a candidate Node has enough available resource to meet a Pod’s specific resource requests. After this step, the node list contains any suitable Nodes; often, there will be more than one. If the list is empty, that Pod isn’t (yet) schedulable.

In the scoring step, the scheduler ranks the remaining nodes to choose the most suitable Pod placement. The scheduler assigns a score to each Node that survived filtering, basing this score on the active scoring rules.

Finally, kube-scheduler assigns the Pod to the Node with the highest ranking. If there is more than one node with equal scores, kube-scheduler selects one of these at random.

Affinity is one part of what the scheduler considers, but not all of it.
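Concretely, the node-affinity preference is only one scoring plugin among several; resource-based plugins also score nodes, so node-1's higher utilization can out-score a weight-100 preference. If you control the control plane, one way to make the preference count for more is to raise the NodeAffinity plugin's weight in the scheduler's scoring configuration. A minimal sketch, assuming you can pass a configuration file to kube-scheduler via its --config flag (the weight of 10 is an arbitrary illustration, not a recommended value):

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  plugins:
    score:
      enabled:
      # re-declare the NodeAffinity score plugin with a higher weight
      # so preferred node affinity outweighs the other score plugins
      - name: NodeAffinity
        weight: 10

If the replicas absolutely must land on the labeled node, the reliable fix remains the requiredDuringSchedulingIgnoredDuringExecution variant you already validated.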