Node selector or taint: which takes precedence?

Question

We have an AKS cluster that currently has one system node pool and two user node pools (usnp1 and usnp2). Multiple application pods are running across the user node pools.

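We now need to run one of our existing applications in its own dedicated node pool and namespace. For example, our application myapp currently runs in the namespace "all-app-ns" with its node selector set to "usnp1", and other application pods run in the same pool. We therefore need to move the myapp pods and all dependent components to a new namespace dedicated to them, "myapp-ns", and they should be scheduled only on "myapp-pool".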

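1. "myapp-pool" should not be assigned any pods other than myapp. Which option takes precedence here: a node selector on the pods, or a taint? I read that a nodeSelector forces the scheduler, so pods "should be assigned" to specific nodes, whereas with a taint they "can be assigned". So will nodeSelector be the better option?
2. Since the myapp deployment and pods are already running in "all-app-ns", will moving them to the new namespace "myapp-ns" delete the existing myapp pods in all-app-ns? Will that cause downtime? We currently deploy the application with a Helm chart. Will the Helm state delete the old pods and create new ones, and will any downtime happen?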

Answer 1

... so nodeselector will be a better option?

You can use nodeSelector, or you can use nodeAffinity. It is just a matter of configuration. nodeSelector is a hard constraint: a pod with, for example,

nodeSelector:
  usnp1: "true"

can only be scheduled onto nodes labelled usnp1=true. You can define the same constraint with nodeAffinity in the pod configuration, for example

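apiVersion: v1
kind: Pod
metadata:
  name: <your-pod-name>
spec:
  # The containers section is omitted here; only the scheduling constraint is shown.
  affinity:
    nodeAffinity:
      # Hard requirement at scheduling time; pods already running are not evicted.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: usnp1
            operator: In
            values:
            - "true"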

These two configurations are equivalent.

... and any downtime will happen?

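If I understand correctly, you need to move the myapp pods that are currently deployed in all-app-ns to myapp-ns. In that case, if you deploy to a different namespace (myapp-ns in your case), the pods in all-app-ns will not be undeployed. I think this is because helm install requires the --namespace option unless you have already switched namespaces with kubectl config set-context. So, to undeploy the old release, you have to run helm uninstall <RELEASE> --namespace=all-app-ns. The availability of the application depends on your DNS records, so if the application needs to be exposed you will probably have to reconfigure them.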

In answer to the follow-up questions:

so for point 1, if we have any pods without a nodeSelector defined, there is a chance that those pods will be allocated to the new node pool in this case, right? The aim here is not to allow the new node pool to have any pods other than the myapp pods. Will a taint help in this scenario more than nodeSelector, or an admission controller called "podnodeselector"?

Use the nodeAffinity configuration described above in the myapp pod configuration.

Add a label to the myapp pod configuration, for example

  template:
    metadata:
      labels:
        usnp1: myapp

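For any other pods that you do not want scheduled on that node, or better said, that should not be scheduled alongside pods labelled usnp1: myapp, create a podAntiAffinity configuration, for example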

  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: usnp1
            operator: In
            values:
            - myapp
        topologyKey: "kubernetes.io/hostname"

See https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/

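In any case, this is not a 100% solution either, because pod scheduling is a complex algorithm with many rules, and it is score-weighted. You can see the scores and weights in the scheduler's logs.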

See https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/
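Since the question also asks about taints, here is a minimal sketch of how a taint and a nodeSelector could be combined so that they complement each other rather than compete for precedence. A taint on the nodes repels every pod that does not tolerate it, while the nodeSelector keeps the myapp pods from landing anywhere else. The taint key and value (dedicated=myapp), the node label (pool: myapp), the app label, and the image name are all hypothetical; substitute whatever your node pool actually carries.

```yaml
# Assumption: every node in the dedicated pool has been tainted, for example with
#   kubectl taint nodes <node-name> dedicated=myapp:NoSchedule
# and carries the (hypothetical) label pool=myapp.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp-ns
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      # Attracts the pods: they can only be scheduled onto nodes with this label.
      nodeSelector:
        pool: myapp
      # Allows the pods onto the tainted nodes; pods without this toleration
      # are not scheduled there.
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "myapp"
        effect: "NoSchedule"
      containers:
      - name: myapp
        image: myregistry.example.com/myapp:1.0   # hypothetical image
```

With this arrangement the other workloads need no changes: they simply lack the toleration. Note that NoSchedule only affects new scheduling decisions, so pods already running on a node when it is tainted stay there until they are rescheduled.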

Also for point 2, we will be proceeding with helm upgrade from the pipeline, using the modified manifests with the namespace changes. In that case, does the Helm state file play a role here in deleting the old pods?

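For this, I am not aware of a Helm feature that lets you undeploy from one namespace and deploy to a second one in parallel. If I understand correctly, the release state that helm install creates is kept per namespace. So for deploy and undeploy you always need to pass --namespace unless you have already switched to that namespace. This probably means that deploying the same Helm chart to one namespace does not interfere with the release state in another.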

But I do not have much experience with Helm.

kubernetes

kubectl

kubernetes-pod

azure-aks