NodeSelector does not work for multiple node pools?
TL;DR: NodeSelector ignores nodes from another NodePool. How can I distribute pods across multiple NodePools using a label in nodeSelector, or some other technique?
I have two node pools like this:
...
# Spot node pool
resource "azurerm_kubernetes_cluster_node_pool" "aks_staging_np_compute_spot" {
  name            = "computespot"
  (...)
  vm_size         = "Standard_F8s_v2"
  max_count       = 2
  min_count       = 2
  (...)
  priority        = "Spot"
  eviction_policy = "Delete"
  (...)
  node_labels = {
    "pool_type" = "compute"
  }
}

# Regular node pool
resource "azurerm_kubernetes_cluster_node_pool" "aks_staging_np_compute_base" {
  name      = "computebase"
  (...)
  vm_size   = "Standard_F8s_v2"
  max_count = 2
  min_count = 2
  node_labels = {
    "pool_type" = "compute"
  }
}
Both pools are deployed in AKS and all of their nodes are in an OK state. Note two things:
- both carry the label pool_type: compute (a quick way to verify this is shown right after this list)
- both use the same VM size, Standard_F8s_v2
(It doesn't matter that there are about 20 other nodes with different labels in my cluster.)
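As a quick sanity check, you can list exactly the nodes that match the selector and see which pool each one belongs to; agentpool is the pool-name label AKS puts on its nodes (if your cluster uses a different label, adjust the -L columns):

# Nodes matching the deployment's selector, with their AKS pool name shown
kubectl get nodes -l pool_type=compute -L agentpool,pool_type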
Then I have a deployment like this (irrelevant lines omitted for brevity):
apiVersion: apps/v1
kind: Deployment
metadata:
  (...)
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  template:
    (...)
    spec:
      nodeSelector:
        pool_type: compute
      (...)
      containers:
      (...)
There is also an entry in tolerations to accept the Azure spot instances. It evidently works.
tolerations:
- key: "kubernetes.azure.com/scalesetpriority"
  operator: "Equal"
  value: "spot"
  effect: "NoSchedule"
The problem is that the application only ever gets deployed onto one node pool ("computespot" in this case) and never touches the other one (computebase), even though the labels and VM sizes of the individual nodes are identical.

- 2 pods are running on the computespot nodes, one per node.
- The remaining pods are not scheduled and fail with the classic error:

0/24 nodes are available: 14 Insufficient cpu, 17 Insufficient memory, 4 node(s) didn't match node selector.

Which is definitely a lie, because I can see the computebase nodes sitting there empty.
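A couple of commands help to see the scheduler's full reasoning in a situation like this (the pod and node names below are placeholders):

# Scheduling events for one of the pending pods
kubectl describe pod <pending-myapp-pod>

# Allocatable vs. already-requested resources on one of the
# supposedly empty computebase nodes
kubectl describe node <computebase-node-name> | grep -A 10 "Allocated resources"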
How can I fix this?
I found a solution using node affinity.
spec:
  # This didn't work:
  #
  # nodeSelector:
  #   pool_type: compute
  #
  # But this does:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: pool_type
            operator: In
            values:
            - compute
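To confirm that the replicas now spread across both pools, you can check which node each pod landed on (app=myapp is the label from the deployment above):

# -o wide adds a NODE column, so you can see pods on both
# computespot and computebase nodes
kubectl get pods -l app=myapp -o wide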
I don't know the reason, since we are still dealing with a single label. If anyone knows, please share.