Kops kubenet cluster autoscaling not working
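I have a kops cluster with a maximum of 75 nodes, and I added the cluster autoscaler. It uses kubenet networking.
Things have now stopped working: scale-down no longer happens.
The cluster is running at maximum capacity, i.e. 75 nodes, even though there is almost no load. I'm not sure where to start troubleshooting.
I see the following in the cluster autoscaler pod logs: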
I0222 01:45:14.327164 1 static_autoscaler.go:97] Starting main loop
W0222 01:45:14.770818 1 static_autoscaler.go:150] Cluster is not ready for autoscaling
I0222 01:45:15.043126 1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0222 01:45:17.121507 1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0222 01:45:19.126665 1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0222 01:45:21.327581 1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0222 01:45:23.331802 1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0222 01:45:24.775124 1 static_autoscaler.go:97] Starting main loop
W0222 01:45:25.085442 1 static_autoscaler.go:150] Cluster is not ready for autoscaling
Autoscaling was working fine before this.
Update: I also see the following errors when running kops validate cluster:
VALIDATION ERRORS
KIND NAME MESSAGE
Node ip-172-20-32-173.ec2.internal node "ip-172-20-32-173.ec2.internal" is not ready
...
I0221 22:16:02.688911 2403 node_conditions.go:60] node "ip-172-20-51-238.ec2.internal" not ready: &NodeCondition{Type:NetworkUnavailable,Status:True,LastHeartbeatTime:2019-02-21 22:15:56 -0500 EST,LastTransitionTime:2019-02-21 22:15:56 -0500 EST,Reason:NoRouteCreated,Message:RouteController failed to create a route,}
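When many nodes are reported as not ready, it can help to pull the condition type and reason out of each line and tally them. The following is a minimal sketch, not part of kops: the regular expression is inferred only from the single sample line above, and the names CONDITION_RE and parse_not_ready are hypothetical.

```python
import re

# Pattern inferred from the one node_conditions.go sample line above;
# the exact field layout is an assumption and may differ between versions.
CONDITION_RE = re.compile(
    r'node "(?P<node>[^"]+)" not ready: '
    r'&NodeCondition\{Type:(?P<type>\w+),Status:(?P<status>\w+),'
    r'.*?Reason:(?P<reason>\w+),Message:(?P<message>[^,}]*)'
)


def parse_not_ready(line):
    """Return node, type, status, reason and message as a dict, or None if no match."""
    match = CONDITION_RE.search(line)
    return match.groupdict() if match else None
```

Applied to the line above, this yields the node name together with Type NetworkUnavailable and Reason NoRouteCreated, which points at route creation rather than at the autoscaler itself.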
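I figured out that the problem was that my cluster had gotten into an unhealthy state because of this limitation in AWS VPC route tables. My cluster had scaled up to 75 nodes, then became unhealthy and was unable to scale down.
From the link: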
One important limitation when using kubenet networking is that an AWS routing table cannot have more than 50 entries, which sets a limit of 50 nodes per cluster.
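The arithmetic behind that limit can be checked before setting the autoscaler's maximum. This is a rough sketch under stated assumptions: it takes the 50-entry figure from the quoted passage, assumes each node consumes one route, and uses hypothetical names (ROUTE_LIMIT, reserved_routes, max_nodes_for_route_table, fits) that do not come from kops or AWS tooling.

```python
# 50 entries per route table, as stated in the quoted passage.
# Treat this as an assumption and check the actual quota for your account.
ROUTE_LIMIT = 50


def max_nodes_for_route_table(route_limit=ROUTE_LIMIT, reserved_routes=0):
    """Nodes that fit, assuming one route per node.

    reserved_routes is a hypothetical count of entries already used by
    non-node routes in the same table.
    """
    return max(route_limit - reserved_routes, 0)


def fits(max_nodes, route_limit=ROUTE_LIMIT, reserved_routes=0):
    """True if the configured maximum node count stays within the route budget."""
    return max_nodes <= max_nodes_for_route_table(route_limit, reserved_routes)
```

With these assumptions, fits(75) is False while fits(50) is True, which matches what happened here: the cluster scaled past the number of routes the table could hold, and the extra nodes never became ready.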