Google Cloud GKE horizontal pod autoscaling based on Kubernetes metrics
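I want to use the standard Kubernetes metric for pod network received bytes in an HPA. I am using the YAML below to do that, but I get errors like: unable to get metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered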
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: xxxxx
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: xxxx-xxx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: received_bytes_count
      targetAverageValue: 20k
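If anyone has experience using this kind of metric, that would be very helpful.
autoscaling/v1 is an API that autoscales based on CPU utilization only. So, to autoscale based on other metrics, you should use autoscaling/v2beta2. I recommend reading this doc to check the API versions.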
Solution
To make it work, you need to deploy the Stackdriver Custom Metrics Adapter. Use the commands below to deploy it.
$ kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"
$ kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml
After that, you need to use the proper custom metric name, which in your case should be kubernetes.io|pod|network|received_bytes_count
Description
The Custom and external metrics for autoscaling workloads documentation states that you need to deploy the Stackdriver adapter before you can use custom metrics:
Before you can use custom metrics, you must enable Monitoring in your Google Cloud project and install the Stackdriver adapter on your cluster.
The next step is to deploy your application (I used an Nginx deployment for testing) and create the proper HPA.
Your HPA example has a few issues:
apiVersion: autoscaling/v2beta1 # autoscaling/v2beta2 offers more features, but v2beta1 is fine for this scenario
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: xxxxx # the HPA specifies a namespace, but the Deployment does not
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1 # apps/v1beta1 is quite old; in Kubernetes 1.16+ it was replaced by apps/v1
    kind: Deployment
    name: xxxx-xxx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: received_bytes_count # this metric should be replaced with kubernetes.io|pod|network|received_bytes_count
      targetAverageValue: 20k
In GKE you can choose between autoscaling/v2beta1 and autoscaling/v2beta2. Your case will work with both apiVersions, but if you decide to use autoscaling/v2beta2, you need to change the manifest syntax.
Why kubernetes.io/pod/network/received_bytes_count?
You are referring to Kubernetes metrics, and /pod/network/received_bytes_count is listed in these docs.
Why | instead of /? If you check the Stackdriver documentation on GitHub, you will find this explanation:
Stackdriver metrics have a form of paths separated by "/" character, but Custom Metrics API forbids using "/" character. When using Custom Metrics - Stackdriver Adapter either directly via Custom Metrics API or by specifying a custom metric in HPA, replace "/" character with "|". For example, to use custom.googleapis.com/my/custom/metric, specify custom.googleapis.com|my|custom|metric.
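The substitution rule quoted above is mechanical, so it can be sketched as a small helper. This is only an illustration in Python; the function names are hypothetical and are not part of the adapter or of any Kubernetes client library.

```python
def to_adapter_metric_name(stackdriver_metric: str) -> str:
    """Turn a Stackdriver metric path into the form the Custom Metrics API accepts.

    The Custom Metrics API forbids "/", so every "/" becomes "|".
    """
    return stackdriver_metric.replace("/", "|")


def to_stackdriver_metric_name(adapter_metric: str) -> str:
    """Reverse mapping, handy when looking a metric up in the Monitoring docs."""
    return adapter_metric.replace("|", "/")


# The metric from this question, written as a Stackdriver path.
hpa_metric_name = to_adapter_metric_name(
    "kubernetes.io/pod/network/received_bytes_count"
)
print(hpa_metric_name)
```

The printed value is the string to put into metricName (v2beta1) or metric.name (v2beta2).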
Proper configuration
For v2beta1
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
spec:
  scaleTargetRef:
    apiVersion: apps/v1 # in your case this would be apps/v1beta1, but my Deployment was created with apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: "kubernetes.io|pod|network|received_bytes_count"
      targetAverageValue: 20k
For v2beta2
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metric:
        name: "kubernetes.io|pod|network|received_bytes_count"
      target:
        type: AverageValue
        averageValue: 20k
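The only difference between the two manifests above is the shape of the Pods metric entry. As a rough sketch of that mapping, the hypothetical Python helper below rewrites a v2beta1 entry, represented as a plain dict, into the v2beta2 layout. It covers only the two fields used in this answer (metricName and targetAverageValue) and is not an official migration tool.

```python
def pods_metric_v2beta1_to_v2beta2(entry: dict) -> dict:
    """Rewrite one `type: Pods` metric entry from v2beta1 to v2beta2 layout."""
    if entry.get("type") != "Pods":
        raise ValueError("only Pods metric entries are handled in this sketch")
    pods = entry["pods"]
    return {
        "type": "Pods",
        "pods": {
            # v2beta1 `metricName` moves under `metric.name`
            "metric": {"name": pods["metricName"]},
            # v2beta1 `targetAverageValue` becomes an AverageValue target
            "target": {
                "type": "AverageValue",
                "averageValue": pods["targetAverageValue"],
            },
        },
    }


old_entry = {
    "type": "Pods",
    "pods": {
        "metricName": "kubernetes.io|pod|network|received_bytes_count",
        "targetAverageValue": "20k",
    },
}
new_entry = pods_metric_v2beta1_to_v2beta2(old_entry)
print(new_entry)
```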
Test output
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    SucceededRescale  the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric kubernetes.io|pod|network|received_bytes_count
  ScalingLimited  True    TooFewReplicas    the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age                From                       Message
  ----    ------             ----               ----                       -------
  Normal  SuccessfulRescale  8m18s              horizontal-pod-autoscaler  New size: 4; reason: pods metric kubernetes.io|pod|network|received_bytes_count above target
  Normal  SuccessfulRescale  8m9s               horizontal-pod-autoscaler  New size: 6; reason: pods metric kubernetes.io|pod|network|received_bytes_count above target
  Normal  SuccessfulRescale  17s                horizontal-pod-autoscaler  New size: 5; reason: All metrics below target
  Normal  SuccessfulRescale  9s (x2 over 8m55s) horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
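The rescale events above follow from the replica formula in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), bounded by minReplicas and maxReplicas. Below is a simplified Python sketch of that calculation. It ignores the controller's tolerance and stabilization behaviour, the function name is hypothetical, and the traffic figures are invented for illustration rather than taken from the test run. The target 20k is the Kubernetes quantity for 20000.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_average: float,
    target_average: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Simplified HPA calculation for a Pods metric with an AverageValue target."""
    raw = math.ceil(current_replicas * current_average / target_average)
    # The HPA never goes below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, raw))


TARGET = 20_000  # targetAverageValue: 20k

# Hypothetical per-pod averages of received bytes.
print(desired_replicas(2, 40_000, TARGET, 2, 6))  # load is twice the target
print(desired_replicas(4, 35_000, TARGET, 2, 6))  # raw result exceeds maxReplicas
print(desired_replicas(6, 1_000, TARGET, 2, 6))   # load far below the target
```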
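Potential issues with your current configuration
Your HPA specifies a namespace, but your target Deployment does not. The HPA and the Deployment must be in the same namespace. With this mismatched configuration, you might run into the following issue: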
Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: deployments/scale.apps "nginx" not found
Events:
  Type     Reason          Age                  From                       Message
  ----     ------          ----                 ----                       -------
  Warning  FailedGetScale  94s (x264 over 76m)  horizontal-pod-autoscaler  deployments/scale.apps "nginx" not found
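A quick way to catch this kind of mismatch before applying the manifests is to compare them as data. The Python sketch below is hypothetical: the helper names are made up, manifests are assumed to be already loaded into dicts, and an object without metadata.namespace is assumed to land in the current kubectl context namespace, taken here to be "default".

```python
def effective_namespace(manifest: dict, context_namespace: str = "default") -> str:
    """Namespace an object ends up in when applied without an explicit -n flag."""
    namespace = manifest.get("metadata", {}).get("namespace")
    return namespace if namespace else context_namespace


def hpa_can_find_target(hpa: dict, target: dict, context_namespace: str = "default") -> bool:
    """True if scaleTargetRef names the target and both share a namespace."""
    ref = hpa["spec"]["scaleTargetRef"]
    return (
        effective_namespace(hpa, context_namespace)
        == effective_namespace(target, context_namespace)
        and ref["kind"] == target["kind"]
        and ref["name"] == target["metadata"]["name"]
    )


deployment = {"kind": "Deployment", "metadata": {"name": "nginx"}}
hpa = {
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "xxxx-hoa", "namespace": "xxxxx"},
    "spec": {"scaleTargetRef": {"kind": "Deployment", "name": "nginx"}},
}
print(hpa_can_find_target(hpa, deployment))  # namespaces differ: xxxxx vs default
```

Dropping the namespace from the HPA, or adding the same namespace to the Deployment, makes the check pass.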
In Kubernetes 1.16+, Deployments use apiVersion: apps/v1. You will not be able to create a Deployment with apiVersion: apps/v1beta1 in Kubernetes 1.16+.