Setting up Kubernetes Autoscaling with Custom Metrics

I am trying to set up autoscaling based on custom metrics on a Kubernetes 1.2.3 (beta) cluster. (I have already tried CPU-based autoscaling on the cluster, and it worked fine.)

I tried to follow their custom metrics proposal, but I am having trouble creating the necessary setup.

This is what I have done so far:

  1. Added a custom metrics annotation to the pod spec being deployed (similar to the configuration provided in their proposal):

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: metrix
      namespace: "default"
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: metrix
          annotations:
            metrics.alpha.kubernetes.io/custom-endpoints: >
              [
                {
                  "api": "prometheus",
                  "path": "/status",
                  "port": "9090",
                  "names": ["test1"]
                },
                {
                  "api": "prometheus",
                  "path": "/metrics",
                  "port": "9090",
                  "names": ["test2"]
                }
              ]
        spec:
          containers:
          - name: metrix
            image: janaka/prometheus-ep:v1
            resources:
              requests:
                cpu: 400m
    
  2. Created a Docker container tagged janaka/prometheus-ep:v1 (local), running a Prometheus-compatible server on port 9090 with /status and /metrics endpoints

  3. Enabled custom metrics on the kubelet by appending --enable-custom-metrics=true to KUBELET_OPTS in /etc/default/kubelet (based on the kubelet CLI reference), then restarted the kubelet
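
Since the annotation in step 1 carries a JSON array inside a YAML string, a syntax slip there would not be caught by YAML parsing alone. A minimal sketch of a pre-deployment check follows. It assumes the annotation value has been copied into a Python string; `validate_endpoints` and `REQUIRED_KEYS` are hypothetical names, and the key set is taken only from the example annotation above, not from any published schema.

```python
import json

# Assumption: these are the keys used in the example annotation,
# not an official schema for custom-endpoints.
REQUIRED_KEYS = {"api", "path", "port", "names"}


def validate_endpoints(annotation: str) -> list:
    """Parse the annotation value and check the keys of each endpoint entry."""
    endpoints = json.loads(annotation)  # raises json.JSONDecodeError on bad JSON
    if not isinstance(endpoints, list):
        raise ValueError("annotation must be a JSON array")
    for i, ep in enumerate(endpoints):
        if not isinstance(ep, dict):
            raise ValueError(f"endpoint {i} must be a JSON object")
        missing = REQUIRED_KEYS - set(ep)
        if missing:
            raise ValueError(f"endpoint {i} is missing keys: {sorted(missing)}")
    return endpoints


GOOD = """
[
  {"api": "prometheus", "path": "/status", "port": "9090", "names": ["test1"]},
  {"api": "prometheus", "path": "/metrics", "port": "9090", "names": ["test2"]}
]
"""

# Same shape as one entry above, but with no comma after the "port" value.
BAD = '[{"api": "prometheus", "path": "/metrics", "port": "9090" "names": ["test2"]}]'

if __name__ == "__main__":
    print(len(validate_endpoints(GOOD)), "endpoints parsed")
    try:
        validate_endpoints(BAD)
    except json.JSONDecodeError as err:
        print("invalid JSON:", err)
```

A check like this only confirms that the value is well-formed JSON with the expected keys; it says nothing about whether the kubelet or cAdvisor actually reads the annotation.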

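All pods (in the default and kube-system namespaces) are running, and the heapster pod log does not contain any 'anomalous' output either (except for a small glitch at startup, due to InfluxDB being temporarily unavailable):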

$ kubesys logs -f heapster-daftr

I0427 05:07:45.807277       1 heapster.go:60] /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb:8086
I0427 05:07:45.807359       1 heapster.go:61] Heapster version 1.1.0-beta1
I0427 05:07:45.807638       1 configs.go:60] Using Kubernetes client with master "https://kubernetes.default" and version "v1"
I0427 05:07:45.807661       1 configs.go:61] Using kubelet port 10255
E0427 05:08:15.847319       1 influxdb.go:185] issues while creating an InfluxDB sink: failed to ping InfluxDB server at "monitoring-influxdb:8086" - Get http://monitoring-influxdb:8086/ping: dial tcp xxx.xxx.xxx.xxx:8086: i/o timeout, will retry on use
I0427 05:08:15.847376       1 influxdb.go:199] created influxdb sink with options: host:monitoring-influxdb:8086 user:root db:k8s
I0427 05:08:15.847412       1 heapster.go:87] Starting with InfluxDB Sink
I0427 05:08:15.847427       1 heapster.go:87] Starting with Metric Sink
I0427 05:08:15.877349       1 heapster.go:166] Starting heapster on port 8082
I0427 05:08:35.000342       1 manager.go:79] Scraping metrics start: 2016-04-27 05:08:00 +0000 UTC, end: 2016-04-27 05:08:30 +0000 UTC
I0427 05:08:35.035800       1 manager.go:152] ScrapeMetrics: time: 35.209696ms size: 24
I0427 05:08:35.044674       1 influxdb.go:177] Created database "k8s" on influxDB server at "monitoring-influxdb:8086"
I0427 05:09:05.000441       1 manager.go:79] Scraping metrics start: 2016-04-27 05:08:30 +0000 UTC, end: 2016-04-27 05:09:00 +0000 UTC
I0427 05:09:06.682941       1 manager.go:152] ScrapeMetrics: time: 1.682157776s size: 24
I0427 06:43:38.767146       1 manager.go:79] Scraping metrics start: 2016-04-27 05:09:00 +0000 UTC, end: 2016-04-27 05:09:30 +0000 UTC
I0427 06:43:38.810243       1 manager.go:152] ScrapeMetrics: time: 42.940682ms size: 1
I0427 06:44:05.012989       1 manager.go:79] Scraping metrics start: 2016-04-27 06:43:30 +0000 UTC, end: 2016-04-27 06:44:00 +0000 UTC
I0427 06:44:05.063583       1 manager.go:152] ScrapeMetrics: time: 50.368106ms size: 24
I0427 06:44:35.002038       1 manager.go:79] Scraping metrics start: 2016-04-27 06:44:00 +0000 UTC, end: 2016-04-27 06:44:30 +0000 UTC

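However, the custom endpoints are not being scraped. (I verified this by adding stderr logs to the startup and endpoint handlers of my server; only the server initialization logs show up in the pod's kubectl logs.)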
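
For reference, the kind of server and logging described here can be sketched as below. This is not the contents of the janaka/prometheus-ep:v1 image, which are not shown in this post; it is a stand-in using only the Python standard library. The metric names `test1` and `test2` mirror the annotation, and the values are made up for illustration.

```python
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metrics: path -> (metric name, value). Names mirror the
# "names" lists in the annotation; the values are placeholders.
METRICS = {"/status": ("test1", 1.0), "/metrics": ("test2", 2.0)}


def render(name: str, value: float) -> bytes:
    """Render one gauge in the Prometheus text exposition format."""
    return f"# TYPE {name} gauge\n{name} {value}\n".encode()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Log every request to stderr so that a scrape would be visible
        # in the pod's logs.
        print(f"scrape request: {self.path}", file=sys.stderr, flush=True)
        entry = METRICS.get(self.path)
        if entry is None:
            self.send_error(404)
            return
        body = render(*entry)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9090) -> None:
    """Start the server; blocks until interrupted."""
    print(f"server starting on port {port}", file=sys.stderr, flush=True)
    HTTPServer(("", port), Handler).serve_forever()

# Call serve() to start listening; it is left uncalled here so the
# module can be imported without blocking.
```

With a server like this, seeing only the "server starting" line and no "scrape request" lines in the pod log matches the symptom described above: nothing is requesting the endpoints.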

As I am new to Kubernetes, any help would be appreciated.

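(From what I understood from the proposal as well as this issue, we do not have to run a separate Prometheus collector in the cluster, because cAdvisor should already pull data from the endpoints defined in the pod spec. Is this true, or do I still need a separate Prometheus collector?)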

The custom metrics proposal is out of date.

Please refer to the user guide, which is currently under review.