Prometheus is not compatible with Kubernetes v1.16

Question

I installed the stable/prometheus helm chart, with some minor changes that I proposed in helm/charts#17268 to make it compatible with Kubernetes v1.16.

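After installation, none of the Kubernetes Grafana dashboards show correct values. I am using dashboard 8769 (https://grafana.com/grafana/dashboards/8769), which provides a lot of information about CPU, memory, network, and so on. This dashboard works fine on older k8s versions, but on v1.16 it shows no results. I also tried a few other dashboards at random (8588, 6879, 10551), but they either show only the requested resources per pod and not the live usage, or they show nothing at all.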

These dashboards work by sending PromQL queries to Prometheus and displaying the results. For example, this is the PromQL query for CPU usage from dashboard 8769:

sum (rate (container_cpu_usage_seconds_total{id!="/",namespace=~"$Namespace",pod_name=~"^$Deployment.*$"}[1m])) by (pod_name)

I don't know whether I have to change the PromQL or whether the problem lies somewhere else.

Answer 1

Try installing it this way. The new CRDs have some problems, so I use the old CRDs:

kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/alertmanager.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/prometheus.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/prometheusrule.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/servicemonitor.crd.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.32/example/prometheus-operator-crd/podmonitor.crd.yaml

helm install --name prometheus --namespace monitoring  stable/prometheus-operator --set prometheusOperator.createCustomResource=false

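Make sure the CRDs do not already exist. You can delete them with the command below (note that --all removes every CRD in the cluster, not only the Prometheus Operator ones):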

kubectl delete crd --all

Answer 2

Kubernetes 1.16 removes the labels pod_name and container_name from cAdvisor metrics, duplicates of pod and container.

You need to change pod_name -> pod and container_name -> container in the Grafana dashboard JSON model.
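
For dashboards with many panels, the rename can be scripted instead of edited by hand. The following is a minimal Python sketch, assuming the dashboard has been exported as a JSON model and loaded with json.load; the names rename_labels and migrate_dashboard are hypothetical helpers written for this illustration and are not part of Grafana or Prometheus. It rewrites the old label names only where they appear as whole identifiers inside string values (queries, legend formats, variable definitions) and leaves JSON keys untouched.

```python
import re

# Old cAdvisor label names removed in Kubernetes 1.16, mapped to their replacements.
RENAMES = {"pod_name": "pod", "container_name": "container"}

# Match the old names only as whole identifiers, so that a longer name
# such as "my_pod_name_x" is left alone.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def rename_labels(text):
    """Replace the old label names in a single string (hypothetical helper)."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], text)


def migrate_dashboard(model):
    """Recursively rewrite every string value in a parsed dashboard JSON model.

    Dictionary keys are kept as they are; only values are rewritten.
    """
    if isinstance(model, str):
        return rename_labels(model)
    if isinstance(model, list):
        return [migrate_dashboard(item) for item in model]
    if isinstance(model, dict):
        return {key: migrate_dashboard(value) for key, value in model.items()}
    return model
```

Applied to the CPU query from the question, rename_labels yields:

sum (rate (container_cpu_usage_seconds_total{id!="/",namespace=~"$Namespace",pod=~"^$Deployment.*$"}[1m])) by (pod)

It is worth reviewing the result before importing it back into Grafana, since a blanket rename also touches any panel title or description that happens to mention the old label names.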

grafana

kubernetes

prometheus

promql