Empty custom metrics resource list with Prometheus adapter

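I have minikube installed on my Windows machine. My application exposes a custom metric, 'http_requests_total'. I first installed the Prometheus operator and configured it to scrape the custom metric. I can see the custom metric in the Prometheus dashboard.

The problem occurs when installing the Prometheus Adapter. I installed the adapter with the following helm command: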

helm install my-release prometheus-community/prometheus-adapter

I then ran the following command to edit the config map and add a rule for my custom metric:

kubectl edit cm my-release-prometheus-adapter

I added the following section to this configmap:

    - seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "_per_second"
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)

After doing this, when I run the following command:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

I do not see my custom metric listed:

{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"custom.metrics.k8s.io/v1beta1","resources":[]}

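I have read several similar questions posted on Stack Overflow, but none seem to have a working answer. I tried changing the Prometheus URL for my adapter, without success. What am I missing? Here are the logs from the adapter pod: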

I0706 22:41:15.240788       1 adapter.go:101] successfully using in-cluster auth
E0706 22:41:15.274396       1 provider.go:227] unable to update list of all metrics: unable to fetch metrics for query "{namespace!=\"\",__name__!~\"^container_.*\"}": Get "http://prometheus-kube-prometheus-prometheus.prom.svc.cluster.local:9090/api/v1/series?match%5B%5D=%7Bnamespace%21%3D%22%22%2C__name__%21~%22%5Econtainer_.%2A%22%7D&start=1625611215.271": dial tcp: lookup prometheus-kube-prometheus-prometheus.prom.svc.cluster.local on 10.96.0.10:53: no such host
I0706 22:41:15.636300       1 serving.go:325] Generated self-signed cert (/tmp/cert/apiserver.crt, /tmp/cert/apiserver.key)
I0706 22:41:15.637101       1 dynamic_serving_content.go:111] Loaded a new cert/key pair for "serving-cert::/tmp/cert/apiserver.crt::/tmp/cert/apiserver.key"
I0706 22:41:16.192597       1 requestheader_controller.go:244] Loaded a new request header values for RequestHeaderAuthRequestController
I0706 22:41:16.193884       1 config.go:655] Not requested to run hook priority-and-fairness-config-consumer
I0706 22:41:16.249643       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0706 22:41:16.249689       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0706 22:41:16.249807       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0706 22:41:16.249829       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0706 22:41:16.249928       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0706 22:41:16.249977       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0706 22:41:16.250288       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/cert/apiserver.crt::/tmp/cert/apiserver.key
I0706 22:41:16.250322       1 reflector.go:219] Starting reflector *v1.ConfigMap (12h0m0s) from k8s.io/apiserver/pkg/authentication/request/headerrequest/requestheader_controller.go:172
I0706 22:41:16.250390       1 reflector.go:255] Listing and watching *v1.ConfigMap from k8s.io/apiserver/pkg/authentication/request/headerrequest/requestheader_controller.go:172
I0706 22:41:16.250322       1 reflector.go:219] Starting reflector *v1.ConfigMap (12h0m0s) from k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206
I0706 22:41:16.250536       1 reflector.go:255] Listing and watching *v1.ConfigMap from k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206
I0706 22:41:16.250555       1 reflector.go:219] Starting reflector *v1.ConfigMap (12h0m0s) from k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206
I0706 22:41:16.250608       1 reflector.go:255] Listing and watching *v1.ConfigMap from k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206
I0706 22:41:16.250577       1 tlsconfig.go:200] loaded serving cert ["serving-cert::/tmp/cert/apiserver.crt::/tmp/cert/apiserver.key"]: "localhost@1625611275" [serving] validServingFor=[127.0.0.1,localhost,localhost] issuer="localhost-ca@1625611275" (2021-07-06 21:41:15 +0000 UTC to 2022-07-06 21:41:15 +0000 UTC (now=2021-07-06 22:41:16.2505337 +0000 UTC))
I0706 22:41:16.250950       1 named_certificates.go:53] loaded SNI cert [0/"self-signed loopback"]: "apiserver-loopback-client@1625611276" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1625611275" (2021-07-06 21:41:15 +0000 UTC to 2022-07-06 21:41:15 +0000 UTC (now=2021-07-06 22:41:16.2509375 +0000 UTC))
I0706 22:41:16.251003       1 secure_serving.go:197] Serving securely on [::]:6443
I0706 22:41:16.251321       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0706 22:41:16.349970       1 shared_informer.go:270] caches populated
I0706 22:41:16.350023       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I0706 22:41:16.350129       1 shared_informer.go:270] caches populated
I0706 22:41:16.350142       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0706 22:41:16.350164       1 shared_informer.go:270] caches populated
I0706 22:41:16.350186       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0706 22:41:16.350698       1 tlsconfig.go:178] loaded client CA [0/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "minikubeCA" [client,serving] issuer="<self>" (2021-06-07 20:39:52 +0000 UTC to 2031-06-06 20:39:52 +0000 UTC (now=2021-07-06 22:41:16.3506808 +0000 UTC))
I0706 22:41:16.350796       1 tlsconfig.go:178] loaded client CA [1/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "front-proxy-ca" [] issuer="<self>" (2021-06-08 20:39:52 +0000 UTC to 2031-06-06 20:39:52 +0000 UTC (now=2021-07-06 22:41:16.3507595 +0000 UTC))
I0706 22:41:16.351138       1 tlsconfig.go:200] loaded serving cert ["serving-cert::/tmp/cert/apiserver.crt::/tmp/cert/apiserver.key"]: "localhost@1625611275" [serving] validServingFor=[127.0.0.1,localhost,localhost] issuer="localhost-ca@1625611275" (2021-07-06 21:41:15 +0000 UTC to 2022-07-06 21:41:15 +0000 UTC (now=2021-07-06 22:41:16.3511272 +0000 UTC))
I0706 22:41:16.351349       1 named_certificates.go:53] loaded SNI cert [0/"self-signed loopback"]: "apiserver-loopback-client@1625611276" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1625611275" (2021-07-06 21:41:15 +0000 UTC to 2022-07-06 21:41:15 +0000 UTC (now=2021-07-06 22:41:16.3513364 +0000 UTC))
I0706 22:41:26.255679       1 reflector.go:530] k8s.io/apiserver/pkg/authentication/request/headerrequest/requestheader_controller.go:172: Watch close - *v1.ConfigMap total 0 items received
I0706 22:41:26.256047       1 reflector.go:530] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206: Watch close - *v1.ConfigMap total 0 items received
I0706 22:41:26.257735       1 reflector.go:530] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206: Watch close - *v1.ConfigMap total 0 items received

Here are the arguments in the adapter deployment:

Args:
      /adapter
      --secure-port=6443
      --cert-dir=/tmp/cert
      --logtostderr=true
      --prometheus-url=http://prometheus.default.svc:9090
      --metrics-relist-interval=1m
      --v=6
      --config=/etc/adapter/config.yaml

Here are the services present on my local cluster:

NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                     ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   2d6h
kubernetes                                ClusterIP   10.96.0.1        <none>        443/TCP                      28d
my-release-prometheus-adapter             ClusterIP   10.106.147.10    <none>        443/TCP                      23m
prometheus-grafana                        ClusterIP   10.108.144.246   <none>        80/TCP                       2d6h
prometheus-kube-prometheus-alertmanager   ClusterIP   10.108.19.130    <none>        9093/TCP                     2d6h
prometheus-kube-prometheus-operator       ClusterIP   10.108.170.84    <none>        443/TCP                      2d6h
prometheus-kube-prometheus-prometheus     ClusterIP   10.98.67.168     <none>        9090/TCP                     2d6h
prometheus-kube-state-metrics             ClusterIP   10.97.252.48     <none>        8080/TCP                     2d6h
prometheus-operated                       ClusterIP   None             <none>        9090/TCP                     2d6h
prometheus-prometheus-node-exporter       ClusterIP   10.107.194.240   <none>        9100/TCP                     2d6h
sample-app                                ClusterIP   10.97.86.241     <none>        80/TCP                       4h10m

At the beginning of the adapter pod logs, I see the following error line:

unable to update list of all metrics: unable to fetch metrics for query "{__name__=~\"^container_.*\",container!=\"POD\",namespace!=\"\",pod!=\"\"}": Get "http://prometheus.default.svc:9090/api/v1/series?match%5B%5D=%7B__name__%3D~%22%5Econtainer_.%2A%22%2Ccontainer%21%3D%22POD%22%2Cnamespace%21%3D%22%22%2Cpod%21%3D%22%22%7D&start=1625614091.754": dial tcp: lookup prometheus.default.svc on 10.96.0.10:53: no such host

prometheus-adapter cannot connect to your Prometheus instance, as this line in the logs indicates:

E0706 22:41:15.274396       1 provider.go:227] unable to update list of all metrics: unable to fetch metrics for query "{namespace!=\"\",__name__!~\"^container_.*\"}": Get "http://prometheus-kube-prometheus-prometheus.prom.svc.cluster.local:9090/api/v1/series?match%5B%5D=%7Bnamespace%21%3D%22%22%2C__name__%21~%22%5Econtainer_.%2A%22%7D&start=1625611215.271": dial tcp: lookup prometheus-kube-prometheus-prometheus.prom.svc.cluster.local on 10.96.0.10:53: no such host

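Although you have a service named prometheus-kube-prometheus-prometheus, the URL expects that service to exist in the prom namespace. The default value shown in your deployment arguments, http://prometheus.default.svc:9090, fails for a similar reason: your service list contains no service named prometheus. You should edit the Prometheus URL (in the same way you added the series query) so that it points at the correct address. If this service is in the default namespace, you can specify prometheus-kube-prometheus-prometheus.default.svc.cluster.local; otherwise it will be prometheus-kube-prometheus-prometheus.<NAMESPACE>.svc.cluster.local.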
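
As a rough sketch of that fix, the snippet below builds the in-cluster address from a service name and namespace and prints a helm upgrade command that would apply it. It assumes the service list in the question was taken from the default namespace (the built-in kubernetes service appears in it), and that the chart exposes the URL and port as the prometheus.url and prometheus.port values; running helm show values prometheus-community/prometheus-adapter should confirm this for your chart version. The service_url helper is purely illustrative and not part of any tool.

```shell
#!/bin/sh
# Illustrative helper (not part of kubectl or helm): build the in-cluster
# URL of a Service, following the pattern <service>.<namespace>.svc.cluster.local.
service_url() {
  svc="$1"
  ns="$2"
  printf 'http://%s.%s.svc.cluster.local\n' "$svc" "$ns"
}

# Assumption: the Prometheus service from the question lives in "default".
# Replace the namespace if "kubectl get svc -A" shows it elsewhere.
PROM_URL="$(service_url prometheus-kube-prometheus-prometheus default)"
PROM_PORT=9090

# Print the upgrade command instead of running it, so it can be reviewed first.
# Assumption: the chart reads the address from prometheus.url and prometheus.port.
echo "helm upgrade my-release prometheus-community/prometheus-adapter" \
     "--reuse-values" \
     "--set prometheus.url=${PROM_URL}" \
     "--set prometheus.port=${PROM_PORT}"
```

Be aware that a helm upgrade may overwrite a rule that was added by hand with kubectl edit, so you may need to re-add it or move it into the chart values. Once the adapter can reach Prometheus, kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 should stop returning an empty resources list, provided the configured rules match series that Prometheus actually has.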