Prometheus install using Helm - prometheus and alertmanager pods Terminating in a loop

Hello, I have installed Prometheus using Helm.
chart : prometheus-community/kube-prometheus-stack

command :
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring-helm

The Prometheus and Alertmanager pods are not coming up; they appear to be Terminating in a loop.

Karans-MacBook-Pro:ingress-ns karanalang$ kc get all -n monitoring-helm
NAME                                                            READY   STATUS              RESTARTS   AGE
pod/alertmanager-kube-prometheus-stack-alertmanager-0           0/2     ContainerCreating   0          0s
pod/kube-prometheus-stack-grafana-7b8944c5bb-f6wcz              3/3     Running             0          128m
pod/kube-prometheus-stack-kube-state-metrics-596b9c6b55-gjgqc   1/1     Running             0          128m
pod/kube-prometheus-stack-operator-7bb8679c95-dzvfn             1/1     Running             0          128m
pod/kube-prometheus-stack-prometheus-node-exporter-5g7fr        1/1     Running             0          128m
pod/kube-prometheus-stack-prometheus-node-exporter-bpctq        1/1     Running             0          128m
pod/kube-prometheus-stack-prometheus-node-exporter-jdc9p        1/1     Running             0          128m
pod/kube-prometheus-stack-prometheus-node-exporter-tmhss        1/1     Running             0          67m
pod/kube-prometheus-stack-prometheus-node-exporter-xp2h8        1/1     Running             0          67m
pod/prometheus-kube-prometheus-stack-prometheus-0               0/2     Pending             0          0s

NAME                                                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-operated                            ClusterIP   None          <none>        9093/TCP,9094/TCP,9094/UDP   128m
service/kube-prometheus-stack-alertmanager               ClusterIP   10.80.6.51    <none>        9093/TCP                     128m
service/kube-prometheus-stack-grafana                    ClusterIP   10.80.4.232   <none>        80/TCP                       128m
service/kube-prometheus-stack-kube-state-metrics         ClusterIP   10.80.2.206   <none>        8080/TCP                     128m
service/kube-prometheus-stack-operator                   ClusterIP   10.80.3.88    <none>        443/TCP                      128m
service/kube-prometheus-stack-prometheus                 ClusterIP   10.80.14.9    <none>        9090/TCP                     128m
service/kube-prometheus-stack-prometheus-node-exporter   ClusterIP   10.80.0.172   <none>        9100/TCP                     128m
service/prometheus-operated                              ClusterIP   None          <none>        9090/TCP                     128m

NAME                                                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/kube-prometheus-stack-prometheus-node-exporter   5         5         5       5            5           <none>          128m

NAME                                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kube-prometheus-stack-grafana              1/1     1            1           128m
deployment.apps/kube-prometheus-stack-kube-state-metrics   1/1     1            1           128m
deployment.apps/kube-prometheus-stack-operator             1/1     1            1           128m

NAME                                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/kube-prometheus-stack-grafana-7b8944c5bb              1         1         1       128m
replicaset.apps/kube-prometheus-stack-kube-state-metrics-596b9c6b55   1         1         1       128m
replicaset.apps/kube-prometheus-stack-operator-7bb8679c95             1         1         1       128m

NAME                                                               READY   AGE
statefulset.apps/alertmanager-kube-prometheus-stack-alertmanager   0/1     128m
statefulset.apps/prometheus-kube-prometheus-stack-prometheus       0/1     128m

This is what I see when I describe the prometheus pod:

Karans-MacBook-Pro:ingress-ns karanalang$ kc describe pod/prometheus-kube-prometheus-stack-prometheus-0  -n monitoring-helm
Name:           prometheus-kube-prometheus-stack-prometheus-0
Namespace:      monitoring-helm
Priority:       0
Node:           gke-strimzi-prometheus-default-pool-38ca804d-pf4j/
Labels:         app.kubernetes.io/instance=kube-prometheus-stack-prometheus
                app.kubernetes.io/managed-by=prometheus-operator
                app.kubernetes.io/name=prometheus
                app.kubernetes.io/version=2.32.1
                controller-revision-hash=prometheus-kube-prometheus-stack-prometheus-67599f5b6b
                operator.prometheus.io/name=kube-prometheus-stack-prometheus
                operator.prometheus.io/shard=0
                prometheus=kube-prometheus-stack-prometheus
                statefulset.kubernetes.io/pod-name=prometheus-kube-prometheus-stack-prometheus-0
Annotations:    kubectl.kubernetes.io/default-container: prometheus
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  StatefulSet/prometheus-kube-prometheus-stack-prometheus
Init Containers:
  init-config-reloader:
    Image:      quay.io/prometheus-operator/prometheus-config-reloader:v0.53.1
    Port:       8080/TCP
    Host Port:  0/TCP
    Command:
      /bin/prometheus-config-reloader
    Args:
      --watch-interval=0
      --listen-address=:8080
      --config-file=/etc/prometheus/config/prometheus.yaml.gz
      --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
      --watched-dir=/etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_NAME:  prometheus-kube-prometheus-stack-prometheus-0 (v1:metadata.name)
      SHARD:     0
    Mounts:
      /etc/prometheus/config from config (rw)
      /etc/prometheus/config_out from config-out (rw)
      /etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0 from prometheus-kube-prometheus-stack-prometheus-rulefiles-0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kqwfl (ro)
Containers:
  prometheus:
    Image:      quay.io/prometheus/prometheus:v2.32.1
    Port:       9090/TCP
    Host Port:  0/TCP
    Args:
      --web.console.templates=/etc/prometheus/consoles
      --web.console.libraries=/etc/prometheus/console_libraries
      --config.file=/etc/prometheus/config_out/prometheus.env.yaml
      --storage.tsdb.path=/prometheus
      --storage.tsdb.retention.time=10d
      --web.enable-lifecycle
      --web.external-url=http://kube-prometheus-stack-prometheus.monitoring-helm:9090
      --web.route-prefix=/
      --web.config.file=/etc/prometheus/web_config/web-config.yaml
    Readiness:    http-get http://:http-web/-/ready delay=0s timeout=3s period=5s #success=1 #failure=3
    Startup:      http-get http://:http-web/-/ready delay=0s timeout=3s period=15s #success=1 #failure=60
    Environment:  <none>
    Mounts:
      /etc/prometheus/certs from tls-assets (ro)
      /etc/prometheus/config_out from config-out (ro)
      /etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0 from prometheus-kube-prometheus-stack-prometheus-rulefiles-0 (rw)
      /etc/prometheus/web_config/web-config.yaml from web-config (ro,path="web-config.yaml")
      /prometheus from prometheus-kube-prometheus-stack-prometheus-db (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kqwfl (ro)
  config-reloader:
    Image:      quay.io/prometheus-operator/prometheus-config-reloader:v0.53.1
    Port:       8080/TCP
    Host Port:  0/TCP
    Command:
      /bin/prometheus-config-reloader
    Args:
      --listen-address=:8080
      --reload-url=http://127.0.0.1:9090/-/reload
      --config-file=/etc/prometheus/config/prometheus.yaml.gz
      --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
      --watched-dir=/etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_NAME:  prometheus-kube-prometheus-stack-prometheus-0 (v1:metadata.name)
      SHARD:     0
    Mounts:
      /etc/prometheus/config from config (rw)
      /etc/prometheus/config_out from config-out (rw)
      /etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0 from prometheus-kube-prometheus-stack-prometheus-rulefiles-0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kqwfl (ro)
Conditions:
  Type           Status
  PodScheduled   True 
Volumes:
  config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-kube-prometheus-stack-prometheus
    Optional:    false
  tls-assets:
    Type:                Projected (a volume that contains injected data from multiple sources)
    SecretName:          prometheus-kube-prometheus-stack-prometheus-tls-assets-0
    SecretOptionalName:  <nil>
  config-out:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  prometheus-kube-prometheus-stack-prometheus-rulefiles-0:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prometheus-kube-prometheus-stack-prometheus-rulefiles-0
    Optional:  false
  web-config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-kube-prometheus-stack-prometheus-web-config
    Optional:    false
  prometheus-kube-prometheus-stack-prometheus-db:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-kqwfl:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  1s    default-scheduler  Successfully assigned monitoring-helm/prometheus-kube-prometheus-stack-prometheus-0 to gke-strimzi-prometheus-default-pool-38ca804d-pf4j

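The describe output does not show any errors, so I am not sure why the pod is not being created. Is it waiting for a PVC to be allocated?

Also, here are the logs of the operator deployment; I do not see any specific errors in them: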

level=info ts=2022-01-18T00:49:13.758142527Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:13.762728184Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=info ts=2022-01-18T00:49:13.881836251Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:13.886328208Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=info ts=2022-01-18T00:49:13.934270881Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:13.937757934Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=info ts=2022-01-18T00:49:13.993406489Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:14.002621303Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=info ts=2022-01-18T00:49:14.022094422Z caller=operator.go:1218 component=prometheusoperator key=monitoring/prometheus msg="sync prometheus"
level=info ts=2022-01-18T00:49:14.120963578Z caller=operator.go:1218 component=prometheusoperator key=monitoring1/prometheus msg="sync prometheus"
level=info ts=2022-01-18T00:49:14.205667044Z caller=operator.go:1218 component=prometheusoperator key=monitoring/prometheus msg="sync prometheus"
level=info ts=2022-01-18T00:49:14.280614242Z caller=operator.go:1218 component=prometheusoperator key=monitoring1/prometheus msg="sync prometheus"
level=info ts=2022-01-18T00:49:14.354453215Z caller=operator.go:1218 component=prometheusoperator key=monitoring1/prometheus msg="sync prometheus"
level=info ts=2022-01-18T00:49:14.484993407Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:14.490822619Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=info ts=2022-01-18T00:49:14.562051245Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:14.566919392Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=info ts=2022-01-18T00:49:14.674276432Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:14.679550548Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=info ts=2022-01-18T00:49:14.711888417Z caller=operator.go:741 component=alertmanageroperator key=monitoring-helm/kube-prometheus-stack-alertmanager msg="sync alertmanager"
level=warn ts=2022-01-18T00:49:14.715543191Z caller=amcfg.go:1326 component=alertmanageroperator alertmanager=kube-prometheus-stack-alertmanager namespace=monitoring-helm receiver="null" msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"

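How do I debug/fix this? TIA!

Update: I installed the chart (prometheus-community/kube-prometheus-stack) in a brand-new cluster, and the pods came up. In my current cluster (where I see the issue), I had also installed prometheus/grafana in a different namespace using the files from the strimzi GitHub repo (https://github.com/strimzi/strimzi-kafka-operator/tree/main/examples/metrics/prometheus-install). Could this be causing the issue?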

You can check whether another instance of Prometheus is running on the same cluster:

kubectl get pods --all-namespaces --selector=app.kubernetes.io/name=prometheus

Uninstalling the older/previous installation should resolve the issue.
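The operator log in the question also gives a hint: each sync line carries a `key=<namespace>/<name>` field, and there the keys span `monitoring`, `monitoring1` and `monitoring-helm`. As a rough sketch, the helper below pulls the distinct namespaces out of such log lines so you can see what a given operator is reconciling. The function name `synced_namespaces` is made up for illustration, and the idea that overlapping operators are fighting over the same resources is an assumption, not something the logs prove.

```shell
# Hypothetical helper: read prometheus-operator log lines on stdin and
# print the distinct namespaces found in their key=<namespace>/<name> fields.
synced_namespaces() {
  grep -o 'key=[^ ]*' | cut -d= -f2 | cut -d/ -f1 | sort -u
}

# Usage against the operator from the question (needs cluster access):
# kubectl logs deployment/kube-prometheus-stack-operator -n monitoring-helm | synced_namespaces
```

If the output lists namespaces that belong to an older install, that is a good reason to look for a second operator there before uninstalling anything.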