Why is prometheus operator not able to start

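I'm trying to create Prometheus with the operator in a fresh k8s cluster. I use the following files:

  1. I create a namespace named monitoring
  2. I apply this file, which works fine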

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
  namespace: monitoring
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: prometheus-operator
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      priorityClassName: "operator-critical"
      tolerations:
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoSchedule"
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoExecute"
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --logtostderr=true
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        - --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.29.0
        image: quay.io/coreos/prometheus-operator:v0.29.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
      nodeSelector:
      serviceAccountName: prometheus-operator

Now I want to apply this file (a Prometheus custom resource backed by the CRD)

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
  labels: 
    prometheus: prometheus
spec:
  replica: 1
  priorityClassName: "operator-critical"
  serviceAccountName: prometheus
  nodeSelector:
        worker.garden.sapcloud.io/group: operator
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      role: observeable
  tolerations:
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoSchedule"
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoExecute"

Before that, I created the CRDs from here:

https://github.com/coreos/prometheus-operator/tree/master/example/prometheus-operator-crd

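The problem is that the pods are not able to start (0/2), see the picture below. What could be the problem? Please advise.

Update

When I look at the events of the prometheus-operator, I see the following error: creating: pods "prometheus-operator-6944778645-" is forbidden: no PriorityClass with name operator-critical was found (reported by replicaset-controller). Any idea?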

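Prometheus and Alertmanager pods need persistent volumes to store their data. Make sure those PVs exist and are bound to the respective pods. Alternatively, you can make those pods ephemeral. It should work.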
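As a rough sketch of what that could look like on the Prometheus resource from the question: the storage section below requests a persistent volume through a claim template. The storage class name `standard` and the `10Gi` size are assumptions for illustration only; replace them with values that exist in your cluster.

```yaml
# Sketch only: adds persistent storage to the Prometheus resource from the question.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: standard   # assumption: use a StorageClass that exists in your cluster
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi            # assumption: illustrative size
    # Ephemeral alternative: drop volumeClaimTemplate and use an emptyDir instead,
    # in which case the data is lost whenever the pod is rescheduled:
    # emptyDir: {}
```

As far as I know, the operator falls back to an emptyDir when no storage section is given at all, so the ephemeral variant is mostly a matter of not declaring a claim.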

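You are trying to reference the operator-critical priority class, which does not exist in your cluster. A priority class determines the scheduling priority of pods, including which pods may be preempted when resources are scarce.

To fix the issue, you can either remove the explicit priority class (priorityClassName: "operator-critical") from both files or create the operator-critical class: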

apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: operator-critical
value: 1000000
globalDefault: false
description: "Critical operator workloads"
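
If your cluster is newer, the same object can probably be written against the stable API version instead: scheduling.k8s.io/v1 has been available since Kubernetes 1.14, and the v1beta1 version was removed in 1.22. A minimal sketch with the same values as above:

```yaml
# Sketch only: same PriorityClass, using the stable scheduling.k8s.io/v1 API.
# PriorityClass is cluster-scoped, so it takes no namespace.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: operator-critical
value: 1000000
globalDefault: false
description: "Critical operator workloads"
```

Once the class exists, the replicaset-controller should retry creating the pods on its own, so the operator Deployment normally does not need to be re-applied.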