Why is the Prometheus Operator not able to start?
I am trying to create Prometheus with the operator in a fresh k8s cluster.
I use the following files:
- I create the namespace monitoring
- I apply this file, which works fine:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
  namespace: monitoring
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: prometheus-operator
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      priorityClassName: "operator-critical"
      tolerations:
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoSchedule"
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoExecute"
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --logtostderr=true
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        - --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.29.0
        image: quay.io/coreos/prometheus-operator:v0.29.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
      nodeSelector:
      serviceAccountName: prometheus-operator
Now I want to apply this file (CRD):
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    prometheus: prometheus
spec:
  replica: 1
  priorityClassName: "operator-critical"
  serviceAccountName: prometheus
  nodeSelector:
    worker.garden.sapcloud.io/group: operator
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      role: observeable
  tolerations:
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoSchedule"
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoExecute"
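Before that, I created the CRDs from
https://github.com/coreos/prometheus-operator/tree/master/example/prometheus-operator-crd
The problem is that the pods are not able to start (0/2), see the picture below. What could be the problem? Please advise.
Update
When I look at the events of the Prometheus operator, I see the following error, reported by replicaset-controller:
creating: pods "prometheus-operator-6944778645-" is forbidden: no PriorityClass with name operator-critical was found
Any idea?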
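Prometheus and Alertmanager pods need persistent volumes to store their data. Make sure those PVs exist and are bound to the respective pods. Alternatively, you can make those pods ephemeral. It should work.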
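As a rough illustration of the persistent option, the sketch below requests a volume through the `storage.volumeClaimTemplate` field of the Prometheus resource. The storage class name `standard` and the `10Gi` size are assumptions made for illustration and are not taken from the question; leaving out the `storage` section altogether should give the ephemeral variant, since the operator then falls back to an emptyDir volume.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  serviceAccountName: prometheus
  storage:
    volumeClaimTemplate:
      spec:
        # Hypothetical storage class; use one that exists in your cluster.
        storageClassName: standard
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            # Hypothetical size, chosen only for the example.
            storage: 10Gi
```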
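You are trying to reference the operator-critical priority class, which does not exist in your cluster. Priority classes determine the priority of pods, which the scheduler uses for scheduling order and preemption.
To fix the issue, you can either remove the explicit priority class (priorityClassName: "operator-critical") from both files or create the operator-critical class: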
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: operator-critical
value: 1000000
globalDefault: false
description: "Critical operator workloads"
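On newer clusters the `scheduling.k8s.io/v1beta1` API is no longer served (it was removed in Kubernetes 1.22), so the same class would have to be declared against `scheduling.k8s.io/v1`. A minimal sketch, assuming the same name and value as in the manifest above; a PriorityClass is cluster-scoped, so it takes no namespace:

```yaml
# Same priority class, declared against the stable scheduling API.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: operator-critical   # must match priorityClassName in the pod specs
value: 1000000              # higher value means higher priority
globalDefault: false        # only pods that name this class get it
description: "Critical operator workloads"
```

Once the class exists, the replicaset-controller should be able to create the operator pods on its next retry, without the Deployment having to be re-applied.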