Istio Prometheus pod 处于 CrashLoopBackOff 状态
Istio Prometheus pod in CrashLoopBackOff State
我正在尝试为 website 上提供的 bookinfo 示例设置 Istio (1.5.4)。我使用了演示配置文件。但是在验证 istio 安装时它失败了,因为 Prometheus pod 进入了 CrashLoopBackOff 状态。
NAME READY STATUS RESTARTS AGE
grafana-5f6f8cbf75-psk78 1/1 Running 0 21m
istio-egressgateway-7f9f45c966-g7k9j 1/1 Running 0 21m
istio-ingressgateway-968d69c8b-bhxk5 1/1 Running 0 21m
istio-tracing-9dd6c4f7c-7fm79 1/1 Running 0 21m
istiod-86884c8c45-sw96x 1/1 Running 0 21m
kiali-869c6894c5-wqgjb 1/1 Running 0 21m
prometheus-589c44dbfc-xkwmj 1/2 CrashLoopBackOff 8 21m
prometheus pod 的日志:
level=warn ts=2020-05-15T09:07:53.113Z caller=main.go:283 deprecation_notice="'storage.tsdb.retention' flag is deprecated use 'storage.tsdb.retention.time' instead."
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:330 msg="Starting Prometheus" version="(version=2.15.1, branch=HEAD, revision=8744510c6391d3ef46d8294a7e1f46e57407ab13)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:331 build_context="(go=go1.13.5, user=root@4b1e33c71b9d, date=20191225-01:04:15)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:332 host_details="(Linux 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 prometheus-589c44dbfc-xkwmj (none))"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:333 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:334 vm_limits="(soft=unlimited, hard=unlimited)"
level=error ts=2020-05-15T09:07:53.157Z caller=query_logger.go:107 component=activeQueryTracker msg="Failed to create directory for logging active queries"
level=error ts=2020-05-15T09:07:53.157Z caller=query_logger.go:85 component=activeQueryTracker msg="Error opening query log file" file=data/queries.active err="open data/queries.active: no such file or directory"
panic: Unable to create mmap-ed active query log
goroutine 1 [running]:
github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x24dda5b, 0x5, 0x14, 0x2c62100, 0xc0005f63c0, 0x2c62100)
/app/promql/query_logger.go:115 +0x48c
main.main()
/app/cmd/prometheus/main.go:362 +0x5229
描述 pod 输出:
Name: prometheus-589c44dbfc-xkwmj
Namespace: istio-system
Priority: 0
Node: inspiron-7577/192.168.0.9
Start Time: Fri, 15 May 2020 14:21:14 +0530
Labels: app=prometheus
pod-template-hash=589c44dbfc
release=istio
Annotations: sidecar.istio.io/inject: false
Status: Running
IP: 172.17.0.11
IPs:
IP: 172.17.0.11
Controlled By: ReplicaSet/prometheus-589c44dbfc
Containers:
prometheus:
Container ID: docker://b6820a000ab67a5ce31d3a38f6f0d510bd150794b2792147fc17ef8f730c03bb
Image: docker.io/prom/prometheus:v2.15.1
Image ID: docker-pullable://prom/prometheus@sha256:169b743ceb4452266915272f9c3409d36972e41cb52f3f28644e6c0609fc54e6
Port: 9090/TCP
Host Port: 0/TCP
Args:
--storage.tsdb.retention=6h
--config.file=/etc/prometheus/prometheus.yml
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Fri, 15 May 2020 14:37:50 +0530
Finished: Fri, 15 May 2020 14:37:53 +0530
Ready: False
Restart Count: 8
Requests:
cpu: 10m
Liveness: http-get http://:9090/-/healthy delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9090/-/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/istio-certs from istio-certs (rw)
/etc/prometheus from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-cgqbc (ro)
istio-proxy:
Container ID: docker://fa756c93510b6f402d7d88c31a5f5f066d4c254590eab70886e7835e7d3871be
Image: docker.io/istio/proxyv2:1.5.4
Image ID: docker-pullable://istio/proxyv2@sha256:e16e2801b7fd93154e8fcb5f4e2fb1240d73349d425b8be90691d48e8b9bb944
Port: 15090/TCP
Host Port: 0/TCP
Args:
proxy
sidecar
--domain
$(POD_NAMESPACE).svc.cluster.local
--configPath
/etc/istio/proxy
--binaryPath
/usr/local/bin/envoy
--serviceCluster
istio-proxy-prometheus
--drainDuration
45s
--parentShutdownDuration
1m0s
--discoveryAddress
istio-pilot.istio-system.svc:15012
--proxyLogLevel=warning
--proxyComponentLogLevel=misc:error
--connectTimeout
10s
--proxyAdminPort
15000
--controlPlaneAuthPolicy
NONE
--dnsRefreshRate
300s
--statusPort
15020
--trust-domain=cluster.local
--controlPlaneBootstrap=false
State: Running
Started: Fri, 15 May 2020 14:21:31 +0530
Ready: True
Restart Count: 0
Readiness: http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
Environment:
OUTPUT_CERTS: /etc/istio-certs
JWT_POLICY: first-party-jwt
PILOT_CERT_PROVIDER: istiod
CA_ADDR: istio-pilot.istio-system.svc:15012
POD_NAME: prometheus-589c44dbfc-xkwmj (v1:metadata.name)
POD_NAMESPACE: istio-system (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
SERVICE_ACCOUNT: (v1:spec.serviceAccountName)
HOST_IP: (v1:status.hostIP)
ISTIO_META_POD_NAME: prometheus-589c44dbfc-xkwmj (v1:metadata.name)
ISTIO_META_CONFIG_NAMESPACE: istio-system (v1:metadata.namespace)
ISTIO_META_MESH_ID: cluster.local
ISTIO_META_CLUSTER_ID: Kubernetes
Mounts:
/etc/istio-certs/ from istio-certs (rw)
/etc/istio/proxy from istio-envoy (rw)
/var/run/secrets/istio from istiod-ca-cert (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-cgqbc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus
Optional: false
istio-certs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istiod-ca-cert:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: istio-ca-root-cert
Optional: false
prometheus-token-cgqbc:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-token-cgqbc
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned istio-system/prometheus-589c44dbfc-xkwmj to inspiron-7577
Warning FailedMount 17m kubelet, inspiron-7577 MountVolume.SetUp failed for volume "prometheus-token-cgqbc" : failed to sync secret cache: timed out waiting for the condition
Warning FailedMount 17m kubelet, inspiron-7577 MountVolume.SetUp failed for volume "config-volume" : failed to sync configmap cache: timed out waiting for the condition
Normal Pulled 17m kubelet, inspiron-7577 Container image "docker.io/istio/proxyv2:1.5.4" already present on machine
Normal Created 17m kubelet, inspiron-7577 Created container istio-proxy
Normal Started 17m kubelet, inspiron-7577 Started container istio-proxy
Warning Unhealthy 17m kubelet, inspiron-7577 Readiness probe failed: HTTP probe failed with statuscode: 503
Normal Pulled 16m (x4 over 17m) kubelet, inspiron-7577 Container image "docker.io/prom/prometheus:v2.15.1" already present on machine
Normal Created 16m (x4 over 17m) kubelet, inspiron-7577 Created container prometheus
Normal Started 16m (x4 over 17m) kubelet, inspiron-7577 Started container prometheus
Warning BackOff 2m24s (x72 over 17m) kubelet, inspiron-7577 Back-off restarting failed container
无法创建日志目录。请提供任何想法。
由于 istio 1.5.4 刚刚发布,使用 istioctl manifest apply 安装的 minikube 上的 prometheus 存在一些问题。
我在 gcp 上检查过,一切正常。
作为一种变通方法,您可以使用 istio operator,我和 OP 对此进行了测试,正如他在评论中提到的那样,它正在运行。
Thanks a lot @jt97! It did work.
安装 istio operator 的步骤
To install the Istio demo configuration profile using the operator, run the following command:
kubectl create ns istio-system
kubectl apply -f - <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
namespace: istio-system
name: example-istiocontrolplane
spec:
profile: demo
EOF
Could you tell me why the normal installation failed?
正如我在评论中提到的,我还不知道。如果我找到原因,我会更新这个问题。
我正在尝试为 website 上提供的 bookinfo 示例设置 Istio (1.5.4)。我使用了演示配置文件。但是在验证 istio 安装时它失败了,因为 Prometheus pod 进入了 CrashLoopBackOff 状态。
NAME READY STATUS RESTARTS AGE
grafana-5f6f8cbf75-psk78 1/1 Running 0 21m
istio-egressgateway-7f9f45c966-g7k9j 1/1 Running 0 21m
istio-ingressgateway-968d69c8b-bhxk5 1/1 Running 0 21m
istio-tracing-9dd6c4f7c-7fm79 1/1 Running 0 21m
istiod-86884c8c45-sw96x 1/1 Running 0 21m
kiali-869c6894c5-wqgjb 1/1 Running 0 21m
prometheus-589c44dbfc-xkwmj 1/2 CrashLoopBackOff 8 21m
prometheus pod 的日志:
level=warn ts=2020-05-15T09:07:53.113Z caller=main.go:283 deprecation_notice="'storage.tsdb.retention' flag is deprecated use 'storage.tsdb.retention.time' instead."
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:330 msg="Starting Prometheus" version="(version=2.15.1, branch=HEAD, revision=8744510c6391d3ef46d8294a7e1f46e57407ab13)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:331 build_context="(go=go1.13.5, user=root@4b1e33c71b9d, date=20191225-01:04:15)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:332 host_details="(Linux 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 prometheus-589c44dbfc-xkwmj (none))"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:333 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:334 vm_limits="(soft=unlimited, hard=unlimited)"
level=error ts=2020-05-15T09:07:53.157Z caller=query_logger.go:107 component=activeQueryTracker msg="Failed to create directory for logging active queries"
level=error ts=2020-05-15T09:07:53.157Z caller=query_logger.go:85 component=activeQueryTracker msg="Error opening query log file" file=data/queries.active err="open data/queries.active: no such file or directory"
panic: Unable to create mmap-ed active query log
goroutine 1 [running]:
github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x24dda5b, 0x5, 0x14, 0x2c62100, 0xc0005f63c0, 0x2c62100)
/app/promql/query_logger.go:115 +0x48c
main.main()
/app/cmd/prometheus/main.go:362 +0x5229
描述 pod 输出:
Name: prometheus-589c44dbfc-xkwmj
Namespace: istio-system
Priority: 0
Node: inspiron-7577/192.168.0.9
Start Time: Fri, 15 May 2020 14:21:14 +0530
Labels: app=prometheus
pod-template-hash=589c44dbfc
release=istio
Annotations: sidecar.istio.io/inject: false
Status: Running
IP: 172.17.0.11
IPs:
IP: 172.17.0.11
Controlled By: ReplicaSet/prometheus-589c44dbfc
Containers:
prometheus:
Container ID: docker://b6820a000ab67a5ce31d3a38f6f0d510bd150794b2792147fc17ef8f730c03bb
Image: docker.io/prom/prometheus:v2.15.1
Image ID: docker-pullable://prom/prometheus@sha256:169b743ceb4452266915272f9c3409d36972e41cb52f3f28644e6c0609fc54e6
Port: 9090/TCP
Host Port: 0/TCP
Args:
--storage.tsdb.retention=6h
--config.file=/etc/prometheus/prometheus.yml
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Fri, 15 May 2020 14:37:50 +0530
Finished: Fri, 15 May 2020 14:37:53 +0530
Ready: False
Restart Count: 8
Requests:
cpu: 10m
Liveness: http-get http://:9090/-/healthy delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9090/-/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/istio-certs from istio-certs (rw)
/etc/prometheus from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-cgqbc (ro)
istio-proxy:
Container ID: docker://fa756c93510b6f402d7d88c31a5f5f066d4c254590eab70886e7835e7d3871be
Image: docker.io/istio/proxyv2:1.5.4
Image ID: docker-pullable://istio/proxyv2@sha256:e16e2801b7fd93154e8fcb5f4e2fb1240d73349d425b8be90691d48e8b9bb944
Port: 15090/TCP
Host Port: 0/TCP
Args:
proxy
sidecar
--domain
$(POD_NAMESPACE).svc.cluster.local
--configPath
/etc/istio/proxy
--binaryPath
/usr/local/bin/envoy
--serviceCluster
istio-proxy-prometheus
--drainDuration
45s
--parentShutdownDuration
1m0s
--discoveryAddress
istio-pilot.istio-system.svc:15012
--proxyLogLevel=warning
--proxyComponentLogLevel=misc:error
--connectTimeout
10s
--proxyAdminPort
15000
--controlPlaneAuthPolicy
NONE
--dnsRefreshRate
300s
--statusPort
15020
--trust-domain=cluster.local
--controlPlaneBootstrap=false
State: Running
Started: Fri, 15 May 2020 14:21:31 +0530
Ready: True
Restart Count: 0
Readiness: http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
Environment:
OUTPUT_CERTS: /etc/istio-certs
JWT_POLICY: first-party-jwt
PILOT_CERT_PROVIDER: istiod
CA_ADDR: istio-pilot.istio-system.svc:15012
POD_NAME: prometheus-589c44dbfc-xkwmj (v1:metadata.name)
POD_NAMESPACE: istio-system (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
SERVICE_ACCOUNT: (v1:spec.serviceAccountName)
HOST_IP: (v1:status.hostIP)
ISTIO_META_POD_NAME: prometheus-589c44dbfc-xkwmj (v1:metadata.name)
ISTIO_META_CONFIG_NAMESPACE: istio-system (v1:metadata.namespace)
ISTIO_META_MESH_ID: cluster.local
ISTIO_META_CLUSTER_ID: Kubernetes
Mounts:
/etc/istio-certs/ from istio-certs (rw)
/etc/istio/proxy from istio-envoy (rw)
/var/run/secrets/istio from istiod-ca-cert (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-cgqbc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus
Optional: false
istio-certs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istiod-ca-cert:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: istio-ca-root-cert
Optional: false
prometheus-token-cgqbc:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-token-cgqbc
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned istio-system/prometheus-589c44dbfc-xkwmj to inspiron-7577
Warning FailedMount 17m kubelet, inspiron-7577 MountVolume.SetUp failed for volume "prometheus-token-cgqbc" : failed to sync secret cache: timed out waiting for the condition
Warning FailedMount 17m kubelet, inspiron-7577 MountVolume.SetUp failed for volume "config-volume" : failed to sync configmap cache: timed out waiting for the condition
Normal Pulled 17m kubelet, inspiron-7577 Container image "docker.io/istio/proxyv2:1.5.4" already present on machine
Normal Created 17m kubelet, inspiron-7577 Created container istio-proxy
Normal Started 17m kubelet, inspiron-7577 Started container istio-proxy
Warning Unhealthy 17m kubelet, inspiron-7577 Readiness probe failed: HTTP probe failed with statuscode: 503
Normal Pulled 16m (x4 over 17m) kubelet, inspiron-7577 Container image "docker.io/prom/prometheus:v2.15.1" already present on machine
Normal Created 16m (x4 over 17m) kubelet, inspiron-7577 Created container prometheus
Normal Started 16m (x4 over 17m) kubelet, inspiron-7577 Started container prometheus
Warning BackOff 2m24s (x72 over 17m) kubelet, inspiron-7577 Back-off restarting failed container
无法创建日志目录。请提供任何想法。
由于 istio 1.5.4 刚刚发布,使用 istioctl manifest apply 安装的 minikube 上的 prometheus 存在一些问题。
我在 gcp 上检查过,一切正常。
作为一种变通方法,您可以使用 istio operator,我和 OP 对此进行了测试,正如他在评论中提到的那样,它正在运行。
Thanks a lot @jt97! It did work.
安装 istio operator 的步骤
To install the Istio demo configuration profile using the operator, run the following command:
kubectl create ns istio-system
kubectl apply -f - <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
namespace: istio-system
name: example-istiocontrolplane
spec:
profile: demo
EOF
Could you tell me why the normal installation failed?
正如我在评论中提到的,我还不知道。如果我找到原因,我会更新这个问题。