Metrics server not working in Kubernetes cluster
I have set up a Kubernetes cluster on Ubuntu 18+, and it works fine. I have now added the metrics server, but it is not working.
# kubectl get apiservices
v1beta1.metrics.k8s.io kube-system/metrics-server False (FailedDiscoveryCheck) 2d1h
# kubectl describe apiservice v1beta1.metrics.k8s.io
Message: failing or missing response from https://10.106.145.77:443/apis/metrics.k8s.io/v1beta1: Get https://10.106.145.77:443/apis/metrics.k8s.io/v1beta1: dial tcp 10.106.145.77:443: connect: connection refused
Reason: FailedDiscoveryCheck
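I don't know why the connection is refused. Can anyone help me or give me some hints for solving this?
I added RBAC to the cluster; could that be the problem? I have tried many solutions from the web, but none of them helped. I also tried editing the metrics-server deployment YAML to add args with insecure TLS, but that did not help.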
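For readers reproducing this, the failing state shown above can be picked out of a long `kubectl get apiservices` listing with a small filter. This is only a sketch: the function name `unavailable_apiservices` is made up for illustration, and it assumes the default column layout (NAME, SERVICE, AVAILABLE, AGE) seen in the output above.

```shell
# Read `kubectl get apiservices` output on stdin and print the names of
# the entries whose AVAILABLE column (third field) starts with "False".
# Assumes the default NAME / SERVICE / AVAILABLE / AGE column layout.
unavailable_apiservices() {
  awk '$3 ~ /^False/ { print $1 }'
}

# Usage against a live cluster (needs kubectl and cluster access):
#   kubectl get apiservices --no-headers | unavailable_apiservices
```

An entry printed here, such as `v1beta1.metrics.k8s.io`, can then be inspected with `kubectl describe apiservice` as in the question.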
Additional details
# kubectl get all --all-namespaces | grep -i metrics-server
kube-system pod/metrics-server-7f55d7ccbb-th9w9 1/1 Running 0 21s
kube-system service/metrics-server ClusterIP 10.106.145.77 <none> 443/TCP 26m
kube-system deployment.apps/metrics-server 1/1 1 1 25m
kube-system replicaset.apps/metrics-server-694db48df9 0 0 0 25m
kube-system replicaset.apps/metrics-server-7f55d7ccbb 1 1 1 21s
# kubectl get -n kube-system deployment metrics-server -o yaml | grep -i args -A 10
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        imagePullPolicy: Always
        name: metrics-server
        ports:
        - containerPort: 4443
          hostPort: 4443
YAML file:
# kubectl get -n kube-system deployment metrics-server -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  creationTimestamp: "2020-01-29T14:49:06Z"
  generation: 2
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
  resourceVersion: "951901"
  selfLink: /apis/apps/v1/namespaces/kube-system/deployments/metrics-server
  uid: 54137f75-af0a-45a5-a508-f4c38ee9ea25
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: metrics-server
      name: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        imagePullPolicy: Always
        name: metrics-server
        ports:
        - containerPort: 4443
          hostPort: 4443
          name: main-port
          protocol: TCP
        resources: {}
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      dnsPolicy: ClusterFirst
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/os: linux
        kubernetes.io/arch: amd64
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: metrics-server
      serviceAccountName: metrics-server
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: tmp-dir
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2020-01-29T14:49:15Z"
    lastUpdateTime: "2020-01-29T14:49:15Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2020-01-29T14:49:06Z"
    lastUpdateTime: "2020-01-29T15:14:26Z"
    message: ReplicaSet "metrics-server-7f55d7ccbb" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 2
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
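Find the args section and try the following. Adding a command entry with /metrics-server solved the problem for me, together with updating the preferred address types and then restarting kubelet.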
args:
- --cert-dir=/tmp
- --secure-port=4443
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
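After a change like this, one way to confirm whether it took effect is to read the Available condition of the APIService from the question. The following is a rough sketch rather than an official procedure: the function name `metrics_api_available` is invented here, and it assumes `kubectl` is on the PATH and pointed at the affected cluster.

```shell
# Succeed (exit 0) only when the v1beta1.metrics.k8s.io APIService
# reports its "Available" condition as "True".
metrics_api_available() {
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
  [ "$status" = "True" ]
}

# Usage against a live cluster (needs kubectl and cluster access):
#   kubectl -n kube-system rollout status deployment/metrics-server
#   metrics_api_available && kubectl top nodes
```

If the function fails, `kubectl describe apiservice v1beta1.metrics.k8s.io` shows the current reason and message, as in the question.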
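I ran into a similar issue, with a 503 Service Unavailable error message, and managed to resolve it with the following change.
In your components.yaml file, make sure the certificate path is correct:
--cert-dir=/etc/kubernetes/pki
kubectl apply -f components.yaml
(This changes the certificate path from the default, /tmp. It may depend on your setup, so try to find out where the PKI certificates live on your machine. Mine are in /etc/kubernetes/pki.)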
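Before changing the path, it can help to check which value the manifest currently carries. A small sketch, assuming the flag is written in the `--cert-dir=<path>` form used in this thread; the function name `cert_dir_from_manifest` is hypothetical.

```shell
# Read a manifest on stdin and print the value of every --cert-dir=<path>
# flag it contains (one per line). Prints nothing if the flag is absent.
cert_dir_from_manifest() {
  sed -n 's/.*--cert-dir=\([^[:space:]]*\).*/\1/p'
}

# Usage with a local manifest or the live deployment:
#   cert_dir_from_manifest < components.yaml
#   kubectl get -n kube-system deployment metrics-server -o yaml | cert_dir_from_manifest
```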