cert-manager fails to create TLS Secret

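We are running into a cert-manager problem related to TLS certificates. When we deploy an application with helm, with all the required annotations in place, the TLS Secret is not created.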

The Ingress shows the following error:

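In the Kubernetes dashboard, when I follow the link from the Ingress resource to the details of its Secret, I get a 404 error. The Ingress resource was created with a reference to a Secret that does not exist.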

Looking at the cert-manager namespace, I see what appear to be two deployments:

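The year-old one never seems to trigger at all. The four-month-old one does seem to trigger, but it fails continuously with the errors below. It is also shown in red because of the failed, evicted pods, even though it is Running.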

E1209 19:46:28.340854       1 reflector.go:138] external/io_k8s_client_go/tools/cache/reflector.go:167: Failed to watch *v1.Certificate: failed to list *v1.Certificate: the server could not find the requested resource (get certificates.cert-manager.io)
E1209 19:46:41.726643       1 reflector.go:138] external/io_k8s_client_go/tools/cache/reflector.go:167: Failed to watch *v1.CertificateRequest: failed to list *v1.CertificateRequest: the server could not find the requested resource (get certificaterequests.cert-manager.io)
E1209 19:46:42.842402       1 reflector.go:138] external/io_k8s_client_go/tools/cache/reflector.go:167: Failed to watch *v1.Issuer: failed to list *v1.Issuer: the server could not find the requested resource (get issuers.cert-manager.io)
E1209 19:46:43.581019       1 reflector.go:138] external/io_k8s_client_go/tools/cache/reflector.go:167: Failed to watch *v1.ClusterIssuer: failed to list *v1.ClusterIssuer: the server could not find the requested resource (get clusterissuers.cert-manager.io)
E1209 19:46:51.205804       1 reflector.go:138] external/io_k8s_client_go/tools/cache/reflector.go:167: Failed to watch *v1.Challenge: failed to list *v1.Challenge: the server could not find the requested resource (get challenges.acme.cert-manager.io)
E1209 19:46:51.819486       1 reflector.go:138] external/io_k8s_client_go/tools/cache/reflector.go:167: Failed to watch *v1.Order: failed to list *v1.Order: the server could not find the requested resource (get orders.acme.cert-manager.io)
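
Each of these lines names an API resource that the API server reports it cannot find, which typically points to the cert-manager CRDs being absent or not serving the v1 version this controller asks for. As a rough sketch, the distinct resource names can be pulled out of a saved log like this; the helper name `missing_resources` and the file name `controller.log` are made up for illustration and are not from the original post:

```shell
# missing_resources: read controller log lines on stdin and print each
# distinct resource named in "(get <resource>)", one per line, sorted.
# Hypothetical helper, for illustration only.
missing_resources() {
  grep -o '(get [^)]*)' | sed -e 's/^(get //' -e 's/)$//' | sort -u
}

# Example (assumes the log was saved to controller.log):
#   missing_resources < controller.log
# The printed names can then be compared with the CRDs actually installed:
#   kubectl get crd | grep cert-manager.io
```

If a name printed by the helper is missing from the `kubectl get crd` output, that would be consistent with the controller and the installed CRDs coming from different cert-manager versions.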

This cluster is new to me. I found a total of 473 evicted pods in the cert-manager namespace (I have the urge to clean them up, and I should, right?)
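
On the evicted pods: a minimal sketch for counting them before cleaning up, assuming the default `kubectl get pods` column layout (NAME, READY, STATUS, ...). The helper name `count_evicted` is hypothetical:

```shell
# count_evicted: read `kubectl get pods` output on stdin and print how many
# rows have STATUS "Evicted" (third column). The first row is treated as
# the header and skipped, so do not combine this with --no-headers.
count_evicted() {
  awk 'NR > 1 && $3 == "Evicted" { n++ } END { print n + 0 }'
}

# Example against a live cluster:
#   kubectl get pods -n cert-manager | count_evicted
# Evicted pods are left in phase Failed, so they can be removed with:
#   kubectl delete pods -n cert-manager --field-selector=status.phase=Failed
```

Note that the `--field-selector=status.phase=Failed` delete removes every Failed pod in the namespace, not only the evicted ones, so it is worth checking the list first.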

Anyway, the main problem is that cert-manager is not creating the TLS Secret. I can provide plenty of additional information, but everything else looks fine.

In the end, I fixed this by scaling the replica set of cert-manager1. That caused the pods to restart, and everything worked.
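
For reference, a sketch of that kind of restart, assuming the pods are managed by a Deployment (scaling the Deployment rather than its ReplicaSet directly, since the Deployment controller would otherwise undo the change). The function name, the `KUBECTL` override, and the deployment name in the example are assumptions, not taken from the cluster above:

```shell
# restart_by_scaling NAMESPACE DEPLOYMENT: scale a deployment to zero
# replicas and back to one so that its pods are recreated.
# Set KUBECTL="echo kubectl" to print the commands instead of running them.
restart_by_scaling() {
  ns=$1
  deploy=$2
  ${KUBECTL:-kubectl} scale deployment "$deploy" -n "$ns" --replicas=0 &&
  ${KUBECTL:-kubectl} scale deployment "$deploy" -n "$ns" --replicas=1
}

# Example (the deployment name here is a placeholder):
#   restart_by_scaling cert-manager my-cert-manager-deployment
# On recent kubectl versions a similar effect is available with:
#   kubectl rollout restart deployment <name> -n cert-manager
```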

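However, after further investigation, having two deployments must be wrong, because they are very likely conflicting with each other. Part of the solution is to delete one of them so that only one is running. In addition, update to the latest version: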

NAME                    NAMESPACE       REVISION    UPDATED                                     STATUS      CHART              APP VERSION
cert-manager            cert-manager    1           2020-07-24 18:49:08.541265133 +0530 +0530   deployed    cert-manager-v0.15.v0.15.1    
cert-manager1-113bce5f  cert-manager    1           2021-08-03 14:54:54.351112781 +0000 UTC     deployed    cert-manager-0.1.101.4.2
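
To check for this kind of duplicate in a listing like the one above, here is a small sketch that filters `helm list -A` output by namespace. The helper name `releases_in_namespace` is hypothetical, and the sketch assumes the Helm 3 column layout shown (NAME first, NAMESPACE second):

```shell
# releases_in_namespace NAMESPACE: read `helm list -A` output on stdin and
# print the names of the releases installed in NAMESPACE.
# The first row is treated as the header and skipped.
releases_in_namespace() {
  awk -v ns="$1" 'NR > 1 && $2 == ns { print $1 }'
}

# Example:
#   helm list -A | releases_in_namespace cert-manager
# If more than one cert-manager release is printed, one can be removed with:
#   helm uninstall <release> -n cert-manager
```

Which release to keep depends on the cluster. Before uninstalling anything, it is worth confirming which release owns the CRDs and which pods are actually serving certificates.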