Debugging cert-manager certificate creation failure on AKS
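I'm deploying cert-manager on Azure AKS and trying to have it request a Let's Encrypt certificate. It fails with a certificate signed by unknown authority error, and I can't troubleshoot it any further.
I'm not sure whether this is a problem with trusting the LE server, the tunnelfront pod, or an internal AKS self-generated CA. So my questions are:
- How can I force cert-manager to show more debug information about the certificate it does not trust?
- Is this a common problem with a known solution?
- What steps should I take to debug the issue further?
I opened an issue on the jetstack/cert-manager GitHub page, but nobody answered, so I came here.
The original report follows: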
The certificate is not created. The following errors are reported:
The certificate:
Error from server: conversion webhook for &{map[apiVersion:cert-manager.io/v1alpha2 kind:Certificate metadata:map[creationTimestamp:2020-05-13T17:30:48Z generation:1 name:xxx-tls namespace:test ownerReferences:[map[apiVersion:extensions/v1beta1 blockOwnerDeletion:true controller:true kind:Ingress name:xxx-ingress uid:6d73b182-bbce-4834-aee2-414d2b3aa802]] uid:d40bc037-aef7-4139-868f-bd615a423b38] spec:map[dnsNames:[xxx.test.domain.com] issuerRef:map[group:cert-manager.io kind:ClusterIssuer name:letsencrypt-prod] secretName:xxx-tls] status:map[conditions:[map[lastTransitionTime:2020-05-13T18:55:31Z message:Waiting for CertificateRequest "xxx-tls-1403681706" to complete reason:InProgress status:False type:Ready]]]]} failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: x509: certificate signed by unknown authority
The cert-manager-webhook container:
cert-manager 2020/05/15 14:22:58 http: TLS handshake error from 10.20.0.19:35350: remote error: tls: bad certificate
where 10.20.0.19 is the IP of the tunnelfront pod.
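The two messages above look like the same failed handshake seen from both ends: the API server (connecting through tunnelfront) rejects the webhook's serving certificate, and the webhook logs the resulting alert. One way to narrow this down is to compare the CA bundle the API server trusts for the conversion webhook with the CA the webhook actually uses. The following is a minimal Python sketch of that comparison, not cert-manager's own tooling. It assumes the CRD uses the apiextensions.k8s.io/v1beta1 layout (`spec.conversion.webhookClientConfig.caBundle`) and that the webhook's CA sits under a `ca.crt` key in a secret; the helper names `pem_fingerprint` and `ca_bundle_matches` are hypothetical.

```python
import base64
import hashlib


def pem_fingerprint(b64_pem: str) -> str:
    """SHA-256 of a base64-encoded PEM blob, ignoring whitespace differences."""
    pem = base64.b64decode(b64_pem).decode("ascii")
    normalized = "".join(pem.split())  # drop newlines, spaces, carriage returns
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def ca_bundle_matches(crd: dict, secret: dict) -> bool:
    """Return True if the CRD's conversion caBundle equals the secret's ca.crt.

    crd    -- parsed JSON of a CustomResourceDefinition (v1beta1 layout assumed)
    secret -- parsed JSON of the secret assumed to hold the webhook CA
    """
    conversion = crd.get("spec", {}).get("conversion", {})
    trusted = conversion.get("webhookClientConfig", {}).get("caBundle")
    actual = secret.get("data", {}).get("ca.crt")
    if not trusted or not actual:
        # A missing caBundle would mean nothing was injected into the CRD.
        return False
    return pem_fingerprint(trusted) == pem_fingerprint(actual)
```

The inputs would be the parsed output of `kubectl get crd certificates.cert-manager.io -o json` and of `kubectl get secret ... -n cert-manager -o json` for whichever secret holds the webhook CA in the installed version. A plain equality check is a simplification: a bundle containing several certificates would need a per-certificate comparison.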
Debugging along the lines of https://cert-manager.io/docs/faq/acme/ sort of "fails" at the kubectl describe order... step, because kubectl describe certificaterequest... returns the CSR contents with the error (as above), but not the order ID.
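When the CertificateRequest does not show an order ID, one alternative is to list the Orders in the namespace and look for one that points back at the request. The sketch below assumes that cert-manager sets an ownerReference of kind CertificateRequest on each Order it creates; the function name `orders_for_request` is hypothetical, and the input is the parsed output of `kubectl get orders -n test -o json`.

```python
def orders_for_request(order_list: dict, request_name: str) -> list:
    """Return names of Orders whose ownerReferences name the given CertificateRequest."""
    names = []
    for order in order_list.get("items", []):
        meta = order.get("metadata", {})
        for ref in meta.get("ownerReferences", []):
            # Assumption: the owning CertificateRequest is recorded here.
            if ref.get("kind") == "CertificateRequest" and ref.get("name") == request_name:
                names.append(meta.get("name"))
                break
    return names
```

For the request in the error above, this would be called with the name xxx-tls-1403681706. An empty result would suggest that no Order was ever created for it, which would point at a failure before the ACME flow started rather than at Let's Encrypt itself.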
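Environment details:
- Kubernetes version: 1.15.10
- Cloud-provider/provisioner: Azure (AKS)
- cert-manager version: 0.14.3
- Install method: static manifests (see below) + cluster issuer (see below) + regular CRDs (not legacy)
Cluster issuer: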
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
  namespace: cert-manager
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: x
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - dns01:
        azuredns:
          clientID: x
          clientSecretSecretRef:
            name: cert-manager-stage
            key: CLIENT_SECRET
          subscriptionID: x
          tenantID: x
          resourceGroupName: dns-stage
          hostedZoneName: x
Manifest:
imagePullSecrets: []
isOpenshift: false
priorityClassName: ""
rbac:
  create: true
podSecurityPolicy:
  enabled: false
logLevel: 2
leaderElection:
  namespace: "kube-system"
replicaCount: 1
strategy: {}
image:
  repository: quay.io/jetstack/cert-manager-controller
  pullPolicy: IfNotPresent
  tag: v0.14.3
clusterResourceNamespace: ""
serviceAccount:
  create: true
  name:
  annotations: {}
extraArgs: []
extraEnv: []
resources: {}
securityContext:
  enabled: false
  fsGroup: 1001
  runAsUser: 1001
podAnnotations: {}
podLabels: {}
nodeSelector: {}
ingressShim:
  defaultIssuerName: letsencrypt-prod
  defaultIssuerKind: ClusterIssuer
prometheus:
  enabled: true
  servicemonitor:
    enabled: false
    prometheusInstance: default
    targetPort: 9402
    path: /metrics
    interval: 60s
    scrapeTimeout: 30s
    labels: {}
affinity: {}
tolerations: []
webhook:
  enabled: true
  replicaCount: 1
  strategy: {}
  podAnnotations: {}
  extraArgs: []
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  image:
    repository: quay.io/jetstack/cert-manager-webhook
    pullPolicy: IfNotPresent
    tag: v0.14.3
  injectAPIServerCA: true
  securePort: 10250
cainjector:
  replicaCount: 1
  strategy: {}
  podAnnotations: {}
  extraArgs: []
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  image:
    repository: quay.io/jetstack/cert-manager-cainjector
    pullPolicy: IfNotPresent
    tag: v0.14.3
It seems v0.14.3 has some kind of bug. The problem does not occur with v0.15.0.