Nginx-ingress-controller fails to start after AKS upgrade to v1.22
We upgraded our Kubernetes cluster from v1.21 to v1.22. After doing so we noticed that the pods of our nginx-ingress-controller deployment fail to start with the following error message:
pkg/mod/k8s.io/client-go@v0.18.5/tools/cache/reflector.go:125: Failed to list *v1beta1.Ingress: the server could not find the requested resource
We found out that this issue is already tracked here: https://github.com/bitnami/charts/issues/7264
Because Azure does not allow downgrading the cluster back to 1.21, could you please help us fix the nginx-ingress-controller deployment? Since we are not very familiar with helm, could you be specific about what exactly should be done and from where (local machine, Azure CLI, etc.)?
Here is the current yaml of our deployment:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-ingress-controller
  namespace: ingress
  uid: 575c7699-1fd5-413e-a81d-b183f8822324
  resourceVersion: '166482672'
  generation: 16
  creationTimestamp: '2020-10-10T10:20:07Z'
  labels:
    app: nginx-ingress
    app.kubernetes.io/component: controller
    app.kubernetes.io/managed-by: Helm
    chart: nginx-ingress-1.41.1
    heritage: Helm
    release: nginx-ingress
  annotations:
    deployment.kubernetes.io/revision: '2'
    meta.helm.sh/release-name: nginx-ingress
    meta.helm.sh/release-namespace: ingress
  managedFields:
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:replicas: {}
      subresource: scale
    - manager: Go-http-client
      operation: Update
      apiVersion: apps/v1
      time: '2020-10-10T10:20:07Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app: {}
            f:app.kubernetes.io/component: {}
            f:app.kubernetes.io/managed-by: {}
            f:chart: {}
            f:heritage: {}
            f:release: {}
        f:spec:
          f:progressDeadlineSeconds: {}
          f:revisionHistoryLimit: {}
          f:selector: {}
          f:strategy:
            f:rollingUpdate:
              .: {}
              f:maxSurge: {}
              f:maxUnavailable: {}
            f:type: {}
          f:template:
            f:metadata:
              f:labels:
                .: {}
                f:app: {}
                f:app.kubernetes.io/component: {}
                f:component: {}
                f:release: {}
            f:spec:
              f:containers:
                k:{"name":"nginx-ingress-controller"}:
                  .: {}
                  f:args: {}
                  f:env:
                    .: {}
                    k:{"name":"POD_NAME"}:
                      .: {}
                      f:name: {}
                      f:valueFrom:
                        .: {}
                        f:fieldRef: {}
                    k:{"name":"POD_NAMESPACE"}:
                      .: {}
                      f:name: {}
                      f:valueFrom:
                        .: {}
                        f:fieldRef: {}
                  f:image: {}
                  f:imagePullPolicy: {}
                  f:livenessProbe:
                    .: {}
                    f:failureThreshold: {}
                    f:httpGet:
                      .: {}
                      f:path: {}
                      f:port: {}
                      f:scheme: {}
                    f:initialDelaySeconds: {}
                    f:periodSeconds: {}
                    f:successThreshold: {}
                    f:timeoutSeconds: {}
                  f:name: {}
                  f:ports:
                    .: {}
                    k:{"containerPort":80,"protocol":"TCP"}:
                      .: {}
                      f:containerPort: {}
                      f:name: {}
                      f:protocol: {}
                    k:{"containerPort":443,"protocol":"TCP"}:
                      .: {}
                      f:containerPort: {}
                      f:name: {}
                      f:protocol: {}
                  f:readinessProbe:
                    .: {}
                    f:failureThreshold: {}
                    f:httpGet:
                      .: {}
                      f:path: {}
                      f:port: {}
                      f:scheme: {}
                    f:initialDelaySeconds: {}
                    f:periodSeconds: {}
                    f:successThreshold: {}
                    f:timeoutSeconds: {}
                  f:resources:
                    .: {}
                    f:limits: {}
                    f:requests: {}
                  f:securityContext:
                    .: {}
                    f:allowPrivilegeEscalation: {}
                    f:capabilities:
                      .: {}
                      f:add: {}
                      f:drop: {}
                    f:runAsUser: {}
                  f:terminationMessagePath: {}
                  f:terminationMessagePolicy: {}
              f:dnsPolicy: {}
              f:restartPolicy: {}
              f:schedulerName: {}
              f:securityContext: {}
              f:serviceAccount: {}
              f:serviceAccountName: {}
              f:terminationGracePeriodSeconds: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      time: '2022-01-24T01:23:22Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:conditions:
            .: {}
            k:{"type":"Available"}:
              .: {}
              f:type: {}
            k:{"type":"Progressing"}:
              .: {}
              f:type: {}
    - manager: Mozilla
      operation: Update
      apiVersion: apps/v1
      time: '2022-01-28T23:18:41Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:template:
            f:spec:
              f:containers:
                k:{"name":"nginx-ingress-controller"}:
                  f:resources:
                    f:limits:
                      f:cpu: {}
                      f:memory: {}
                    f:requests:
                      f:cpu: {}
                      f:memory: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      time: '2022-01-28T23:29:49Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            f:deployment.kubernetes.io/revision: {}
        f:status:
          f:conditions:
            k:{"type":"Available"}:
              f:lastTransitionTime: {}
              f:lastUpdateTime: {}
              f:message: {}
              f:reason: {}
              f:status: {}
            k:{"type":"Progressing"}:
              f:lastTransitionTime: {}
              f:lastUpdateTime: {}
              f:message: {}
              f:reason: {}
              f:status: {}
          f:observedGeneration: {}
          f:replicas: {}
          f:unavailableReplicas: {}
          f:updatedReplicas: {}
      subresource: status
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-ingress
      app.kubernetes.io/component: controller
      release: nginx-ingress
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx-ingress
        app.kubernetes.io/component: controller
        component: controller
        release: nginx-ingress
    spec:
      containers:
        - name: nginx-ingress-controller
          image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
          args:
            - /nginx-ingress-controller
            - '--default-backend-service=ingress/nginx-ingress-default-backend'
            - '--election-id=ingress-controller-leader'
            - '--ingress-class=nginx'
            - '--configmap=ingress/nginx-ingress-controller'
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          resources:
            limits:
              cpu: 300m
              memory: 512Mi
            requests:
              cpu: 200m
              memory: 256Mi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            timeoutSeconds: 1
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            timeoutSeconds: 1
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
            runAsUser: 101
            allowPrivilegeEscalation: true
      restartPolicy: Always
      terminationGracePeriodSeconds: 60
      dnsPolicy: ClusterFirst
      serviceAccountName: nginx-ingress
      serviceAccount: nginx-ingress
      securityContext: {}
      schedulerName: default-scheduler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 16
  replicas: 3
  updatedReplicas: 2
  unavailableReplicas: 3
  conditions:
    - type: Available
      status: 'False'
      lastUpdateTime: '2022-01-28T22:58:07Z'
      lastTransitionTime: '2022-01-28T22:58:07Z'
      reason: MinimumReplicasUnavailable
      message: Deployment does not have minimum availability.
    - type: Progressing
      status: 'False'
      lastUpdateTime: '2022-01-28T23:29:49Z'
      lastTransitionTime: '2022-01-28T23:29:49Z'
      reason: ProgressDeadlineExceeded
      message: >-
        ReplicaSet "nginx-ingress-controller-59d9f94677" has timed out
        progressing.
Kubernetes 1.22 is only supported by NGINX Ingress Controller 1.0.0 and higher, see https://github.com/kubernetes/ingress-nginx#support-versions-table
You need to upgrade your nginx-ingress-controller Bitnami Helm chart to version 9.0.0 in your Chart.yaml and then run a helm upgrade nginx-ingress-controller bitnami/nginx-ingress-controller.
You should also update your ingress controller regularly, and especially this v0.34.1 version, which is very old, because the ingress is usually the only entry point into your cluster that is exposed externally.
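A minimal sketch of those commands, assuming the release was originally installed from the Bitnami chart and keeping the release name and ingress namespace shown in the deployment above (adjust both if yours differ):
# add/refresh the Bitnami chart repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# upgrade the existing release to a chart version that ships controller >= 1.0.0
helm upgrade nginx-ingress bitnami/nginx-ingress-controller --namespace ingress --version 9.0.0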
The answer from @Philip Welz is of course correct. Upgrading the ingress controller was necessary because the v1beta1 Ingress API version was removed in Kubernetes v1.22. But this was not the only issue we faced, so I decided to write a "very, very short" guide of how we finally ended up with a healthy, running cluster (5 days later), in the hope that it saves others the struggle.
1. Upgrading the nginx-ingress-controller version in the YAML file
Here we simply changed the version in the yaml file from:
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
to
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v1.1.1
After this a new pod was spawned running v1.1.1. It started fine and stayed healthy. Unfortunately, that did not bring our microservices back online. I now know this was most likely because the existing ingress yaml files needed some changes to become compatible with the new version of the ingress controller. So at this point, go straight to step 2 (two headers below).
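If you prefer not to edit the manifest by hand, a one-liner like this should achieve the same image bump (a sketch, assuming the deployment and namespace names shown above):
# swap the controller image on the existing deployment; kubectl triggers a rolling update
kubectl --namespace ingress set image deployment/nginx-ingress-controller nginx-ingress-controller=us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v1.1.1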
Don't perform this step yet; only do it if step 2 fails for you: reinstalling the nginx-ingress-controller
We decided that in this situation we would reinstall the controller from scratch, following the official Microsoft documentation: https://docs.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli. Note that this will most likely change the external IP address of your ingress controller. In our case the easiest way was to delete the whole ingress namespace:
kubectl delete namespace ingress
Unfortunately, this did not remove the ingress class, so an additional command was needed:
kubectl delete ingressclass nginx --all-namespaces
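To confirm what is actually left behind, listing the cluster-scoped ingress classes before and after the delete helps:
kubectl get ingressclass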
Then install the new controller:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx --create-namespace --namespace ingress
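The LoadBalancer service of the freshly installed controller can take a minute or two to receive its new external IP, and you will need that address in the next step. Watching the services in the ingress namespace shows it as soon as it is assigned:
# -w keeps watching until the EXTERNAL-IP column switches from <pending> to a real address
kubectl get services --namespace ingress -o wide -w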
If you reinstalled the nginx-ingress-controller, or the IP address changed after the upgrade in step 1: update your network security group, load balancer and domain DNS
In your AKS resource group there should be a resource of type Network security group. It contains inbound and outbound security rules (as far as I understand, it acts as a firewall). There should be a default network security group that is managed automatically by Kubernetes, and the IP addresses in it should be refreshed automatically.
Unfortunately, we also had an additional custom one and had to update its rules there manually.
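For a custom NSG, a rough Azure CLI sketch of that manual update could look like this (the resource group, NSG and rule names below are placeholders for whatever you created, and <new-ingress-ip> is the controller's new external IP):
# inspect the existing rules first
az network nsg rule list --resource-group my-aks-rg --nsg-name my-custom-nsg -o table
# point the relevant rule at the new ingress IP
az network nsg rule update --resource-group my-aks-rg --nsg-name my-custom-nsg --name allow-https-to-ingress --destination-address-prefixes <new-ingress-ip>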
In the same resource group there should be a resource of type Load balancer. In the Frontend IP configuration tab, double-check that the IP address reflects your new IP address. As a bonus, you can also double-check in the Backend pools tab that the addresses there match your internal node IPs.
Finally, don't forget to adjust your domain DNS records.
2. Upgrade your ingress yaml configuration files to match the syntax changes
It took us some time to figure out a working template, but installing the helloworld application from the Microsoft tutorial mentioned above actually helped us a lot. We started from this:
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: hello-world-ingress
  namespace: services
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: 'false'
    nginx.ingress.kubernetes.io/use-regex: 'true'
spec:
  rules:
    - http:
        paths:
          - path: /hello-world-one(/|$)(.*)
            pathType: Prefix
            backend:
              service:
                name: aks-helloworld-one
                port:
                  number: 80
After introducing the changes step by step we finally ended up with the yaml below. But I'm pretty sure the real problem was that we were missing the nginx.ingress.kubernetes.io/use-regex: 'true' entry:
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: example-api
  namespace: services
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/configuration-snippet: |
      more_set_headers "X-Forwarded-By: example-api";
    nginx.ingress.kubernetes.io/rewrite-target: /example-api
    nginx.ingress.kubernetes.io/ssl-redirect: 'true'
    nginx.ingress.kubernetes.io/use-regex: 'true'
spec:
  tls:
    - hosts:
        - services.example.com
      secretName: tls-secret
  rules:
    - host: services.example.com
      http:
        paths:
          - path: /example-api
            pathType: ImplementationSpecific
            backend:
              service:
                name: example-api
                port:
                  number: 80
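To roll such a change out and confirm that the new controller accepted it, something along these lines should work (the file name is just a placeholder for wherever you keep the manifest):
# apply the updated v1 ingress and check the controller's events for it
kubectl apply -f example-api-ingress.yaml
kubectl describe ingress example-api --namespace services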
Just in case someone wants to install the helloworld application for testing purposes, the yamls look as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld-one
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-helloworld-one
  template:
    metadata:
      labels:
        app: aks-helloworld-one
    spec:
      containers:
        - name: aks-helloworld-one
          image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
          ports:
            - containerPort: 80
          env:
            - name: TITLE
              value: "Welcome to Azure Kubernetes Service (AKS)"
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld-one
spec:
  type: ClusterIP
  ports:
    - port: 80
  selector:
    app: aks-helloworld-one
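Deploying it and smoke-testing it through the controller could look roughly like this (the file name and the external IP are placeholders; the namespace matches the hello-world-ingress above):
# create the test deployment and service next to the ingress
kubectl apply -f aks-helloworld-one.yaml --namespace services
# then hit it through the ingress path defined above
curl http://<EXTERNAL-IP>/hello-world-one/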
3. Take care of your other crashed applications...
Another application that crashed in our cluster was cert-manager. It was on version 1.0.1, so, first, we upgraded it to version 1.1.1:
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --namespace cert-manager --version 1.1 cert-manager jetstack/cert-manager
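You can watch the replacement pod come up while the upgrade rolls out:
# wait until the new cert-manager pods report Running/Ready
kubectl get pods --namespace cert-manager --watch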
This created a brand new, healthy pod. We were happy and decided to stay on v1.1, because we were a bit afraid of the extra steps required when upgrading to higher versions (see the bottom of this page: https://cert-manager.io/docs/installation/upgrading/).
The cluster was finally fixed now. Right?
4. ...but make sure to check the compatibility charts!
Well... now we know that cert-manager is compatible with Kubernetes v1.22 only starting from version 1.5. We were unlucky enough that exactly that night our SSL certificates crossed the 30-days-before-expiry threshold, so cert-manager decided to renew them! The operation failed and cert-manager crashed. Kubernetes fell back to the "Kubernetes fake certificate". Because the certificate was invalid, browsers blocked the traffic and the web pages went down again.
The fix was to upgrade to 1.5 and upgrade the CRDs at the same time:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.5.4/cert-manager.crds.yaml
helm upgrade --namespace cert-manager --version 1.5 cert-manager jetstack/cert-manager
After this, a fresh instance of cert-manager successfully refreshed our certificates. The cluster was saved once again.
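A quick way to confirm that the renewals went through is to check that every certificate reports Ready again:
# all cert-manager Certificate resources should eventually show READY=True
kubectl get certificates --all-namespaces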
If you need to force a renewal, you can have a look at this issue: https://github.com/jetstack/cert-manager/issues/2641
@ajcann suggested there to temporarily add the renewBefore property to the certificates:
kubectl get certs --no-headers=true | awk '{print $1}' | xargs -n 1 kubectl patch certificate --patch '
- op: replace
  path: /spec/renewBefore
  value: 1440h
' --type=json
Then wait until the certificates are renewed and afterwards remove the property again:
kubectl get certs --no-headers=true | awk '{print $1}' | xargs -n 1 kubectl patch certificate --patch '
- op: remove
  path: /spec/renewBefore
' --type=json