AWS EKS: I encountered a tolerations error in all ArgoCD pods

I installed Argo CD in an EKS Fargate cluster using the following commands.

$ VERSION=$(curl --silent "https://api.github.com/repos/argoproj/argo-cd/releases/latest" | grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/')
$ sudo curl --silent --location -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/download/$VERSION/argocd-linux-amd64
$ sudo chmod +x /usr/local/bin/argocd
$ kubectl create namespace argocd
$ kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
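The VERSION lookup pipes the GitHub API response through grep and sed, and the sed expression needs the \1 backreference in its replacement to keep the captured tag. A minimal offline sketch of that extraction is shown here; the function name extract_tag and the sample JSON lines are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: reads a GitHub "releases/latest" style JSON document
# on stdin and prints the value of its "tag_name" field.
extract_tag() {
  # Keep only the tag_name line, then replace the whole line with the
  # last double-quoted string on it (the captured group \1).
  grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/'
}

# Made-up sample input, not a real API response.
printf '%s\n' '{' '  "tag_name": "v2.0.0",' '  "name": "v2.0.0"' '}' | extract_tag
# prints v2.0.0
```

Without the \1, sed replaces the matched line with an empty string, so VERSION ends up empty and the download URL is wrong.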

Then I checked the resources in the namespace; every pod is stuck in Pending, as shown below.

NAME                                      READY   STATUS    RESTARTS   AGE
pod/argocd-application-controller-0       0/1     Pending   0          3m47s
pod/argocd-dex-server-65bf5f4fc7-dh4ql    0/1     Pending   0          3m47s
pod/argocd-redis-d486999b7-rn9g8          0/1     Pending   0          3m47s
pod/argocd-repo-server-8465d84869-jwj47   0/1     Pending   0          3m47s
pod/argocd-server-87b47d787-29mcp         0/1     Pending   0          3m47s

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/argocd-dex-server       ClusterIP   172.20.30.228    <none>        5556/TCP,5557/TCP,5558/TCP   3m47s
service/argocd-metrics          ClusterIP   172.20.229.100   <none>        8082/TCP                     3m47s
service/argocd-redis            ClusterIP   172.20.240.101   <none>        6379/TCP                     3m47s
service/argocd-repo-server      ClusterIP   172.20.133.210   <none>        8081/TCP,8084/TCP            3m47s
service/argocd-server           ClusterIP   172.20.39.79     <none>        80/TCP,443/TCP               3m47s
service/argocd-server-metrics   ClusterIP   172.20.32.2      <none>        8083/TCP                     3m47s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/argocd-dex-server    0/1     1            0           3m47s
deployment.apps/argocd-redis         0/1     1            0           3m47s
deployment.apps/argocd-repo-server   0/1     1            0           3m47s
deployment.apps/argocd-server        0/1     1            0           3m47s

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/argocd-dex-server-65bf5f4fc7    1         1         0       3m47s
replicaset.apps/argocd-redis-d486999b7          1         1         0       3m47s
replicaset.apps/argocd-repo-server-8465d84869   1         1         0       3m47s
replicaset.apps/argocd-server-87b47d787         1         1         0       3m47s

NAME                                             READY   AGE
statefulset.apps/argocd-application-controller   0/1     3m47s

Then I described one of the pods and found the following error.

Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  70s (x4 over 4m2s)  default-scheduler  0/5 nodes are available: 5 node(s) had taint {eks.amazonaws.com/compute-type: fargate}, that the pod didn't tolerate.

All of the pods show the same error.

What should I do?

Your Fargate profile is set up to run workloads deployed to the "dev-cluster" namespace:

{
  "fargateProfile": {
    "status": "ACTIVE",
    ...
    "selectors": [{
        "namespace": "dev-cluster"
    }]
  ...

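But you deployed argocd to a different namespace, which the profile's selector does not match, so Fargate never schedules those pods: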

$ kubectl create namespace argocd

$ kubectl apply -n argocd ...

Try deploying argocd to the "dev-cluster" namespace instead.
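To confirm the mismatch before redeploying, you could check whether a profile's selectors mention the namespace at all. The following is a rough sketch only: profile_covers_namespace is a hypothetical helper, it uses a plain grep rather than a real JSON parser, and the cluster and profile names in the comments are placeholders you would need to fill in.

```shell
#!/bin/sh
# Hypothetical helper: reads Fargate profile JSON on stdin and succeeds
# (exit 0) if some selector names the given namespace.
# Assumption: the JSON looks like the "selectors" snippet shown above.
profile_covers_namespace() {
  ns="$1"
  grep -Eq "\"namespace\"[[:space:]]*:[[:space:]]*\"${ns}\""
}

# Sample input modelled on the profile above (not real API output).
sample='{"fargateProfile":{"selectors":[{"namespace": "dev-cluster"}]}}'

for ns in dev-cluster argocd; do
  if printf '%s' "$sample" | profile_covers_namespace "$ns"; then
    echo "$ns: matched by the profile"
  else
    echo "$ns: not matched, pods there stay Pending"
  fi
done

# Against a real cluster, the input would come from the AWS CLI
# (placeholders in angle brackets are assumptions):
#   aws eks describe-fargate-profile --cluster-name <cluster> \
#     --fargate-profile-name <profile> | profile_covers_namespace argocd
```

An alternative to moving argocd is to add a Fargate profile whose selector matches the argocd namespace, for example with eksctl create fargateprofile --cluster <cluster> --name argocd --namespace argocd, where <cluster> is again a placeholder.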