Cannot sts:AssumeRole with a service account for CDK-generated EKS cluster

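I deployed an EKS 1.21 cluster with CDK and then, following https://cert-manager.io/docs/installation/ as a guide, tried to install cert-manager. The end goal is to use Let's Encrypt certificates for TLS-enabled services.

I create the IAM policies in my Stack code: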

...
        var externalDnsPolicy = new PolicyDocument(
            new PolicyDocumentProps
            {
                Statements = new[]
                {
                    new PolicyStatement(
                        new PolicyStatementProps
                        {
                            Actions = new[] { "route53:ChangeResourceRecordSets", },
                            Resources = new[] { "arn:aws:route53:::hostedzone/*", },
                            Effect = Effect.ALLOW,
                        }
                    ),
                    new PolicyStatement(
                        new PolicyStatementProps
                        {
                            Actions = new[]
                            {
                                "route53:ListHostedZones",
                                "route53:ListResourceRecordSets",
                            },
                            Resources = new[] { "*", },
                            Effect = Effect.ALLOW,
                        }
                    ),
                }
            }
        );
        var AllowExternalDNSUpdatesRole = new Role(
            this,
            "AllowExternalDNSUpdatesRole",
            new RoleProps
            {
                Description = "Route53 External DNS Role",
                InlinePolicies = new Dictionary<string, PolicyDocument>
                {
                    ["AllowExternalDNSUpdates"] = externalDnsPolicy
                },
                RoleName = "AllowExternalDNSUpdatesRole",
                AssumedBy = new ServicePrincipal("eks.amazonaws.com"),
            }
        );

        var certManagerPolicy = new PolicyDocument(new PolicyDocumentProps {
          Statements = new []
          {
            new PolicyStatement(new PolicyStatementProps 
            {
              Effect = Effect.ALLOW,
              Actions = new []
              {
                "route53:GetChange",
              },
              Resources = new []
              {
                "arn:aws:route53:::change/*",
              }
            }),
            new PolicyStatement(new PolicyStatementProps
            {
              Effect = Effect.ALLOW,
              Actions = new []
              {
                "route53:ChangeResourceRecordSets",
                "route53:ListResourceRecordSets"
              },
              Resources = new []
              {
                "arn:aws:route53:::hostedzone/*",
              },
            }),
          },
        });
        var AllowCertManagerRole = new Role(
            this,
            "AllowCertManagerRole",
            new RoleProps
            {
                Description = "Route53 Cert Manager Role",
                InlinePolicies = new Dictionary<string, PolicyDocument>
                {
                    ["AllowCertManager"] = certManagerPolicy
                },
                RoleName = "AllowCertManagerRole",
                AssumedBy = new ServicePrincipal("eks.amazonaws.com"),
            }
        );
...

My cert issuer manifest:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cert-issuer
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::XREMOVEDX:role/AllowCertManagerRole
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cert-issuer-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cert-issuer
subjects:
- kind: ServiceAccount
  name: cert-issuer
  namespace: default
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: sometls-net-letsencrypt
spec:
  acme:
    email: domain@sometls.net
    preferredChain: ""
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: sometls-net-letsencrypt-account-key
    solvers:
    - dns01:
        route53:
          hostedZoneID: Z999999999999
          region: us-east-2
          role: arn:aws:iam::XREMOVEDX:role/AllowExternalDNSUpdatesRole
      selector:
        dnsZones:
        - sometls.net
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: sometls-cluster-lets-encrypt
spec:
  secretName: somtls-cluster-lets-encrypt
  issuerRef:
    name: sometls-net-letsencrypt
    kind: ClusterIssuer
    group: cert-manager.io
  subject:
    organizations:
      - sometls
  dnsNames:
    - "*.sometls.net"

But I get spammed with errors like this one, and cert-manager does not work:

(combined from similar events): Error presenting challenge: error instantiating route53 challenge solver: unable to assume role: AccessDenied: User: arn:aws:sts::XREMOVEDX:assumed-role/EksStackEast-EksClusterNodegroupDefaultC-U7IJ1PNZ2123/i-007c425b7a5e39123 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::XREMOVEDX:role/AllowCertManagerRole status code: 403, request id: 2bd885a2-97a0-4a21-b017-40e099cb4123

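I am very suspicious about how the IAM roles allow Kubernetes ServiceAccounts to assume them. I must be missing some connecting piece that makes the magic of EKS IAM Roles for Service Accounts (IRSA) happen.

Please help!

Update: Using CfnJson I was able to create the role, and it looks like this: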

{
    "Role": {
        "Path": "/",
        "RoleName": "AllowCertManagerRole",
        "RoleId": "REDACTED",
        "Arn": "arn:aws:iam::REDACTED:role/AllowCertManagerRole",
        "CreateDate": "2022-03-24T21:42:32+00:00",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": "arn:aws:iam::REDACTED:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/REDACTED"
                    },
                    "Action": "sts:AssumeRoleWithWebIdentity",
                    "Condition": {
                        "StringLike": {
                            "oidc.eks.us-east-2.amazonaws.com/id/REDACTED:sub": "system:serviceaccount:*:cert-issuer"
                        }
                    }
                }
            ]
        },
        "Description": "Route53 Cert Manager Role",
        "MaxSessionDuration": 3600,
        "Tags": [
            {
                "Key": "dynasty",
                "Value": "sometls-1.0"
            }
        ],
        "RoleLastUsed": {}
    }
}

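I am still getting the same error. The condition in the new role uses the "StringLike" operator. I am not sure that is correct, and I am not sure how to avoid using a non-derived lvalue when setting up the IDictionary for the condition. Also, the error message is unchanged: it still expects to be able to call sts:AssumeRole rather than sts:AssumeRoleWithWebIdentity. I tried changing the action in the role to sts:AssumeRole, with the same result.

Update #2:

The actual problem with cert-manager was that I had missed the modification to the installation manifest that AWS IRSA needs in order to work: https://cert-manager.io/docs/configuration/acme/dns01/route53/#service-annotation. It turns out that this is really important.

For anyone who wants to know how to add an OIDC provider as the AssumedBy principal with a condition in C#, see the snippet below. I expected AWS CDK to have a convenience method that handles these machinations automatically. I did not find one...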

...
        var Cluster = new Cluster(this,"EksCluster", new ClusterProps
        { ... });
...
        var CertIssuerCondition = new CfnJson(this, "CertIssuerCondition", new CfnJsonProps 
        {
            Value = new Dictionary<string, object>
            {
                {$"{Cluster.ClusterOpenIdConnectIssuer}:sub", "system:serviceaccount:*:cert-manager"},
            }
        });

        var certManagerPolicy = new PolicyDocument(new PolicyDocumentProps {
          Statements = new []
          {
            new PolicyStatement(new PolicyStatementProps 
            {
              Effect = Effect.ALLOW,
              Actions = new []
              {
                "route53:GetChange",
              },
              Resources = new []
              {
                "arn:aws:route53:::change/*",
              }
            }),
            new PolicyStatement(new PolicyStatementProps
            {
              Effect = Effect.ALLOW,
              Actions = new []
              {
                "route53:ChangeResourceRecordSets",
                "route53:ListResourceRecordSets"
              },
              Resources = new []
              {
                "arn:aws:route53:::hostedzone/*",
              },
            }),
            new PolicyStatement(new PolicyStatementProps
            {
                Effect = Effect.ALLOW,
                Actions = new[]
                {
                    "route53:ListHostedZonesByName",
                },
                Resources = new[]
                {
                    "*",
                }
            }),
          },
        });
        var AllowCertManagerRole = new Role(
            this,
            "AllowCertManagerRole",
            new RoleProps
            {
                Description = "Route53 Cert Manager Role",
                InlinePolicies = new Dictionary<string, PolicyDocument>
                {
                    ["AllowCertManager"] = certManagerPolicy
                },
                RoleName = "AllowCertManagerRole",
                AssumedBy = new FederatedPrincipal(Cluster.OpenIdConnectProvider.OpenIdConnectProviderArn, new Dictionary<string, object>
                { 
                    ["StringLike"] = CertIssuerCondition,
                },"sts:AssumeRoleWithWebIdentity")
            }
        );

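The trust relationship of your IAM role looks wrong to me.

You need to use a federated principal that points to the OIDC provider of your EKS cluster, ideally with a condition that correctly reflects your service account and namespace names.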
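
To make the shape of that trust relationship concrete, here is a minimal sketch in plain TypeScript, with no CDK dependency, that builds the trust policy document such a role needs. The helper `buildIrsaTrustPolicy`, the `IrsaTrustInput` type, and the account ID and issuer values are hypothetical placeholders for illustration; the service account and namespace names assume a default cert-manager install.

```typescript
// Hypothetical helper, not part of CDK or the AWS SDK: it only assembles
// the JSON shape of an IRSA trust policy so the pieces are easy to see.
interface IrsaTrustInput {
  accountId: string;      // AWS account that owns the OIDC provider
  oidcIssuer: string;     // cluster OIDC issuer WITHOUT the https:// prefix
  namespace: string;      // Kubernetes namespace of the service account
  serviceAccount: string; // Kubernetes service account name
}

function buildIrsaTrustPolicy(input: IrsaTrustInput) {
  const providerArn = `arn:aws:iam::${input.accountId}:oidc-provider/${input.oidcIssuer}`;
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        // Federated principal: the cluster's OIDC provider, not a service principal.
        Principal: { Federated: providerArn },
        // Pods exchange their projected token through this action.
        Action: "sts:AssumeRoleWithWebIdentity",
        Condition: {
          StringEquals: {
            [`${input.oidcIssuer}:sub`]:
              `system:serviceaccount:${input.namespace}:${input.serviceAccount}`,
          },
        },
      },
    ],
  };
}

// Placeholder values; "cert-manager"/"cert-manager" is an assumption about
// the namespace and service account that the cert-manager pod runs under.
const policy = buildIrsaTrustPolicy({
  accountId: "111122223333",
  oidcIssuer: "oidc.eks.us-east-2.amazonaws.com/id/EXAMPLE",
  namespace: "cert-manager",
  serviceAccount: "cert-manager",
});
console.log(JSON.stringify(policy, null, 2));
```

The two things to compare against the roles in the question are the `Federated` principal (instead of the `eks.amazonaws.com` service principal) and the `sts:AssumeRoleWithWebIdentity` action.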

The principal must look something like this:

// Assuming CDK v2 import paths; in CDK v1 these come from '@aws-cdk/core' and '@aws-cdk/aws-iam'.
import { CfnJson } from 'aws-cdk-lib';
import { FederatedPrincipal } from 'aws-cdk-lib/aws-iam';

const namespaceName = 'cert-manager';
const serviceAccountName = 'cert-issuer';

// OIDC issuer of the cluster, without the https:// prefix.
// If you're deploying EKS with CloudFormation/CDK you could, for example, export it and get it with Fn.importValue(...) in your stack.
const oidcProviderUrl = 'oidc.eks.YOUR-REGION.amazonaws.com/id/REDACTED';

// For a less restrictive condition, use wildcards for the namespace name and/or service account name and switch the operator below to StringLike.
const conditionValue = `system:serviceaccount:${namespaceName}:${serviceAccountName}`;
// CfnJson lets the condition key contain a deploy-time token such as an imported issuer.
const roleCondition = new CfnJson(this.stack, 'CertIssuerRoleCondition', {
    value: { [`${oidcProviderUrl}:sub`]: conditionValue },
});

// The OIDC provider ARN can be exported and imported in the same way as the issuer above.
const oidcProviderArn = 'arn:aws:iam::REDACTED:oidc-provider/oidc.eks.YOUR-REGION.amazonaws.com/id/REDACTED';

// The conditions argument is keyed by a condition operator.
const principal = new FederatedPrincipal(oidcProviderArn, { StringEquals: roleCondition }, 'sts:AssumeRoleWithWebIdentity');

// Now use that principal for your IAM role.
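
On the `StringLike` question from the update: `StringEquals` compares the `sub` claim literally, while `StringLike` also treats `*` and `?` as wildcards, so a wildcard namespace only works with `StringLike`. The sketch below imitates that matching in plain TypeScript. `stringEquals` and `stringLike` are hypothetical helpers written for illustration, not IAM or CDK APIs, and the `sub` value assumes cert-manager runs under a service account named `cert-manager` in the `cert-manager` namespace.

```typescript
// Hypothetical stand-in for the IAM StringEquals operator: exact, case-sensitive match.
function stringEquals(pattern: string, value: string): boolean {
  return pattern === value;
}

// Hypothetical stand-in for the IAM StringLike operator:
// "*" matches any run of characters, "?" matches exactly one character.
function stringLike(pattern: string, value: string): boolean {
  // Escape regex metacharacters first, leaving "*" and "?" untouched.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  // Then translate the two IAM wildcards into their regex equivalents.
  const source = "^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$";
  return new RegExp(source).test(value);
}

// Assumed subject claim of the token mounted into the cert-manager pod.
const sub = "system:serviceaccount:cert-manager:cert-manager";

// Wildcard namespace with StringLike.
console.log(stringLike("system:serviceaccount:*:cert-manager", sub));
// The same pattern under StringEquals is compared literally.
console.log(stringEquals("system:serviceaccount:*:cert-manager", sub));
// A pattern written for a different service account name.
console.log(stringLike("system:serviceaccount:*:cert-issuer", sub));
```

Under that assumption, a condition written for `cert-issuer` would not match the token that cert-manager's own pod presents, whichever operator is used.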