ECS task unable to pull secrets or registry auth

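I have a CDK project that creates a CodePipeline which deploys an application on ECS. I had it all working previously, but the VPC used a NAT gateway, which ended up being too expensive. So now I am trying to recreate the project without requiring a NAT gateway. I am almost there, but I have run into a problem when the ECS service tries to start tasks. All tasks fail to start with the following error: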

ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve secret from asm: service call has been retried 5 time(s): failed to fetch secret

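At this point I have somewhat lost track of the different things I have tried, but I will post the relevant parts here along with some of what I attempted.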

const repository = ECR.Repository.fromRepositoryAttributes(
  this,
  "ecr-repository",
  {
    repositoryArn: props.repository.arn,
    repositoryName: props.repository.name,
  }
);

// vpc
const vpc = new EC2.Vpc(this, this.resourceName(props, "vpc"), {
  maxAzs: 2,
  natGateways: 0,
  enableDnsSupport: true,
});
const vpcSecurityGroup = new SecurityGroup(this, "vpc-security-group", {
  vpc: vpc,
  allowAllOutbound: true,
});
// tried this to allow the task to access secrets manager
const vpcEndpoint = new EC2.InterfaceVpcEndpoint(this, "secrets-manager-task-vpc-endpoint", {
  vpc: vpc,
  service: EC2.InterfaceVpcEndpointAwsService.SSM,
});

const secrets = SecretsManager.Secret.fromSecretCompleteArn(
  this,
  "secrets",
  props.secrets.arn
);

const cluster = new ECS.Cluster(this, this.resourceName(props, "cluster"), {
  vpc: vpc,
  clusterName: `api-cluster`,
});

const ecsService = new EcsPatterns.ApplicationLoadBalancedFargateService(
  this,
  "ecs-service",
  {
    taskSubnets: {
      subnetType: SubnetType.PUBLIC,
    },
    securityGroups: [vpcSecurityGroup],
    serviceName: "api-service",
    cluster: cluster,
    cpu: 256,
    desiredCount: props.scaling.desiredCount,
    taskImageOptions: {
      image: ECS.ContainerImage.fromEcrRepository(
        repository,
        this.ecrTagNameParameter.stringValue
      ),
      secrets: getApplicationSecrets(secrets), // returns 
      logDriver: LogDriver.awsLogs({
        streamPrefix: "api",
        logGroup: new LogGroup(this, "ecs-task-log-group", {
          logGroupName: `${props.environment}-api`,
        }),
        logRetention: RetentionDays.TWO_MONTHS,
      }),
    },
    memoryLimitMiB: 512,
    publicLoadBalancer: true,
    domainZone: this.hostedZone,
    certificate: this.certificate,
    redirectHTTP: true,
  }
);

const scalableTarget = ecsService.service.autoScaleTaskCount({
  minCapacity: props.scaling.desiredCount,
  maxCapacity: props.scaling.maxCount,
});

scalableTarget.scaleOnCpuUtilization("cpu-scaling", {
  targetUtilizationPercent: props.scaling.cpuPercentage,
});
scalableTarget.scaleOnMemoryUtilization("memory-scaling", {
  targetUtilizationPercent: props.scaling.memoryPercentage,
});

secrets.grantRead(ecsService.taskDefinition.taskRole);
repository.grantPull(ecsService.taskDefinition.taskRole);

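I read somewhere that it might have something to do with Fargate platform version 1.4.0 versus 1.3.0, but I am not sure what I need to change to give the tasks access to what they need to run.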

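You need to create interface endpoints for Secrets Manager, ECR (both endpoint types: the API endpoint and the Docker registry endpoint), and CloudWatch Logs, plus a gateway endpoint for S3.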

See the documentation on the topic.

Here is an example in Python; it works the same way in TypeScript:

vpc.add_interface_endpoint(
    "secretsmanager_endpoint",
    service=ec2.InterfaceVpcEndpointAwsService.SECRETS_MANAGER,
)
vpc.add_interface_endpoint(
    "ecr_docker_endpoint",
    service=ec2.InterfaceVpcEndpointAwsService.ECR_DOCKER,
)
vpc.add_interface_endpoint(
    "ecr_endpoint",
    service=ec2.InterfaceVpcEndpointAwsService.ECR,
)
vpc.add_interface_endpoint(
    "cloudwatch_logs_endpoint",
    service=ec2.InterfaceVpcEndpointAwsService.CLOUDWATCH_LOGS,
)
vpc.add_gateway_endpoint(
    "s3_endpoint",
    service=ec2.GatewayVpcEndpointAwsService.S3
)
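
To make it easier to check what was actually created in the VPC console, here is a small sketch that lists the endpoint service names these CDK constants should correspond to. The helper `ecsEndpointServiceNames` is hypothetical (it is not part of the CDK), and it assumes the standard `com.amazonaws.<region>.<service>` naming used in the commercial AWS regions. Note that the endpoint in the question uses `InterfaceVpcEndpointAwsService.SSM`, which is Systems Manager rather than Secrets Manager, so it would not show up as `secretsmanager` in this list.

```typescript
// Service-name suffixes for a Fargate task that pulls its image from ECR,
// reads secrets from Secrets Manager, and logs through the awslogs driver.
const INTERFACE_SUFFIXES: string[] = [
  "secretsmanager", // SECRETS_MANAGER
  "ecr.dkr",        // ECR_DOCKER
  "ecr.api",        // ECR
  "logs",           // CLOUDWATCH_LOGS
];
// ECR stores image layers in S3, hence the S3 gateway endpoint.
const GATEWAY_SUFFIXES: string[] = ["s3"];

interface EndpointServiceNames {
  interfaceEndpoints: string[];
  gatewayEndpoints: string[];
}

// Hypothetical helper: builds the full service names for one region,
// assuming the com.amazonaws.<region>.<service> naming convention.
function ecsEndpointServiceNames(region: string): EndpointServiceNames {
  const prefix = `com.amazonaws.${region}.`;
  return {
    interfaceEndpoints: INTERFACE_SUFFIXES.map((suffix) => prefix + suffix),
    gatewayEndpoints: GATEWAY_SUFFIXES.map((suffix) => prefix + suffix),
  };
}

console.log(ecsEndpointServiceNames("us-east-1"));
```

If one of these names is missing from the VPC's endpoint list, the task will likely fail at the corresponding step (image pull, secret retrieval, or log setup).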

Keep in mind that interface endpoints also cost money, and they may not end up cheaper than a NAT gateway.
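
A rough way to compare the two options is to look at the fixed hourly charges only. The sketch below is a back-of-the-envelope estimate, not a pricing tool: the function `hourlyFixedCosts` is hypothetical, the rates in `EXAMPLE_RATES` are placeholder assumptions that should be replaced with the current prices for your region, and per-GB data processing charges are ignored. It relies on two facts: interface endpoints are billed per endpoint per Availability Zone, and the S3 gateway endpoint has no hourly charge.

```typescript
interface HourlyRates {
  natGatewayPerHour: number;          // USD per NAT gateway per hour (assumed value)
  interfaceEndpointPerAzHour: number; // USD per interface endpoint per AZ per hour (assumed value)
}

// Placeholder rates for illustration only; check the AWS pricing pages.
const EXAMPLE_RATES: HourlyRates = {
  natGatewayPerHour: 0.045,
  interfaceEndpointPerAzHour: 0.01,
};

// Fixed hourly cost of each option, excluding data processing charges.
function hourlyFixedCosts(
  natGateways: number,
  interfaceEndpoints: number,
  availabilityZones: number,
  rates: HourlyRates
): { nat: number; endpoints: number } {
  return {
    nat: natGateways * rates.natGatewayPerHour,
    // Each interface endpoint is billed once per AZ it is deployed in.
    endpoints: interfaceEndpoints * availabilityZones * rates.interfaceEndpointPerAzHour,
  };
}

// Setup from the question: maxAzs is 2, and the answer adds 4 interface endpoints.
// Compared against one NAT gateway per AZ, which is the CDK Vpc default.
const costs = hourlyFixedCosts(2, 4, 2, EXAMPLE_RATES);
console.log(costs);
```

With these placeholder rates the two options come out close to each other, and a single shared NAT gateway would be cheaper than four endpoints across two AZs, so it is worth running the numbers for your own setup.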