Specifying Glue::Crawler with JdbcTargets on CloudFormation

I'm trying to set up AWS Glue to read from an RDS Postgres database using CloudFormation. To do that, I need to create a crawler using the JdbcTargets option. (Or don't I?)

  Records:
    Type: 'AWS::Glue::Crawler'
    Properties:
      DatabaseName: transact
      Targets:
        JdbcTargets:
          - Path: "jdbc:postgresql://host:5432/database"
      Role: !Ref ETLAgent

But creating the stack on CloudFormation fails with:

CREATE_FAILED | AWS::Glue::Crawler | Records | Connection name cannot be equal to null or empty. (Service: AWSGlue; Status Code: 400; Error Code: InvalidInputException;

Even though the docs say:

ConnectionName

The name of the connection to use for the JDBC target.

Required: No

What is the correct AWS Glue setup using CloudFormation that will allow me to read from RDS?

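You are indeed missing the ConnectionName property. It should carry the name of an AWS::Glue::Connection resource, which your template also lacks. The Path property you set is not a JDBC URL: it selects the schemas/tables to crawl (dbname/%/% includes everything). For details, see the CloudFormation docs on Crawler JdbcTarget.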

Your template should look something like this:

  MyDbConnection:
    Type: "AWS::Glue::Connection"
    Properties:
      CatalogId: !Ref 'AWS::AccountId'
      ConnectionInput:
        Description: "JDBC Connection to my RDS DB"
        PhysicalConnectionRequirements:
          AvailabilityZone: "eu-central-1a"
          SecurityGroupIdList:
            - my-sec-group-id
          SubnetId: my-subnet-id
        ConnectionType: "JDBC"
        ConnectionProperties:
          "JDBC_CONNECTION_URL": "jdbc:postgresql://host:5432/database"
          "USERNAME": "my-db-username"
          "PASSWORD": "my-password"
  Records:
    Type: 'AWS::Glue::Crawler'
    Properties:
      DatabaseName: transact
      Targets:
        JdbcTargets:
          - ConnectionName: !Ref MyDbConnection
            Path: "database/%/%"
      Role: !Ref ETLAgent
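
If you don't want to crawl everything, the Path pattern can be narrowed. The fragment below is a minimal sketch of the Targets section only, assuming the tables live in a schema named public and that a table named orders exists (both names are hypothetical, not taken from the question):

```yaml
      Targets:
        JdbcTargets:
          # All tables in one schema ("public" is an assumed schema name)
          - ConnectionName: !Ref MyDbConnection
            Path: "database/public/%"
          # A single table ("orders" is a hypothetical table name)
          - ConnectionName: !Ref MyDbConnection
            Path: "database/public/orders"
```

Each entry reuses the same connection, so only the Path differs between targets.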