Unable to reach AWS Glue to get connection in DataBrew

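I am trying to get started with AWS Glue DataBrew using a connection to Redshift. I added the connection in AWS Glue, and it works when I test it there. When DataBrew tries to use this connection, it fails with the error below. DataBrew and Glue are both in the same region.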

{"error":"Failure reading from input connection AwsGlueDataBrew-databrew-to-redshift with \"public.table\": Unable to reach AWS Glue to get connection AwsGlueDataBrew-databrew-to-redshift. Exception: Connect timeout on endpoint URL: \"https://glue.us-west-2.amazonaws.com/\""}

The policy attached to the project looks like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabases",
                "glue:GetPartitions",
                "glue:GetTable",
                "glue:GetTables",
                "glue:GetConnection"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::databrew-public-datasets-*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeRouteTables",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcAttribute",
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "ec2:DeleteNetworkInterface",
            "Condition": {
                "StringLike": {
                    "aws:ResourceTag/aws-glue-service-resource": "*"
                }
            },
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags",
                "ec2:DeleteTags"
            ],
            "Condition": {
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "aws-glue-service-resource"
                    ]
                }
            },
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:security-group/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:*:*:log-group:/aws-glue-databrew/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "lakeformation:GetDataAccess"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": "arn:aws:secretsmanager:*:*:secret:databrew!default-*"
        }
    ]
}

Can someone help me resolve this?

Thanks.

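This happens because, when you use a project or job, the DataBrew service tries to reach the AWS Glue service endpoint, and the connect timeout in the error shows that this call never gets through. (The AWS Glue test-connection feature works differently, which is why the test succeeds.)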
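
To confirm that the timeout is a network-path problem rather than a permissions problem, a quick TCP check can be run from a host in the same subnet and security group as the connection. The following is only a minimal sketch: the helper names `glue_endpoint_host` and `can_reach` are made up for illustration, and it assumes the standard regional hostname pattern seen in the error message.

```python
import socket


def glue_endpoint_host(region: str) -> str:
    """Return the regional AWS Glue API hostname, e.g. glue.us-west-2.amazonaws.com."""
    return f"glue.{region}.amazonaws.com"


def can_reach(host: str, port: int = 443, timeout: float = 5.0,
              connect=socket.create_connection) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    `connect` is injectable (hypothetical convenience) so the check can be
    exercised without touching the network.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return False
    conn.close()
    return True
```

Calling `can_reach(glue_endpoint_host("us-west-2"))` from inside the VPC should return `False` while the problem persists, and `True` once the subnet has a path to the Glue endpoint.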

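You have two ways to fix this:

  1. Attach a VPC endpoint for the AWS Glue service to your VPC. This ensures the Glue service can be reached securely, without leaving the AWS network.
  2. Open your VPC to the public internet, so that traffic from your VPC can go out over the internet and the API calls to the Glue service succeed.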

I recommend option #1 because it is more secure (and also simpler), but it does come with some overhead.
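
Option #1 can also be scripted instead of done in the console. The sketch below uses boto3's EC2 `create_vpc_endpoint` call to request an interface endpoint for Glue. It is an illustration under assumptions, not a definitive setup: the function names and every ID passed in are placeholders, and it assumes boto3 is installed with credentials allowed to create VPC endpoints.

```python
def glue_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for EC2 CreateVpcEndpoint targeting AWS Glue."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.glue",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS makes glue.<region>.amazonaws.com resolve to the endpoint;
        # it requires DNS support and DNS hostnames to be enabled on the VPC.
        "PrivateDnsEnabled": True,
    }


def create_glue_endpoint(region, vpc_id, subnet_ids, security_group_ids):
    """Create the endpoint and return its ID. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(
        **glue_endpoint_request(region, vpc_id, subnet_ids, security_group_ids)
    )
    return response["VpcEndpoint"]["VpcEndpointId"]
```

A call would look like `create_glue_endpoint("us-west-2", "vpc-0123", ["subnet-0a"], ["sg-0b"])`, with the placeholder IDs replaced by the VPC, subnet, and security group used by the Glue connection. The security group on the endpoint has to allow inbound HTTPS (port 443) from the network interfaces DataBrew creates, otherwise the same timeout will persist.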