Access cross-region S3 endpoint through private subnet

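I have an EMR cluster spun up in a private subnet in eu-west-1. I have defined a gateway endpoint for S3 in the route table. I need to access a public bucket/location exposed by AWS, s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar, and the request fails with the error below. I believe this is because cross-region access is not allowed through the gateway endpoint. I can access other buckets in the same region. Is there a workaround to access it, perhaps through NAT? The route table already has a NAT, but the requests are not getting through it.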

2019-04-10T05:17:06.849Z INFO Ensure step 1 jar file s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
INFO Failed to download: s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
java.lang.RuntimeException: Error whilst fetching 's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar'
    at aws157.instancecontroller.util.S3Wrapper.fetchS3HadoopFileToLocal(S3Wrapper.java:412)
    at aws157.instancecontroller.util.S3Wrapper.fetchHadoopFileToLocal(S3Wrapper.java:351)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner$Runner.<init>(HadoopJarStepRunner.java:243)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:152)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:146)
    at aws157.instancecontroller.master.steprunner.StepExecutor.runStep(StepExecutor.java:136)
    at aws157.instancecontroller.master.steprunner.StepExecutor.run(StepExecutor.java:70)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager.enqueueStep(StepExecutionManager.java:248)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager.doRun(StepExecutionManager.java:195)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager.access$000(StepExecutionManager.java:33)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager.run(StepExecutionManager.java:94)
Caused by: com.amazonaws.AmazonClientException: Unable to execute HTTP request: connect timed out
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:618)
    at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:376)
    at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:338)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:287)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3826)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1143)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1021)
    at aws157.instancecontroller.util.S3Wrapper.copyS3ObjectToFile(S3Wrapper.java:303)
    at aws157.instancecontroller.util.S3Wrapper.getFile(S3Wrapper.java:287)
    at aws157.instancecontroller.util.S3Wrapper.fetchS3HadoopFileToLocal(S3Wrapper.java:399)
    ... 10 more

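An S3 gateway endpoint will never attempt to route cross-region traffic, but a NAT gateway should handle this traffic automatically. Given the assertion that a NAT gateway is in place, "Unable to execute HTTP request: connect timed out" means the NAT gateway (or a setting associated with it) is misconfigured.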
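To illustrate why cross-region requests should fall through to the NAT gateway, here is a minimal sketch of most-specific-route selection in a route table. The CIDR blocks and target names are placeholders invented for the example (documentation address ranges, not real S3 ranges); in a real VPC the gateway endpoint route points at the S3 prefix list for the VPC's own region.

```python
import ipaddress


def pick_route(routes, destination_ip):
    """Return the target of the most specific route matching destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The longest prefix (most specific route) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical route table for the private subnet.
ROUTES = [
    ("10.0.0.0/16", "local"),
    # Placeholder standing in for the same-region S3 prefix list.
    ("198.51.100.0/24", "s3-gateway-endpoint"),
    ("0.0.0.0/0", "nat-gateway"),
]

# Placeholder address inside the "same-region S3" range.
same_region = pick_route(ROUTES, "198.51.100.25")
# Placeholder address for a bucket hosted in another region.
cross_region = pick_route(ROUTES, "203.0.113.10")

print(same_region)
print(cross_region)
```

The second lookup matches only the default route, so a bucket in another region is reachable only if the NAT path actually works.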

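As noted in the comments, the specific problem here was that the NAT gateway was placed on the same subnet it was intended to serve. That is not a valid configuration, because the NAT gateway then tries to reach the internet... through itself... since it takes its default route from the subnet in which it is deployed.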

To create a NAT gateway, you must specify the public subnet in which the NAT gateway should reside.

...

After you've created a NAT gateway, you must update the route table associated with one or more of your private subnets to point Internet-bound traffic to the NAT gateway. This enables instances in your private subnets to communicate with the internet. (emphasis added)

https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html#nat-gateway-basics
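The placement rule described above can be checked mechanically. The following is a rough sketch, not an official tool: it works on plain dictionaries whose field names (Routes, Associations, DestinationCidrBlock, GatewayId, NatGatewayId, SubnetId, Main) mirror the EC2 DescribeRouteTables and DescribeNatGateways responses, while every ID and both sample setups are hypothetical.

```python
def default_route_target(route_table):
    """Return the target ID of the 0.0.0.0/0 route, or None if there is none."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") == "0.0.0.0/0":
            return route.get("NatGatewayId") or route.get("GatewayId")
    return None


def route_table_for_subnet(route_tables, subnet_id):
    """Find the table explicitly associated with a subnet, else the main table."""
    main_table = None
    for table in route_tables:
        for association in table.get("Associations", []):
            if association.get("SubnetId") == subnet_id:
                return table
            if association.get("Main"):
                main_table = table
    return main_table


def nat_is_misplaced(route_tables, nat_gateway):
    """True if the NAT gateway's own subnet has no default route to an internet gateway."""
    table = route_table_for_subnet(route_tables, nat_gateway["SubnetId"])
    if table is None:
        return True
    target = default_route_target(table)
    return target is None or not target.startswith("igw-")


# Hypothetical broken setup: the NAT gateway lives in the private subnet it serves,
# so its own default route points back at itself.
BROKEN_NAT = {"NatGatewayId": "nat-0abc", "SubnetId": "subnet-private"}
BROKEN_TABLES = [
    {
        "Associations": [{"SubnetId": "subnet-private"}],
        "Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0abc"}],
    }
]

# Hypothetical fixed setup: the NAT gateway lives in a public subnet whose
# default route points at an internet gateway.
FIXED_NAT = {"NatGatewayId": "nat-0abc", "SubnetId": "subnet-public"}
FIXED_TABLES = BROKEN_TABLES + [
    {
        "Associations": [{"SubnetId": "subnet-public"}],
        "Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc"}],
    }
]

print(nat_is_misplaced(BROKEN_TABLES, BROKEN_NAT))
print(nat_is_misplaced(FIXED_TABLES, FIXED_NAT))
```

In practice the same dictionaries could be filled from the corresponding EC2 API responses; the check only looks at the default route and ignores network ACLs and security groups, which can also cause a connect timeout.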