Unable to create an AWS EMR cluster using the Java SDK

I am launching an AWS EMR cluster using the Java SDK (code snippet below), and it works perfectly.

BasicAWSCredentials awsCreds = new BasicAWSCredentials(accessKeyId, secretAccessKeyId);
AmazonElasticMapReduce emrClient = AmazonElasticMapReduceClientBuilder.standard()
                    .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                    .withRegion(region)
                    .build();

// EC2 settings for the cluster nodes; instance types are passed as strings
JobFlowInstancesConfig jobFlowInstanceConfig = new JobFlowInstancesConfig()
        .withEc2SubnetId("subnetId")
        .withEc2KeyName("ec2KeyName")
        .withInstanceCount(3)
        .withKeepJobFlowAliveWhenNoSteps(true)
        .withMasterInstanceType("c5.4xlarge")
        .withSlaveInstanceType("c5.4xlarge");


// create the cluster
RunJobFlowRequest request = new RunJobFlowRequest()
        .withName("clusterName")
        .withReleaseLabel("emr-5.23.0")
        // applications installed on the cluster: Hadoop, Spark, Ganglia, Zeppelin
        .withApplications(
                new Application().withName("Hadoop"),
                new Application().withName("Spark"),
                new Application().withName("Ganglia"),
                new Application().withName("Zeppelin"))
        .withLogUri("s3 path")
        .withServiceRole("EMR_DefaultRole")
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withInstances(jobFlowInstanceConfig);

RunJobFlowResult runJobFlowResult = emrClient.runJobFlow(request); 

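Later, in another AWS environment, our AWS team created a role to be used for creating clusters from a specific EC2 instance, but I am unable to create the cluster there. Below is the code snippet with the additional configuration, followed by the changes I noticed relative to my previous configuration.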

  1. No accessKeyId and secretAccessKeyId
  2. EMR_EC2_DefaultRole replaced with the newly configured role
  3. A security configuration was added

    AmazonElasticMapReduce emrClient = AmazonElasticMapReduceClientBuilder.standard()
                    .withRegion(region)
                    .build();
    
    // EC2 settings for the cluster nodes; instance types are passed as strings
    JobFlowInstancesConfig jobFlowInstanceConfig = new JobFlowInstancesConfig()
            .withEc2SubnetId("subnetId")
            .withEc2KeyName("ec2KeyName")
            .withInstanceCount(3)
            .withKeepJobFlowAliveWhenNoSteps(true)
            .withMasterInstanceType("c5.4xlarge")
            .withSlaveInstanceType("c5.4xlarge");
    
    RunJobFlowRequest request = new RunJobFlowRequest()
            .withName("clusterName")
            .withReleaseLabel("emr-5.23.0")
            // applications installed on the cluster: Hadoop, Spark, Ganglia, Zeppelin
            .withApplications(
                    new Application().withName("Hadoop"),
                    new Application().withName("Spark"),
                    new Application().withName("Ganglia"),
                    new Application().withName("Zeppelin"))
            .withLogUri("s3 path")
            .withServiceRole("EMR_DefaultRole")
            .withJobFlowRole("name-of-role-created")
            .withInstances(jobFlowInstanceConfig)
            .withSecurityConfiguration("Security configuration Name");
    
    RunJobFlowResult runJobFlowResult = emrClient.runJobFlow(request);
    

I get the following error:

com.amazonaws.services.elasticmapreduce.model.AmazonElasticMapReduceException: Role '' is not well-formed. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 0d5ed77e-ed0e-49fd-bd33-f88213ce08c3)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1701)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1356)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1102)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:759)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:733)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access0(AmazonHttpClient.java:675)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:657)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:521)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.doInvoke(AmazonElasticMapReduceClient.java:2043)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.invoke(AmazonElasticMapReduceClient.java:2010)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.invoke(AmazonElasticMapReduceClient.java:1999)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.executeRunJobFlow(AmazonElasticMapReduceClient.java:1770)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.runJobFlow(AmazonElasticMapReduceClient.java:1742)

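Since the error above says the role is not well-formed, I tried different formats but still ran into the same problem. These are the formats I tried in .withJobFlowRole("name-of-role-created"):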
arn:aws:iam::639116131780:role/name-of-role-created
arn:aws:iam::639116131780:instance-profile/name-of-role-created
arn:aws:iam::639116131780:role/name-of-role-created/*
arn:aws:iam::639116131780:instance-profile/name-of-role-created/*
arn:aws:sts::639116131780:assumed-role/name-of-role-created
arn:aws:sts::639116131780:assumed-role/name-of-role-created/*

I get the same error every time.

com.amazonaws.services.elasticmapreduce.model.AmazonElasticMapReduceException: Role 'arn:aws:iam::639116131780:role/name-of-role-created' is not well-formed. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 0d5ed77e-ed0e-49fd-bd33-f88213ce08c3)

According to the docs, the JobFlowRole parameter is not an ARN but a plain string such as EMR_EC2_DefaultRole (the default value). Use a similar format.
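
If the role identifier arrives as an ARN (for example from a configuration file), it can be reduced to the bare name before it is passed to withJobFlowRole. The following is a minimal sketch using only the Java standard library. `JobFlowRoleNames` and `toBareName` are hypothetical names, not part of the AWS SDK, and the sketch assumes the instance profile has the same name as the role.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the AWS SDK): reduces an IAM ARN to its bare name. */
final class JobFlowRoleNames {

    // arn:<partition>:iam::<12-digit account>:(role|instance-profile)/<optional path/><name>
    private static final Pattern IAM_ARN = Pattern.compile(
            "^arn:[^:]+:iam::\\d{12}:(?:role|instance-profile)/(?:.*/)?([\\w+=,.@-]+)$");

    private JobFlowRoleNames() {}

    /**
     * Returns the bare name for an IAM role or instance-profile ARN.
     * Any other non-empty input is returned trimmed and otherwise unchanged.
     */
    static String toBareName(String roleOrArn) {
        if (roleOrArn == null || roleOrArn.trim().isEmpty()) {
            throw new IllegalArgumentException("role name must not be empty");
        }
        String value = roleOrArn.trim();
        Matcher m = IAM_ARN.matcher(value);
        return m.matches() ? m.group(1) : value;
    }

    public static void main(String[] args) {
        System.out.println(toBareName(
                "arn:aws:iam::639116131780:instance-profile/name-of-role-created"));
        System.out.println(toBareName("EMR_EC2_DefaultRole"));
    }
}
```

The result could then be passed as `.withJobFlowRole(JobFlowRoleNames.toBareName(configuredRole))`, where `configuredRole` is likewise a hypothetical variable. Malformed values such as an ARN ending in `/*` are deliberately left untouched so that the service can still reject them.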

JobFlowRole is the role applied to the EMR instances; it is not the role used to create the EMR cluster. I think you misread the option.

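If you want to use a role instead of API keys, you have to look at how your AWS credentials are supplied. For example, with S3 (AWS SDK for Java 2.x):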

S3Client s3 = S3Client.builder()
              .credentialsProvider(InstanceProfileCredentialsProvider.builder().build())
              .build();

where

InstanceProfileCredentialsProvider.builder().build()

uses the instance's role.
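
When no credentials are set on a client builder, as in the second snippet of the question, the SDK falls back to a default provider chain that tries several sources in order and ends with the instance profile. The following is only a toy model of that "first source that answers wins" idea, written with the Java standard library. `CredentialChainSketch` and the source names are hypothetical, and this is not how the SDK is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Toy model of an ordered credential lookup; not the AWS SDK implementation. */
final class CredentialChainSketch {

    // LinkedHashMap keeps insertion order, which is the lookup order.
    private final Map<String, Supplier<Optional<String>>> sources = new LinkedHashMap<>();

    CredentialChainSketch add(String name, Supplier<Optional<String>> source) {
        sources.put(name, source);
        return this;
    }

    /** Returns the name of the first source that yields a value. */
    String resolve() {
        for (Map.Entry<String, Supplier<Optional<String>>> entry : sources.entrySet()) {
            if (entry.getValue().get().isPresent()) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("no credentials found in any source");
    }

    public static void main(String[] args) {
        // On an EC2 instance with no keys configured, only the last source answers.
        CredentialChainSketch chain = new CredentialChainSketch()
                .add("environment", Optional::empty)
                .add("profile-file", Optional::empty)
                .add("instance-profile", () -> Optional.of("temporary-role-credentials"));
        System.out.println(chain.resolve());
    }
}
```

Naming the instance-profile provider explicitly, as in the S3 example above, skips the earlier sources and makes the intent visible in the code.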