使用 java 代码将 spark 作业提交到 AWS EMR 并等待执行并获得最终状态
Submit spark job to AWS EMR with java code and wait for the execution and get final status
我正在尝试通过 AWS EMR SDK API 将 Spark 作业提交到 AWS EMR。
我希望进程提交作业,然后等待作业 complete/fail 并获得相应的状态。
代码:
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonElasticMapReduce emr =
AmazonElasticMapReduceClientBuilder
.standard()
.withCredentials(new AWSStaticCredentialsProvider(credentials))
.build();
HadoopJarStepConfig sparkStepConf =
new HadoopJarStepConfig()
.withJar("command-runner.jar")
.withArgs("spark-submit")
.withArgs("--master", "yarn")
.withArgs(sparkJarPath)
.withArgs(args);
StepConfig sparkStep =
new StepConfig().withName("Spark Step").withActionOnFailure(ActionOnFailure.CONTINUE).withHadoopJarStep(
sparkStepConf);
AddJobFlowStepsRequest req =
new AddJobFlowStepsRequest().withJobFlowId(clusterId).withSteps(Collections.singletonList(sparkStep));
emr.addJobFlowSteps(req);
找不到可获取已提交作业状态的内容
这是一个例子(请检查某些代码区域是否为空):
ListStepsResult stepsResult = emr.listSteps(new ListStepsRequest().withClusterId(clusterId).withStepIds(req.getStepIds()));
List<StepSummary> stepsList = stepsResult.getSteps();
StepSummary stepSummary = stepsList.get(0);
StepStatus stepSummaryStatus = stepSummary.getStatus();
String stepStatus = stepSummaryStatus.getState();
StepExecutionState stepState = StepExecutionState.valueOf(stepStatus);
stepState有你想要的。
我正在尝试通过 AWS EMR SDK API 将 Spark 作业提交到 AWS EMR。 我希望进程提交作业,然后等待作业 complete/fail 并获得相应的状态。
代码:
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonElasticMapReduce emr =
AmazonElasticMapReduceClientBuilder
.standard()
.withCredentials(new AWSStaticCredentialsProvider(credentials))
.build();
HadoopJarStepConfig sparkStepConf =
new HadoopJarStepConfig()
.withJar("command-runner.jar")
.withArgs("spark-submit")
.withArgs("--master", "yarn")
.withArgs(sparkJarPath)
.withArgs(args);
StepConfig sparkStep =
new StepConfig().withName("Spark Step").withActionOnFailure(ActionOnFailure.CONTINUE).withHadoopJarStep(
sparkStepConf);
AddJobFlowStepsRequest req =
new AddJobFlowStepsRequest().withJobFlowId(clusterId).withSteps(Collections.singletonList(sparkStep));
emr.addJobFlowSteps(req);
找不到可获取已提交作业状态的内容
这是一个例子(请检查某些代码区域是否为空):
ListStepsResult stepsResult = emr.listSteps(new ListStepsRequest().withClusterId(clusterId).withStepIds(req.getStepIds()));
List<StepSummary> stepsList = stepsResult.getSteps();
StepSummary stepSummary = stepsList.get(0);
StepStatus stepSummaryStatus = stepSummary.getStatus();
String stepStatus = stepSummaryStatus.getState();
StepExecutionState stepState = StepExecutionState.valueOf(stepStatus);
stepState有你想要的。