Exception in thread "main" java.nio.file.AccessDeniedException: s3a://......................: innerMkdirs

Tech stack -

Spark - 2.4.7,

Scala - 2.11.8,

Running On AWS EMR

I'm trying to write a Kinesis stream to a specific S3 location, but an S3 access problem prevents it. Another interesting observation: the same code works fine on Databricks.

My code -

transformations.materialization()
      .writeStream
      .trigger(Trigger.ProcessingTime(triggerMatInterval))
      .outputMode(OutputMode.Append())
      .format(DataFormat)
      .option("path", Path)
      .option("checkpointLocation", CheckpointPath) // Error takes place here
      .start()

S3 location format - s3a://abcd/ef/in/gh/checkpoints/ijk

Error -

Exception in thread "main" java.nio.file.AccessDeniedException: s3a://<S3Location>: innerMkdirs on <Same S3 Location>: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 77EA9D8F5F71C60C; S3 Extended Request ID: EH3Su6H/hcQjW/5E6+vEPjWyyy62OKb+CgoU1SCJKzMTt41IfVm8sJLDJTOYo8iIZwrN7GvD4wU=; Proxy: null), S3 Extended Request ID: EH3Su6H/hcQjW/5E6+vEPjWyyy62OKb+CgoU1SCJKzMTt41IfVm8sJLDJTOYo8iIZwrN7GvD4wU=
...
...
...
...
...
...
...
...
...
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 77EA9D8F5F71C60C; S3 Extended Request ID: EH3Su6H/hcQjW/5E6+vEPjWyyy62OKb+CgoU1SCJKzMTt41IfVm8sJLDJTOYo8iIZwrN7GvD4wU=; Proxy: null), S3 Extended Request ID: EH3Su6H/hcQjW/5E6+vEPjWyyy62OKb+CgoU1SCJKzMTt41IfVm8sJLDJTOYo8iIZwrN7GvD4wU=
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1828)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1412)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1374)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access0(AmazonHttpClient.java:704)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5219)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5165)
    at com.amazonaws.services.s3.AmazonS3Client.access0(AmazonS3Client.java:405)
    at com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:6180)
    at com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1824)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1784)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:168)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:148)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:115)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:45)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Command exiting with ret '1'

Thanks in advance.

  1. Try changing s3a://<> to s3://<>
  2. Make sure the bucket is accessible from the same account the EMR cluster runs in
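
For the first suggestion, here is a minimal sketch of swapping the scheme before the paths reach the writer. `S3Scheme.toEmrScheme` is a hypothetical helper, not something Spark or Hadoop provides. It assumes that on EMR the `s3://` scheme is handled by EMRFS rather than by the S3A connector, which is why the two schemes may behave differently for the same bucket.

```scala
object S3Scheme {
  // Schemes that this hypothetical helper rewrites to plain s3://
  private val Prefixes = Seq("s3a://", "s3n://")

  // Returns the location with an s3a:// or s3n:// prefix replaced by s3://.
  // Any other location (s3://, hdfs://, local paths) is returned unchanged.
  def toEmrScheme(location: String): String =
    Prefixes.find(p => location.startsWith(p)) match {
      case Some(p) => "s3://" + location.drop(p.length)
      case None    => location
    }
}
```

In the question's code this would be used as `.option("path", S3Scheme.toEmrScheme(Path))` and `.option("checkpointLocation", S3Scheme.toEmrScheme(CheckpointPath))`. With the example location from the question, `s3a://abcd/ef/in/gh/checkpoints/ijk` becomes `s3://abcd/ef/in/gh/checkpoints/ijk`.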

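Whatever identity the job is running as does not have write permission on the bucket, or at least not on the path it is trying to write to.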
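
One way to confirm this outside the streaming job is to attempt the same kind of write directly through the Hadoop FileSystem API. The sketch below is only an illustration: `WriteProbe` and the `_write_probe_` directory name are made up for this example, and it assumes the Hadoop client classes are on the classpath (as they are in `spark-shell` on EMR). In the trace from the question the 403 comes from a `putObject` call made during `innerMkdirs`, so a failure here would point at missing `s3:PutObject` permission on that prefix rather than at the streaming code.

```scala
import java.io.IOException
import java.net.URI
import scala.util.Try
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object WriteProbe {
  // Hypothetical helper: creates a scratch directory under `location`,
  // writes a tiny file into it, then deletes the directory again.
  // Returns Failure with the underlying exception if any step is refused.
  def probe(location: String, conf: Configuration = new Configuration()): Try[Unit] = Try {
    val fs  = FileSystem.get(new URI(location), conf)
    val dir = new Path(location, s"_write_probe_${System.currentTimeMillis()}")

    // Same operation that fails in the question (mkdirs on the target path)
    if (!fs.mkdirs(dir)) throw new IOException(s"mkdirs returned false for $dir")

    // Small object write, overwriting if the file somehow exists
    val out = fs.create(new Path(dir, "probe.txt"), true)
    try out.write("ok".getBytes("UTF-8")) finally out.close()

    // Clean up the scratch directory recursively
    fs.delete(dir, true)
    ()
  }
}
```

From `spark-shell` on the cluster, something like `WriteProbe.probe("s3a://abcd/ef/in/gh/checkpoints/ijk", spark.sparkContext.hadoopConfiguration)` would run the check with the same configuration the job uses. Note that if a later step fails after `mkdirs` succeeded, the scratch directory may be left behind and need manual removal.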