Error Connecting to GCS using Private Keys
The scenario: we have Project1, and from it we are trying to access GCS in Project2. We are passing Project2's service-account private key to the SparkSession; the job runs in Project1 but fails with "Invalid PKCS8 data".
Dataproc version - 1.4
// Service-account credentials for the GCS connector (values redacted)
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.private.key.id", "<private-key-id>");
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.private.key", "<private-key>");
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.email", "<client-email>");
Error:
2022-02-17T16:19:09.231359147Z DEFAULT Invalid PKCS8 data.
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.privateKeyFromPkcs8(CredentialFactory.java:346)
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.getCredentialsFromSAParameters(CredentialFactory.java:310)
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.getCredential(CredentialFactory.java:393)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getCredential(GoogleHadoopFileSystemBase.java:1324)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.createGcsFs(GoogleHadoopFileSystemBase.java:1459)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configure(GoogleHadoopFileSystemBase.java:1443)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:467)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3242)
    at org.apache.hadoop.fs.FileSystem.access0(FileSystem.java:121)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3291)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3259)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:470)
    at com.gcp.util.Day2Util.deleteGCSPartFile(Day2Util.java:430)
    at com.gcp.ReadGCSWithSA.main(ReadGCSWithSA.java:42)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
    at org.apache.spark.deploy.SparkSubmit.doRunMain(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon.doSubmit(SparkSubmit.scala:939)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:948)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Please let me know if there is any other way to pass the SA details. Note that we do not have access to pass a service-account credentials file.
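One alternative we have been considering (a sketch, assuming the connector picks up the same properties when set via Spark's spark.hadoop.* prefix, which Spark copies into the Hadoop Configuration):

// Sketch: the spark.hadoop.* prefix is copied into the Hadoop Configuration,
// so the same connector properties can be set while building the session
// (or passed with --properties at job-submit time) instead of afterwards.
SparkSession session = SparkSession.builder()
        .config("spark.hadoop.fs.gs.auth.service.account.private.key.id", "<private-key-id>")
        .config("spark.hadoop.fs.gs.auth.service.account.private.key", "<private-key>")
        .config("spark.hadoop.fs.gs.auth.service.account.email", "<client-email>")
        .getOrCreate();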
It works fine with the above properties. The problem was that I had earlier removed -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY----- from the private_key, which is why it did not work.
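For anyone hitting the same error, a minimal sketch of what the key string needs to look like (key material redacted; the value is taken verbatim from the private_key field of the service account's JSON key, including the escaped newlines):

// Keep the PEM header/footer and the "\n" escapes exactly as they appear in
// the JSON key file; stripping them produces the "Invalid PKCS8 data" error above.
String privateKey = "-----BEGIN PRIVATE KEY-----\nMIIEv...<redacted>...\n-----END PRIVATE KEY-----\n";
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.private.key", privateKey);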