Scala Spark code for creating GCP Publisher throws: java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument

I am trying to publish messages to a topic in GCP Pub/Sub using Spark Scala in IntelliJ. Here is the code (GcpPublish.scala):

import java.io.FileInputStream

import com.google.api.gax.core.FixedCredentialsProvider
import com.google.auth.oauth2.ServiceAccountCredentials
import com.google.cloud.pubsub.v1.Publisher
import com.google.protobuf.ByteString
import com.google.pubsub.v1.PubsubMessage

val publisher = Publisher.newBuilder("projects/projectid/topics/test")
  .setCredentialsProvider(FixedCredentialsProvider.create(
    ServiceAccountCredentials.fromStream(
      new FileInputStream("gs://credsfiles/projectid.json"))))
  .build()

publisher.publish(PubsubMessage.newBuilder
  .setData(ByteString.copyFromUtf8(JSONData.toString()))
  .build())

And here is the build.sbt:

name := "TryingSomething"

version := "1.0"

scalaVersion := "2.11.12"

val sparkVersion = "2.3.2"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "com.google.cloud" % "google-cloud-bigquery" % "1.106.0",
  "org.apache.beam" % "beam-sdks-java-core" % "2.19.0",
  "org.apache.beam" % "beam-runners-google-cloud-dataflow-java" % "2.19.0",
  "com.typesafe.scala-logging" %% "scala-logging" % "3.1.0",
  "org.apache.beam" % "beam-sdks-java-extensions-google-cloud-platform-core" % "2.19.0",
  "org.apache.beam" % "beam-sdks-java-io-google-cloud-platform" % "2.19.0",
  "com.google.apis" % "google-api-services-bigquery" % "v2-rev456-1.25.0",
  "com.google.cloud" % "google-cloud-pubsub" % "1.102.1",
  "com.google.guava" % "guava" % "28.2-jre",
  "org.apache.httpcomponents" % "httpclient" % "4.5.11"
)

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _ => MergeStrategy.first
}

But when I build a fat jar and run it on a Dataproc cluster, I get the following error:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;I)V
    at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider$Builder.setPoolSize(InstantiatingGrpcChannelProvider.java:527)
    at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider$Builder.setChannelsPerCpu(InstantiatingGrpcChannelProvider.java:546)
    at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider$Builder.setChannelsPerCpu(InstantiatingGrpcChannelProvider.java:535)
    at com.google.cloud.pubsub.v1.Publisher$Builder.<init>(Publisher.java:633)
    at com.google.cloud.pubsub.v1.Publisher$Builder.<init>(Publisher.java:588)
    at com.google.cloud.pubsub.v1.Publisher.newBuilder(Publisher.java:584)

I followed the suggested solution and added the guava and httpcomponents dependencies, but I still get the same exception.

I even changed the code to instantiate the Publisher as:

val publisher = Publisher.newBuilder("projects/projectid/topics/test").build()

But that gives the same error as well.

Any suggestions as to what could be causing this error?

The problem is that both Spark and Hadoop inject their own version of Guava, which clashes with the version the Google Pub/Sub client ships with. I solved it by adding shade rules to the build.sbt file:

assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "repackaged.com.google.common.@1").inAll,
  ShadeRule.rename("com.google.protobuf.**" -> "repackaged.com.google.protobuf.@1").inAll,
  ShadeRule.rename("io.netty.**" -> "repackaged.io.netty.@1").inAll
)
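
Note that ShadeRule comes from sbt-assembly, so the plugin has to be enabled in the build; a minimal project/plugins.sbt sketch (the exact version here is an assumption, any release from 0.14.0 onwards supports shading):

// project/plugins.sbt -- provides the assembly task and ShadeRule
// (0.14.10 is an assumption; any 0.14.x+ release supports shade rules)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")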

The shade rules for com.google.common and com.google.protobuf are the ones that resolve the Guava conflict; the others I added for dependency conflicts I ran into further down the road.
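
If you want to confirm the conflict yourself before adding the shade rules, a small diagnostic run on the cluster can print which jar the JVM actually resolved the conflicting class from; this is just a sketch, and the printed path is illustrative:

// Prints the jar that com.google.common.base.Preconditions was loaded from.
// Without shading, on Dataproc this typically points at the cluster's
// Spark/Hadoop Guava rather than the guava 28.2-jre bundled in the fat jar.
val source = Option(classOf[com.google.common.base.Preconditions]
  .getProtectionDomain.getCodeSource)
println(source.map(_.getLocation).getOrElse("bootstrap classpath"))
// e.g. file:/usr/lib/spark/jars/guava-14.0.1.jar (illustrative)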