Using commons-configuration2 and commons-beanutils 1.9 with Apache Spark 1.6

My application uses commons-configuration2 and commons-beanutils 1.9, but when I run the application jar as a Spark Streaming job, it throws the following exception:

    java.lang.NoSuchMethodError: org.apache.commons.beanutils.PropertyUtilsBean.addBeanIntrospector(Lorg/apache/commons/beanutils/BeanIntrospector;)V
        at org.apache.commons.configuration2.beanutils.BeanHelper.initBeanUtilsBean(BeanHelper.java:631)
        at org.apache.commons.configuration2.beanutils.BeanHelper.<clinit>(BeanHelper.java:89)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at com.sun.proxy.$Proxy23.<clinit>(Unknown Source)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:739)
        at org.apache.commons.configuration2.builder.fluent.Parameters.createParametersProxy(Parameters.java:294)
        at org.apache.commons.configuration2.builder.fluent.Parameters.fileBased(Parameters.java:185)
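A `NoSuchMethodError` like this typically means that a different version of the class than the one compiled against was loaded at runtime. As a rough sketch for confirming which jar actually supplies `PropertyUtilsBean`, something like the following could be run inside the driver. The `ClassOrigin` object and its method names are hypothetical helpers for illustration, not part of Spark or Commons; only standard JDK reflection calls are used.

```scala
// Hypothetical helper (not from the original post): reports where a class was loaded from.
object ClassOrigin {

  /** Jar or directory a class was loaded from; None for JDK bootstrap classes. */
  def locationOfClass(cls: Class[_]): Option[String] =
    for {
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource)
      url    <- Option(source.getLocation)
    } yield url.toString

  /** Same lookup by fully qualified name; None if the class cannot be found. */
  def locationOfName(className: String): Option[String] =
    try {
      // initialize = false: locate the class without running its static initializers.
      locationOfClass(Class.forName(className, false, getClass.getClassLoader))
    } catch {
      case _: ClassNotFoundException => None
    }
}
```

Printing `ClassOrigin.locationOfName("org.apache.commons.beanutils.PropertyUtilsBean")` at the start of the job should show whether the class comes from the application jar or from a jar already on the cluster classpath.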

Here is my build.sbt:

    libraryDependencies ++= Seq(
      "org.apache.commons" % "commons-configuration2" % "2.0",
      "commons-beanutils" % "commons-beanutils" % "1.9.2",
      "com.databricks" % "spark-avro_2.10" % "2.0.1",
      "com.databricks" % "spark-csv_2.10" % "1.4.0",
      "org.apache.spark" % "spark-sql_2.10" % "1.5.0" % "provided",
      "org.apache.spark" % "spark-hive_2.10" % "1.4.1" % "provided",
      "org.apache.spark" % "spark-core_2.10" % "1.4.1" % "provided",
      "com.amazonaws" % "aws-java-sdk" % "1.10.61",
      "org.apache.logging.log4j" % "log4j-api" % "2.6.2",
      "org.jasypt" % "jasypt" % "1.9.2",
      "commons-codec" % "commons-codec" % "1.8",
      "org.apache.kafka" % "kafka-clients" % "0.10.0.0",
      "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.6.3",
      "org.apache.spark" % "spark-streaming_2.10" % "1.6.3" excludeAll(ExclusionRule(organization = "commons-beanutils"))
    )

    dependencyOverrides ++= Set(
      "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4",
      "org.apache.logging.log4j" % "log4j-api" % "2.6.2",
      "org.apache.logging.log4j" % "log4j-core" % "2.6.2",
      "org.apache.commons" % "commons-configuration2" % "2.0",
      "commons-beanutils" % "commons-beanutils" % "1.9.2"
    )

    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard
      case x => MergeStrategy.first
    }

How can I make sure it uses commons-beanutils-1.9.2 rather than commons-beanutils-1.7 or commons-beanutils-core-1.8, which come in as part of hadoop-common?

Excluding the unwanted jars in the project settings worked for me:

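    ...
    .settings(assemblyExcludedJars in assembly := {
      // Everything sbt-assembly would otherwise bundle into the fat jar.
      val cp = (fullClasspath in assembly).value

      // File names of the older beanutils jars to leave out.
      val excludes = Set(
        "commons-beanutils-core-1.8.0.jar",
        "commons-beanutils-1.7.0.jar",
        "commons-beanutils-1.8.0.jar"
      )
      // Return the matching classpath entries; sbt-assembly omits them from the jar.
      cp.filter { jar => excludes.contains(jar.data.getName) }
    })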
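
Note that excluding jars only changes what ends up inside the assembly. If an older commons-beanutils is already on the cluster classpath (for example through hadoop-common, as the question suggests), it may still be picked up first. One alternative that might help in that case is shading, sketched below on the assumption that sbt-assembly 0.14.0 or later is in use. The target package `shaded.beanutils` is a placeholder name chosen for illustration, not something from the original post.

```scala
// build.sbt fragment (sketch): shading needs sbt-assembly 0.14.0 or later.
assemblyShadeRules in assembly := Seq(
  // Rename the beanutils classes in every jar of the assembly, including the
  // references inside commons-configuration2, so they cannot collide with an
  // older copy on the cluster classpath.
  // "shaded.beanutils" is a placeholder package name (assumption).
  ShadeRule.rename("org.apache.commons.beanutils.**" -> "shaded.beanutils.@1").inAll
)
```

With a rule like this, the application refers only to the renamed classes, so the version on Spark's own classpath no longer matters for this library.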