Why Maven assembly works when SBT assembly finds conflicts

The title could also be:
What is the difference between the Maven and SBT assembly plugins?

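I ran into this problem while migrating a project from Maven to SBT.

To describe the problem, I created a sample project whose behaviour differs depending on the build tool.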

https://github.com/atais/mvn-sbt-assembly


The only dependencies are (sbt style):

"com.netflix.astyanax" % "astyanax-cassandra" % "3.9.0",
"org.apache.cassandra" % "cassandra-all" % "3.4",

What I do not understand is why mvn package successfully creates a fat jar, while sbt assembly reports conflicts:

[error] 39 errors were encountered during merge
[error] java.lang.RuntimeException: deduplicate: different file contents found in the following:
[error] /home/siatkowskim/.ivy2/cache/org.slf4j/jcl-over-slf4j/jars/jcl-over-slf4j-1.7.7.jar:org/apache/commons/logging/<some classes>
[error] /home/siatkowskim/.ivy2/cache/commons-logging/commons-logging/jars/commons-logging-1.1.1.jar:org/apache/commons/logging/<some classes>
...
[error] /home/siatkowskim/.ivy2/cache/com.github.stephenc.high-scale-lib/high-scale-lib/jars/high-scale-lib-1.1.2.jar:org/cliffc/high_scale_lib/<some classes>
[error] /home/siatkowskim/.ivy2/cache/com.boundary/high-scale-lib/jars/high-scale-lib-1.0.6.jar:org/cliffc/high_scale_lib/<some classes>
...

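Looking at your build.sbt, I can see that there is no merge strategy in your build. There is also a rogue "," after "org.apache.cassandra" % "cassandra-all" % "3.4" in the libraryDependencies key of the build.sbt in the project you linked above.

A merge strategy is needed to handle all the duplicate files, jars and versions. Here is an example of how to set one up in your build.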

assemblyMergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf")       => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$")   => MergeStrategy.discard
  case "reference.conf"                                 => MergeStrategy.concat
  case x: String if x.contains("UnusedStubClass.class") => MergeStrategy.first
  case _                                                => MergeStrategy.first
}
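If a blanket `case _ => MergeStrategy.first` feels too coarse, a narrower variant is possible. The sketch below assumes that the only conflicts are the two package trees visible in the error output (`org/apache/commons/logging` and `org/cliffc/high_scale_lib`); the remaining errors are elided in the log, so the list is probably incomplete. Everything else is delegated to sbt-assembly's default strategy. `PathList` is the path extractor that ships with sbt-assembly, and the `in assembly` syntax matches the snippet above.

```scala
// build.sbt fragment (sketch): resolve only the known conflicts explicitly.
assemblyMergeStrategy in assembly := {
  // jcl-over-slf4j and commons-logging both ship these classes; keep the first copy found.
  case PathList("org", "apache", "commons", "logging", _*) => MergeStrategy.first
  // Two differently named high-scale-lib artifacts ship the same package.
  case PathList("org", "cliffc", "high_scale_lib", _*)     => MergeStrategy.first
  // Anything else goes to the plugin's default, which still fails on unexpected duplicates.
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

With this shape, a new and unexpected duplicate still breaks the build instead of being silently resolved.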

If your project has no subprojects, you can try writing a simple build file. You could try the following build.sbt.

name := "assembly-test"

version := "0.1"

scalaVersion := "2.12.4"

libraryDependencies ++= Seq(
      "com.netflix.astyanax" % "astyanax-cassandra" % "3.9.0",
      "org.apache.cassandra" % "cassandra-all" % "3.4"
)

mainClass in assembly := Some("com.atais.cassandra.MainClass")

assemblyMergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf")       => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$")   => MergeStrategy.discard
  case "reference.conf"                                 => MergeStrategy.concat
  case x: String if x.contains("UnusedStubClass.class") => MergeStrategy.first
  case _                                                => MergeStrategy.first
}

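It seems that maven-assembly-plugin resolves conflicts in a way equivalent to MergeStrategy.first (I am not sure it is exactly equivalent) when jar-with-dependencies is used: since it only has one phase, it just picks one of the files in an unspecified way: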

If two or more elements (e.g., file, fileSet) select different sources for the same file for archiving, only one of the source files will be archived.

As per version 2.5.2 of the assembly plugin, the first phase to add the file to the archive "wins". The filtering is done solely based on name inside the archive, so the same source file can be added under different output names. The order of the phases is as follows: 1) FileItem 2) FileSets 3) ModuleSet 4) DepenedencySet and 5) Repository elements.

Elements of the same type will be processed in the order they appear in the descriptors. If you need to "overwrite" a file included by a previous set, the only way to do this is to exclude that file from the earlier set.

Note that this behaviour was slightly different in earlier versions of the assembly plugin.

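Even if one of the conflicting files works for all your dependencies (which is not necessarily the case), Maven does not know which one it is, so you can silently get a wrong result. Silently at build time, I mean; at runtime you can get e.g. an AbstractMethodError, or, again, just a wrong result.

You can influence which file is chosen by writing your own descriptor, but that is very verbose; there is no equivalent of just writing MergeStrategy.first/last (and concat/discard are not possible at all).

The SBT plugin could do the same thing and default to some strategy when you do not specify one, but then, well, you could silently get a wrong result.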

Expanding on the above:

I have also updated my project with a detailed explanation, so you might want to take a look.

Following the advice

You can verify it for this case by unpacking the jar Maven produces and the dependency jars in SBT error message, then checking which .class file Maven used.

I compared the fat jars produced by Maven and sbt, using:
  • MergeStrategy.first, which showed some extra files
  • MergeStrategy.last, which showed binary differences and extra files
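The check suggested in the quoted advice can also be scripted instead of unpacking the jars by hand. The following is a minimal sketch using only the JDK's `java.util.zip` and `java.security` classes; the object name `JarCompare` and its methods are hypothetical helpers introduced here, not part of either plugin. For every entry that occurs in more than one dependency jar, it reports which of those jars holds a copy byte-identical to the one in the fat jar.

```scala
import java.io.{File, InputStream}
import java.security.MessageDigest
import java.util.zip.ZipFile

object JarCompare {
  /** SHA-256 of a stream, as a hex string. */
  private def sha256(in: InputStream): String = {
    val md  = MessageDigest.getInstance("SHA-256")
    val buf = new Array[Byte](8192)
    var n   = in.read(buf)
    while (n != -1) { md.update(buf, 0, n); n = in.read(buf) }
    md.digest().map(b => f"${b & 0xff}%02x").mkString
  }

  /** Maps every non-directory entry of a jar to the digest of its contents. */
  def entryDigests(jar: File): Map[String, String] = {
    val zip = new ZipFile(jar)
    try {
      val entries = zip.entries()
      val out     = Map.newBuilder[String, String]
      while (entries.hasMoreElements) {
        val e = entries.nextElement()
        if (!e.isDirectory) {
          val in = zip.getInputStream(e)
          try { out += (e.getName -> sha256(in)) } finally { in.close() }
        }
      }
      out.result()
    } finally zip.close()
  }

  /** For each fat-jar entry provided by more than one dependency jar,
    * lists the dependency jars whose copy matches the fat jar's copy. */
  def winners(fatJar: File, deps: Seq[File]): Map[String, Seq[String]] = {
    val fat       = entryDigests(fatJar)
    val depDigest = deps.map(d => d.getName -> entryDigests(d))
    fat.collect {
      case (entry, digest) if depDigest.count(_._2.contains(entry)) > 1 =>
        entry -> depDigest.collect {
          case (name, ds) if ds.get(entry).contains(digest) => name
        }
    }
  }
}
```

Calling `JarCompare.winners` with the Maven-built fat jar and the two jars named in one of the sbt error lines (for example the jcl-over-slf4j and commons-logging jars from the Ivy cache) would show, per conflicting class, which jar Maven took it from. An empty list for an entry means neither copy matches.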

I then took the next step and checked the fat jars against the dependencies in which sbt found conflicts.

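Conclusion

maven-assembly-plugin resolves conflicts at the jar level. When it finds any conflict, it picks the first jar and simply ignores all the content of the others.

sbt-assembly mixes all the class files together and resolves conflicts locally, file by file.

My theory is that if the fat jar you built with maven-assembly-plugin works, you can specify MergeStrategy.first for all conflicts in sbt. The only difference is that the jar produced by sbt will be bigger, because it contains the extra classes that Maven ignored.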
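To make the difference concrete, here is a toy model of the two behaviours as described in this conclusion. It is only an illustration of the claim, not the actual implementation of either plugin: jars are modelled as plain maps from entry path to contents, and the names `MergeModels`, `fileLevelFirst` and `jarLevelFirst` are made up for this sketch.

```scala
object MergeModels {
  // A jar modelled as entry path -> contents (a String stands in for the bytes).
  type Jar = Map[String, String]

  /** File-level, first wins: every path is kept, taken from the first jar that has it. */
  def fileLevelFirst(jars: Seq[Jar]): Jar =
    jars.foldLeft(Map.empty[String, String]) { (acc, jar) =>
      jar ++ acc // entries already in acc override the later jar's entries
    }

  /** Jar-level, first wins: a jar that collides with anything already merged is skipped entirely. */
  def jarLevelFirst(jars: Seq[Jar]): Jar =
    jars.foldLeft(Map.empty[String, String]) { (acc, jar) =>
      if (jar.keys.exists(acc.contains)) acc else acc ++ jar
    }
}
```

For two jars that share a single path, `fileLevelFirst` keeps the first jar's copy of that path plus the non-conflicting entries of the second jar, whereas `jarLevelFirst` returns only the first jar. Under this model the file-level result is a superset that agrees on all shared entries, which is the "bigger jar with extra classes" described above.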