Missing dependencies in Apache Crunch Scala build

I'm trying to build the Apache Crunch source code on my CentOS 7 machine, but when I run mvn package, the crunch-spark project fails with the following error:

[ERROR] /home/bwatson/programming/git/crunch/crunch-spark/src/it/scala/org/apache/crunch/scrunch/spark/PageRankClassTest.scala:71: error: bad symbolic reference. A signature in PTypeH.class refers to term protobuf
[ERROR] in package com.google which is not available.
[ERROR] It may be completely missing from the current classpath, or the version on
[ERROR] the classpath might be incompatible with the version used when compiling PTypeH.class.
[ERROR]       .map(line => { val urls = line.split("\t"); (urls(0), urls(1)) })
[ERROR]           ^

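Other SO questions about similar errors (here and here) seem to involve PATH or version problems. I've been tinkering with both, but I can't seem to resolve the error. For completeness: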

[bwatson@ben-pc crunch]$ scala -version
Scala code runner version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL

[bwatson@ben-pc crunch]$ java -version
java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)

[bwatson@ben-pc crunch]$ mvn -version
Apache Maven 3.0.5 (Red Hat 3.0.5-16)
Maven home: /usr/share/maven
Java version: 1.8.0_31, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_31/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-123.20.1.el7.x86_64", arch: "amd64", family: "unix"

Any suggestions? I'm not sure where Scala looks for its dependencies, but I assumed Maven would take care of that.

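Unfortunately, different versions of Scala are binary incompatible. Currently, Apache Spark uses Scala 2.10.4 by default, not Scala 2.11. Apache Scrunch depends on Spark. Maven knows nothing about this, so it can't help. Scrunch needs some modifications before it will compile against Scala 2.11 / JDK 1.8. I'm currently working on this, but I don't have a solution yet. However, if I compile with Scala 2.10.4 on JDK 1.8 instead of with Scala 2.11, I get the same error message you reported, so I don't think your build is doing what you intended. The error seems to come from the Protobuf compiler or jar, but I don't know why.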
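To make the binary-incompatibility point concrete: in the Scala 2.x line, compatibility is generally tracked by the first two version components, so 2.10.4 and 2.11.5 fall into different series. A minimal shell sketch of that comparison follows. The helper name scala_binary_version is hypothetical and not part of any Scala or Maven tooling, and the two version strings are simply the ones quoted in this thread.

```shell
# Hypothetical helper: reduce a full Scala version to its binary series,
# e.g. "2.11.5" -> "2.11". Not part of Scala or Maven.
scala_binary_version() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

installed="2.11.5"      # reported by `scala -version` in the question
spark_default="2.10.4"  # Spark default cited in this answer

a="$(scala_binary_version "$installed")"
b="$(scala_binary_version "$spark_default")"

if [ "$a" != "$b" ]; then
  echo "binary series differ: $a vs $b"
fi
```

Artifacts compiled for one series should not be assumed to load under the other, which is why a version mismatch is a plausible first suspect for a "bad symbolic reference" error.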

I'll report back once I've solved it myself!

It turns out that Crunch's official documentation was missing a Maven parameter. The problem was resolved by building with:

mvn package -Dcrunch.platform=2