Why does "spark-shell --jars" with GraphFrames jar give "error: missing or invalid dependency detected while loading class file 'Logging.class'"?

I ran the command spark-shell --jars /home/krishnamahi/graphframes-0.4.0-spark2.1-s_2.11.jar and it gave me this error:

error: missing or invalid dependency detected while loading class file 'Logging.class'. Could not access term typesafe in package com, because it (or its dependencies) are missing. Check your build definition for missing or conflicting dependencies. (Re-run with -Ylog-classpath to see the problematic classpath.) A full rebuild may help if 'Logging.class' was compiled against an incompatible version of com.

error: missing or invalid dependency detected while loading class file 'Logging.class'. Could not access term scalalogging in value com.typesafe, because it (or its dependencies) are missing. Check your build definition for missing or conflicting dependencies. (Re-run with -Ylog-classpath to see the problematic classpath.) A full rebuild may help if 'Logging.class' was compiled against an incompatible version of com.typesafe.

error: missing or invalid dependency detected while loading class file 'Logging.class'. Could not access type LazyLogging in value com.slf4j, because it (or its dependencies) are missing. Check your build definition for missing or conflicting dependencies. (Re-run with -Ylog-classpath to see the problematic classpath.) A full rebuild may help if 'Logging.class' was compiled against an incompatible version of com.slf4j.

I am using Spark 2.1.1, Scala 2.11.8, JDK 1.8.0_131, CentOS 7 64-bit, and Hadoop 2.8.0. Can anyone tell me what additional options I need to pass for the program to run properly? Thanks in advance.
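The errors show that the GraphFrames jar references classes from com.typesafe.scalalogging that are not on the classpath: --jars only ships the jars you list and does no dependency resolution. One workaround is to list every transitive dependency jar yourself. A minimal sketch, assuming the scala-logging jars have already been downloaded next to the GraphFrames jar (the file names and version numbers below are illustrative, not verified):

$SPARK_HOME/bin/spark-shell --jars /home/krishnamahi/graphframes-0.4.0-spark2.1-s_2.11.jar,/home/krishnamahi/scala-logging-api_2.11-2.1.2.jar,/home/krishnamahi/scala-logging-slf4j_2.11-2.1.2.jar

This quickly becomes tedious, which is exactly what the --packages approach recommended below avoids.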

I installed vanilla Hadoop with all the components (Hive, Pig, Spark) at their latest versions, and then it worked for me. I use CentOS 7. The order in which I installed the Hadoop components was:

  1. Anaconda3/Python3 (because Spark 2.x does not support Python 2)
  2. Hadoop
  3. Hive
  4. HBase
  5. Spark

All the components should be installed in one go, in the same terminal session. Once the Spark installation is done, reboot the system.
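The "same terminal session" advice boils down to making sure every component is installed against the same environment. A hedged sketch of the kind of exports this usually involves (the install paths below are placeholders, not taken from the original answer):

# Hypothetical install locations; adjust to wherever each component was unpacked
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HADOOP_HOME=/opt/hadoop
export HIVE_HOME=/opt/hive
export HBASE_HOME=/opt/hbase
export SPARK_HOME=/opt/spark
# Make all component binaries visible in the same shell
export PATH=$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$SPARK_HOME/bin

Putting these in ~/.bashrc (and rebooting, as the answer suggests) keeps every new shell consistent with the one used for installation.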

If you want to play with GraphFrames, use spark-shell's --packages command-line option instead.

--packages Comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. Will search the local maven repo, then maven central and any additional remote repositories given by --repositories. The format for the coordinates should be groupId:artifactId:version.

For graphframes-0.4.0-spark2.1-s_2.11.jar that is:

$SPARK_HOME/bin/spark-shell --packages graphframes:graphframes:0.4.0-spark2.1-s_2.11

I copied this verbatim from the How to section of the GraphFrames project.

That way you don't have to hunt down all the (transitive) dependencies of the GraphFrames library yourself, since Spark resolves them for you automatically.
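Once the shell starts with the package resolved, a quick way to confirm everything is on the classpath is to build a tiny graph. A minimal sketch to run inside spark-shell (the vertex and edge data here is made up purely for illustration):

import org.graphframes.GraphFrame

// spark is the SparkSession that spark-shell predefines
val vertices = spark.createDataFrame(Seq(
  ("a", "Alice"),
  ("b", "Bob")
)).toDF("id", "name")

val edges = spark.createDataFrame(Seq(
  ("a", "b", "friend")
)).toDF("src", "dst", "relationship")

// If the import and this call succeed, the jar and its dependencies resolved correctly
val g = GraphFrame(vertices, edges)
g.inDegrees.show()   // should print vertex b with inDegree 1

If the Logging.class error was the only problem, the import succeeds and no further options are needed.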