GraphX - Class file needed by Graph is missing
I'm new to Scala/Spark. I'm trying to compile and run the example GraphX code.
Original file link: PageRank
My code, lightly edited to avoid problems:
// scalastyle:off println
package org.apache.spark.examples.graphx

// $example on$
import org.apache.spark.graphx.GraphLoader
// $example off$
import org.apache.spark.sql.SparkSession

/**
 * A PageRank example on social network dataset
 * Run with
 * {{{
 * bin/run-example graphx.PageRankExample
 * }}}
 */
object PageRankExampl {
  def main(args: Array[String]): Unit = {
    // Creates a SparkSession.
    val spark = SparkSession
      .builder
      .appName("PageRankExampl")
      .getOrCreate()
    val sc = spark.sparkContext

    // $example on$
    // Load the edges as a graph
    val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")
    // Run PageRank
    val ranks = graph.pageRank(0.0001).vertices
    // Join the ranks with the usernames
    val users = sc.textFile("data/graphx/users.txt").map { line =>
      val fields = line.split(",")
      (fields(0).toLong, fields(1))
    }
    val ranksByUsername = users.join(ranks).map {
      case (id, (username, rank)) => (username, rank)
    }
    // Print the result
    println(ranksByUsername.collect().mkString("\n"))
    // $example off$

    spark.stop()
  }
}
// scalastyle:on println
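For reference, GraphLoader.edgeListFile expects each line of followers.txt to be a whitespace-separated pair of numeric vertex IDs, e.g. 2 1 for an edge from vertex 2 to vertex 1, and the map over users.txt assumes comma-separated id,username lines, e.g. 1,BarackObama. Both sample files ship with the Spark distribution under data/graphx/.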
Build file:
name := "hello"
version := "1.0"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.2.1" % "provided",
"org.apache.spark" % "spark-sql_2.11" % "2.2.1" % "provided",
"org.apache.spark" % "spark-graphx_2.11" % "2.2.1" % "provided"
)
The error I get:
Starting sbt: invoke with -help for other options
[info] Set current project to hello (in build file:/usr/local/spark-2.2.1-bin-hadoop2.7/nofel_test/)
> run
[info] Compiling 1 Scala source to /usr/local/spark-2.2.1-bin-hadoop2.7/nofel_test/target/scala-2.9.1/classes...
[error] class file needed by Graph is missing.
[error] reference type ClassTag of package reflect refers to nonexisting symbol.
[error] one error found
[error] {file:/usr/local/spark-2.2.1-bin-hadoop2.7/nofel_test/}default-b08e19/compile:compile: Compilation failed
[error] Total time: 2 s, completed Mar 26, 2018 11:14:28 PM
I added one line (scalaVersion) to the build file and it worked. If anyone knows why this line is needed, please let me know. (A likely explanation: the log above shows sbt compiling into target/scala-2.9.1, meaning that without an explicit scalaVersion this sbt launcher falls back to its default Scala 2.9.1, which is binary-incompatible with the _2.11 Spark artifacts; that is exactly the kind of mismatch that produces the missing class file / ClassTag errors.)
name := "PageRank"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.2.1" % "provided",
"org.apache.spark" % "spark-sql_2.11" % "2.2.1" % "provided",
"org.apache.spark" % "spark-graphx_2.11" % "2.2.1" % "provided"
)
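As a side note, a common way to rule out this kind of mismatch entirely is sbt's %% operator, which derives the _2.11 suffix from scalaVersion automatically, so the compiler version and the artifact suffix can never silently disagree. A minimal sketch of the equivalent build (same versions assumed):

name := "PageRank"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.1" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.2.1" % "provided",
  "org.apache.spark" %% "spark-graphx" % "2.2.1" % "provided"
)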