Exception when using the saveToPhoenix method to load/save an RDD on HBase
I want to use the apache-phoenix framework. The problem is that I keep getting an exception telling me the class HBaseConfiguration cannot be found.

Here is the code I am trying to use:
import org.apache.spark.SparkContext
import org.apache.spark.sql._
import org.apache.phoenix.spark._

// Load INPUT_TABLE
object MainTest2 extends App {
  val sc = new SparkContext("local", "phoenix-test")
  val sqlContext = new SQLContext(sc)
  val df = sqlContext.load("org.apache.phoenix.spark",
    Map("table" -> "INPUT_TABLE", "zkUrl" -> "localhost:3888"))
}
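As an aside, `SQLContext.load` is deprecated in Spark 2.x. A sketch of the equivalent call through the non-deprecated `DataFrameReader` API, using the same table name and zkUrl as above (untested here, since it needs a live Phoenix/HBase cluster):

```scala
import org.apache.spark.sql.SparkSession

// Same read as sqlContext.load(...), expressed with spark.read
object MainTest2 extends App {
  val spark = SparkSession.builder()
    .master("local")
    .appName("phoenix-test")
    .getOrCreate()

  val df = spark.read
    .format("org.apache.phoenix.spark")          // Phoenix data source
    .option("table", "INPUT_TABLE")              // Phoenix table to load
    .option("zkUrl", "localhost:3888")           // ZooKeeper quorum URL
    .load()
}
```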
Here is the SBT build file I am using:
name := "spark-to-hbase"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-mapreduce-client-core" % "2.3.0",
  "org.apache.phoenix" % "phoenix-core" % "4.11.0-HBase-1.3",
  "org.apache.spark" % "spark-core_2.11" % "2.1.1",
  "org.apache.spark" % "spark-sql_2.11" % "2.1.1",
  "org.apache.phoenix" % "phoenix-spark" % "4.11.0-HBase-1.3"
)
And here is the exception:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
    at org.apache.phoenix.query.ConfigurationFactory$ConfigurationFactoryImpl.call(ConfigurationFactory.java:49)
    at org.apache.phoenix.query.ConfigurationFactory$ConfigurationFactoryImpl.call(ConfigurationFactory.java:46)
    at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
    at org.apache.phoenix.util.PhoenixContextExecutor.callWithoutPropagation(PhoenixContextExecutor.java:91)
    at org.apache.phoenix.query.ConfigurationFactory$ConfigurationFactoryImpl.getConfiguration(ConfigurationFactory.java:46)
    at org.apache.phoenix.jdbc.PhoenixDriver.initializeConnectionCache(PhoenixDriver.java:151)
    at org.apache.phoenix.jdbc.PhoenixDriver.<init>(PhoenixDriver.java:142)
    at org.apache.phoenix.jdbc.PhoenixDriver.<clinit>(PhoenixDriver.java:69)
    at org.apache.phoenix.spark.PhoenixRDD.<init>(PhoenixRDD.scala:43)
    at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:52)
    at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:40)
    at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:389)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
    at org.apache.spark.sql.SQLContext.load(SQLContext.scala:965)
    at MainTest2$.delayedEndpoint$MainTest2(MainTest2.scala:9)
    at MainTest2$delayedInit$body.apply(MainTest2.scala:6)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main.apply(App.scala:76)
    at scala.App$$anonfun$main.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at MainTest2$.main(MainTest2.scala:6)
    at MainTest2.main(MainTest2.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 26 more
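A `NoClassDefFoundError` like this means the class was absent from the runtime classpath. Before touching the build, a quick way to confirm this (and, once fixed, to see which jar supplies the class) is a small diagnostic like the following. This helper is illustrative and not part of the original question:

```scala
// Check whether a class is visible on the runtime classpath and,
// if so, report where it was loaded from.
object ClasspathCheck {
  def locate(name: String): String =
    try {
      val cls = Class.forName(name)
      // getCodeSource is null for classes from the bootstrap classpath
      Option(cls.getProtectionDomain.getCodeSource)
        .map(_.getLocation.toString)
        .getOrElse("<bootstrap classpath>")
    } catch {
      case _: ClassNotFoundException => "NOT FOUND"
    }

  def main(args: Array[String]): Unit =
    println(locate("org.apache.hadoop.hbase.HBaseConfiguration"))
}
```

Running this inside the failing application prints `NOT FOUND` until the missing dependency is added.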
I have already tried changing HADOOP_CLASSPATH in hadoop-env.sh, as suggested in a previous post.

What can I do to overcome this problem?
I found a solution to my problem. As the exception says, the compiler could not find the class HBaseConfiguration. HBaseConfiguration is part of the org.apache.hadoop.hbase package, so it has to be available at compile time. I noticed that the HBaseConfiguration class is not in the org.apache.hadoop artifacts as I had assumed. For the HBase 1.3.1 installation on my machine, I found the class in the hbase-common-1.3.1 jar in my HBASE_HOME/lib folder.
I then added this dependency to my build.sbt:
"org.apache.hbase" % "hbase-common" % "1.3.1"
The exception is gone.
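For reference, the complete build.sbt with the fix applied, using the same versions as in the question (sketch only; the hbase-common version should match your installed HBase):

```scala
name := "spark-to-hbase"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.hadoop"  % "hadoop-mapreduce-client-core" % "2.3.0",
  "org.apache.hbase"   % "hbase-common"                 % "1.3.1", // provides HBaseConfiguration
  "org.apache.phoenix" % "phoenix-core"                 % "4.11.0-HBase-1.3",
  "org.apache.spark"   % "spark-core_2.11"              % "2.1.1",
  "org.apache.spark"   % "spark-sql_2.11"               % "2.1.1",
  "org.apache.phoenix" % "phoenix-spark"                % "4.11.0-HBase-1.3"
)
```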