Can't access sqlite db from Spark
I have the following code:
val conf = new SparkConf().setAppName("Spark Test")
val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val data = sqlContext.read.format("jdbc").options(
  Map(
    "url" -> "jdbc:sqlite:/nv/pricing/ix_tri_pi.sqlite3",
    "dbtable" -> "SELECT security_id FROM ix_tri_pi")).load()
data.foreach {
  row => println(row.getInt(1))
}
Then I try to submit it with:
spark-submit \
--class "com.novus.analytics.spark.SparkTest" \
--master "local[4]" \
/Users/smabie/workspace/analytics/analytics-spark/target/scala-2.10/analytics-spark.jar \
--conf spark.executer.extraClassPath=sqlite-jdbc-3.8.7.jar \
--conf spark.driver.extraClassPath=sqlite-jdbc-3.8.7.jar \
--driver-class-path sqlite-jdbc-3.8.7.jar \
--jars sqlite-jdbc-3.8.7.jar
But I get the following exception:
Exception in thread "main" java.sql.SQLException: No suitable driver
I am using Spark version 1.6.1, if that helps.
Thanks!
Have you tried specifying the driver class explicitly in the options?
options(
  Map(
    "url" -> "jdbc:sqlite:/nv/pricing/ix_tri_pi.sqlite3",
    "driver" -> "org.sqlite.JDBC",
    "dbtable" -> "SELECT security_id FROM ix_tri_pi"))
I had a similar problem while trying to load a PostgreSQL table.
Also, the cause may lie in class loading:
The JDBC driver class must be visible to the primordial class loader
on the client session and on all executors. This is because Java’s
DriverManager class does a security check that results in it ignoring
all drivers not visible to the primordial class loader when one goes
to open a connection. One convenient way to do this is to modify
compute_classpath.sh on all worker nodes to include your driver JARs.
http://spark.apache.org/docs/latest/sql-programming-guide.html#troubleshooting
Try defining your jar as the last argument of spark-submit.
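For illustration only, a reordered command with every spark-submit option placed before the application jar (paths and jar names copied from the question) might look like this; note that the standard property name is spark.executor.extraClassPath, not spark.executer.extraClassPath:

spark-submit \
--class "com.novus.analytics.spark.SparkTest" \
--master "local[4]" \
--conf spark.executor.extraClassPath=sqlite-jdbc-3.8.7.jar \
--conf spark.driver.extraClassPath=sqlite-jdbc-3.8.7.jar \
--driver-class-path sqlite-jdbc-3.8.7.jar \
--jars sqlite-jdbc-3.8.7.jar \
/Users/smabie/workspace/analytics/analytics-spark/target/scala-2.10/analytics-spark.jar

Anything after the application jar is passed to the application's main method as program arguments rather than interpreted by spark-submit, which is why the --conf and --jars flags in the original command never take effect.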