Run Spark SQL on CDH 5.4.1: NoClassDefFoundError
I set up CDH 5.4.1 to run some Spark SQL tests on Spark. Spark itself works well, but Spark SQL has some problems.
I start pyspark as follows:
/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/bin/pyspark --master yarn-client
I want to select from a table in Hive with Spark SQL:
results = sqlCtx.sql("SELECT * FROM my_table").collect()
It prints the following error log: http://pastebin.com/u98psBG8
> Welcome to
> ____ __
> / __/__ ___ _____/ /__
> _\ \/ _ \/ _ `/ __/ '_/ /__ / .__/\_,_/_/ /_/\_\ version 1.3.0
> /_/
>
> Using Python version 2.7.6 (default, Mar 22 2014 22:59:56)
> SparkContext available as sc, HiveContext available as sqlCtx.
> >>> results = sqlCtx.sql("SELECT * FROM vt_call_histories").collect() 15/05/20 06:57:07 INFO HiveMetaStore: 0: Opening raw store with
> implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 15/05/20 06:57:07 INFO ObjectStore: ObjectStore, initialize called
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is
> already registered. Ensure you dont have multiple JAR versions of the
> same plugin in the classpath. The URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle)
> "org.datanucleus.api.jdo" is already registered. Ensure you dont have
> multiple JAR versions of the same plugin in the classpath. The URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.6.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.1.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle)
> "org.datanucleus.store.rdbms" is already registered. Ensure you dont
> have multiple JAR versions of the same plugin in the classpath. The
> URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.1.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.9.jar."
> 15/05/20 06:57:08 INFO Persistence: Property datanucleus.cache.level2
> unknown - will be ignored 15/05/20 06:57:08 INFO Persistence: Property
> hive.metastore.integral.jdo.pushdown unknown - will be ignored
> 15/05/20 06:57:08 WARN HiveMetaStore: Retrying creating default
> database after error: Error creating transactional connection factory
> javax.jdo.JDOFatalInternalException: Error creating transactional
> connection factory at
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> javax.jdo.JDOHelper.run(JDOHelper.java:1965) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:606)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
> at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864) at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259) at
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79) at
> py4j.GatewayConnection.run(GatewayConnection.java:207) at
> java.lang.Thread.run(Thread.java:745) NestedThrowablesStackTrace:
> java.lang.reflect.InvocationTargetException at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
> at
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
> at
> org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
> at
> org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
> at
> org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
> at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> javax.jdo.JDOHelper.run(JDOHelper.java:1965) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:606)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
> at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864) at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259) at
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79) at
> py4j.GatewayConnection.run(GatewayConnection.java:207) at
> java.lang.Thread.run(Thread.java:745) Caused by:
> java.lang.ExceptionInInitializerError at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:374) at
> org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:47)
> at
> org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54)
> at
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
> at
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
> at
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:85)
> ... 73 more Caused by: java.lang.SecurityException: sealing violation:
> package org.apache.derby.impl.services.locks is sealed at
> java.net.URLClassLoader.getAndVerifyPackage(URLClassLoader.java:388)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:417) at
> java.net.URLClassLoader.access0(URLClassLoader.java:71) at
> java.net.URLClassLoader.run(URLClassLoader.java:361) at
> java.net.URLClassLoader.run(URLClassLoader.java:355) at
> java.security.AccessController.doPrivileged(Native Method) at
> java.net.URLClassLoader.findClass(URLClassLoader.java:354) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:425) at
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:358) at
> java.lang.ClassLoader.defineClass1(Native Method) at
> java.lang.ClassLoader.defineClass(ClassLoader.java:800) at
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:449) at
> java.net.URLClassLoader.access0(URLClassLoader.java:71) at
> java.net.URLClassLoader.run(URLClassLoader.java:361) at
> java.net.URLClassLoader.run(URLClassLoader.java:355) at
> java.security.AccessController.doPrivileged(Native Method) at
> java.net.URLClassLoader.findClass(URLClassLoader.java:354) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:425) at
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:358) at
> java.lang.Class.forName0(Native Method) at
> java.lang.Class.forName(Class.java:190) at
> org.apache.derby.impl.services.monitor.BaseMonitor.getImplementations(Unknown
> Source) at
> org.apache.derby.impl.services.monitor.BaseMonitor.getDefaultImplementations(Unknown
> Source) at
> org.apache.derby.impl.services.monitor.BaseMonitor.runWithState(Unknown
> Source) at
> org.apache.derby.impl.services.monitor.FileMonitor.<init>(Unknown
> Source) at
> org.apache.derby.iapi.services.monitor.Monitor.startMonitor(Unknown
> Source) at org.apache.derby.iapi.jdbc.JDBCBoot.boot(Unknown Source)
> at org.apache.derby.jdbc.EmbeddedDriver.boot(Unknown Source) at
> org.apache.derby.jdbc.EmbeddedDriver.<clinit>(Unknown Source) ... 83
> more 15/05/20 06:57:08 INFO HiveMetaStore: 0: Opening raw store with
> implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 15/05/20 06:57:08 INFO ObjectStore: ObjectStore, initialize called
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is
> already registered. Ensure you dont have multiple JAR versions of the
> same plugin in the classpath. The URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle)
> "org.datanucleus.api.jdo" is already registered. Ensure you dont have
> multiple JAR versions of the same plugin in the classpath. The URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.6.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.1.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle)
> "org.datanucleus.store.rdbms" is already registered. Ensure you dont
> have multiple JAR versions of the same plugin in the classpath. The
> URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.1.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.9.jar."
> 15/05/20 06:57:08 INFO Persistence: Property datanucleus.cache.level2
> unknown - will be ignored 15/05/20 06:57:08 INFO Persistence: Property
> hive.metastore.integral.jdo.pushdown unknown - will be ignored
> Traceback (most recent call last): File "<stdin>", line 1, in
> <module> File
> "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/pyspark/sql/context.py",
> line 528, in sql
> return DataFrame(self._ssql_ctx.sql(sqlQuery), self) File "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
> line 538, in __call__ File
> "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py",
> line 300, in get_return_value py4j.protocol.Py4JJavaError: An error
> occurred while calling o31.sql. : java.lang.RuntimeException:
> java.lang.RuntimeException: Unable to instantiate
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259) at
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79) at
> py4j.GatewayConnection.run(GatewayConnection.java:207) at
> java.lang.Thread.run(Thread.java:745) Caused by:
> java.lang.RuntimeException: Unable to instantiate
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1488)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864) at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> ... 16 more Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
> ... 21 more Caused by: javax.jdo.JDOFatalInternalException: Error
> creating transactional connection factory NestedThrowables:
> java.lang.reflect.InvocationTargetException at
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> javax.jdo.JDOHelper.run(JDOHelper.java:1965) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:610)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
> at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
> ... 26 more Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
> at
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
> at
> org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
> at
> org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
> at
> org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
> at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
> ... 55 more Caused by: java.lang.NoClassDefFoundError: Could not
> initialize class org.apache.derby.jdbc.EmbeddedDriver at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
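I found the answer to my problem.
After installing CDH 5.3 and enabling Spark through Cloudera Manager, follow these steps to enable Hive access:
- Make sure Hive works from the Hive CLI and over JDBC through HiveServer2 (it should work by default).
- Copy hive-site.xml to your SPARK_HOME/conf folder.
- Add the Hive libraries to the Spark classpath: edit the SPARK_HOME/bin/compute-classpath.sh file and add the following:
CLASSPATH="$CLASSPATH:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hive/lib/*" (a CDH-specific example; use your own Hive lib location).
- Restart the Spark cluster so that everything takes effect.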
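Before restarting, it can help to confirm that the copy step actually landed. Below is a minimal sketch, not part of the original answer: `check_hive_conf` is a hypothetical helper name, and it only assumes the layout described above (a `conf` directory under SPARK_HOME that should contain hive-site.xml).

```python
import os


def check_hive_conf(spark_home):
    """Return a list of problems found under spark_home (empty if none).

    Hypothetical helper: it only checks that <spark_home>/conf exists
    and that it contains a hive-site.xml file.
    """
    problems = []
    conf_dir = os.path.join(spark_home, "conf")
    hive_site = os.path.join(conf_dir, "hive-site.xml")
    if not os.path.isdir(conf_dir):
        problems.append("missing conf directory: " + conf_dir)
    elif not os.path.isfile(hive_site):
        problems.append("missing hive-site.xml in: " + conf_dir)
    return problems
```

You would call it with your own Spark location, for example `check_hive_conf(os.environ["SPARK_HOME"])`, and an empty list means the file is in place.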
I ran into this problem and found that the real issue was the Hive libraries conflicting with the Spark libraries. If you look at the log above:
15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is
already registered. Ensure you dont have multiple JAR versions of the
same plugin in the classpath. The URL
"file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar.
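This is not a harmless warning. It is the crux of the problem. I already had the Hive jars in my CLASSPATH. I removed them, started Spark, and everything went fine. So try that first. See https://issues.apache.org/jira/browse/HIVE-9198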
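To spot this kind of conflict without reading the whole log, you can group the jar files on your classpath by artifact name and look for any that appear in more than one version. This is only a rough sketch: `find_duplicate_jars` is a hypothetical helper, and it assumes jars follow the `name-version.jar` naming seen in the log (for example `datanucleus-core-3.2.10.jar`).

```python
import re
from collections import defaultdict

# Splits "datanucleus-core-3.2.10.jar" into name "datanucleus-core"
# and version "3.2.10". Assumes the version starts with a digit.
_JAR_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_duplicate_jars(jar_paths):
    """Map artifact name -> sorted version strings, for artifacts
    that appear in two or more distinct versions."""
    versions = defaultdict(set)
    for path in jar_paths:
        filename = path.rsplit("/", 1)[-1]
        match = _JAR_RE.match(filename)
        if match:
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(v) for name, v in versions.items() if len(v) > 1}
```

Feeding it the six datanucleus paths from the warnings above would flag `datanucleus-core`, `datanucleus-api-jdo` and `datanucleus-rdbms`. Note that the versions are sorted as plain strings, not as version numbers.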
If you get an error like Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
then you need the information about copying hive-site.xml.
I ended up with a hive-site.xml like the following:
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>blah</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.scratch.dir.permission</name>
<value>733</value>
<description>blah</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/tmp/<yourusername>/operation_logs</value>
<description>blah</description>
</property>
I removed everything else from hive-site.xml. See java.net.URISyntaxException when starting HIVE
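If you are not sure which properties still carry the `${system:...}` placeholders that trigger that error, a quick scan of the file can list them. The following is a sketch under assumptions: `unresolved_properties` is a hypothetical helper, and it expects the full file contents, with the `<property>` elements wrapped in a single root element such as `<configuration>`.

```python
import xml.etree.ElementTree as ET


def unresolved_properties(hive_site_xml):
    """Return the names of properties whose value still contains
    a '${system:' placeholder, in document order."""
    root = ET.fromstring(hive_site_xml)
    flagged = []
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        value = prop.findtext("value", default="")
        if "${system:" in value:
            flagged.append(name)
    return flagged
```

Any name it returns is a candidate for replacing the value with a fixed path, as in the properties shown above.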
You can also set the Hive configuration when running spark-submit.
This works fine for me on CDH 5.4.5:
spark-submit \
--class com.xxx.main.TagHive \
--master yarn-client \
--name HiveTest \
--num-executors 3 \
--driver-memory 500m \
--executor-memory 500m \
--executor-cores 1 \
--conf "spark.executor.extraClassPath=/etc/hive/conf:/opt/cloudera/parcels/CDH/lib/hive/lib/*.jar" \
--files ./log4j-spark.properties \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \
dmp-tag-etl-0.0.1-SNAPSHOT-jar-with-dependencies.jar
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:606)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
> at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864) at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259) at
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79) at
> py4j.GatewayConnection.run(GatewayConnection.java:207) at
> java.lang.Thread.run(Thread.java:745) NestedThrowablesStackTrace:
> java.lang.reflect.InvocationTargetException at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
> at
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
> at
> org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
> at
> org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
> at
> org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
> at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> javax.jdo.JDOHelper.run(JDOHelper.java:1965) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:606)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
> at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864) at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259) at
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79) at
> py4j.GatewayConnection.run(GatewayConnection.java:207) at
> java.lang.Thread.run(Thread.java:745) Caused by:
> java.lang.ExceptionInInitializerError at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:374) at
> org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:47)
> at
> org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54)
> at
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
> at
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
> at
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:85)
> ... 73 more Caused by: java.lang.SecurityException: sealing violation:
> package org.apache.derby.impl.services.locks is sealed at
> java.net.URLClassLoader.getAndVerifyPackage(URLClassLoader.java:388)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:417) at
> java.net.URLClassLoader.access0(URLClassLoader.java:71) at
> java.net.URLClassLoader.run(URLClassLoader.java:361) at
> java.net.URLClassLoader.run(URLClassLoader.java:355) at
> java.security.AccessController.doPrivileged(Native Method) at
> java.net.URLClassLoader.findClass(URLClassLoader.java:354) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:425) at
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:358) at
> java.lang.ClassLoader.defineClass1(Native Method) at
> java.lang.ClassLoader.defineClass(ClassLoader.java:800) at
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:449) at
> java.net.URLClassLoader.access0(URLClassLoader.java:71) at
> java.net.URLClassLoader.run(URLClassLoader.java:361) at
> java.net.URLClassLoader.run(URLClassLoader.java:355) at
> java.security.AccessController.doPrivileged(Native Method) at
> java.net.URLClassLoader.findClass(URLClassLoader.java:354) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:425) at
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:358) at
> java.lang.Class.forName0(Native Method) at
> java.lang.Class.forName(Class.java:190) at
> org.apache.derby.impl.services.monitor.BaseMonitor.getImplementations(Unknown
> Source) at
> org.apache.derby.impl.services.monitor.BaseMonitor.getDefaultImplementations(Unknown
> Source) at
> org.apache.derby.impl.services.monitor.BaseMonitor.runWithState(Unknown
> Source) at
> org.apache.derby.impl.services.monitor.FileMonitor.<init>(Unknown
> Source) at
> org.apache.derby.iapi.services.monitor.Monitor.startMonitor(Unknown
> Source) at org.apache.derby.iapi.jdbc.JDBCBoot.boot(Unknown Source)
> at org.apache.derby.jdbc.EmbeddedDriver.boot(Unknown Source) at
> org.apache.derby.jdbc.EmbeddedDriver.<clinit>(Unknown Source) ... 83
> more 15/05/20 06:57:08 INFO HiveMetaStore: 0: Opening raw store with
> implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 15/05/20 06:57:08 INFO ObjectStore: ObjectStore, initialize called
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is
> already registered. Ensure you dont have multiple JAR versions of the
> same plugin in the classpath. The URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle)
> "org.datanucleus.api.jdo" is already registered. Ensure you dont have
> multiple JAR versions of the same plugin in the classpath. The URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.6.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.1.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle)
> "org.datanucleus.store.rdbms" is already registered. Ensure you dont
> have multiple JAR versions of the same plugin in the classpath. The
> URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.1.jar"
> is already registered, and you are trying to register an identical
> plugin located at URL
> "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.9.jar."
> 15/05/20 06:57:08 INFO Persistence: Property datanucleus.cache.level2
> unknown - will be ignored 15/05/20 06:57:08 INFO Persistence: Property
> hive.metastore.integral.jdo.pushdown unknown - will be ignored
> Traceback (most recent call last): File "<stdin>", line 1, in
> <module> File
> "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/pyspark/sql/context.py",
> line 528, in sql
> return DataFrame(self._ssql_ctx.sql(sqlQuery), self) File "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
> line 538, in __call__ File
> "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py",
> line 300, in get_return_value py4j.protocol.Py4JJavaError: An error
> occurred while calling o31.sql. : java.lang.RuntimeException:
> java.lang.RuntimeException: Unable to instantiate
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259) at
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79) at
> py4j.GatewayConnection.run(GatewayConnection.java:207) at
> java.lang.Thread.run(Thread.java:745) Caused by:
> java.lang.RuntimeException: Unable to instantiate
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1488)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864) at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
> ... 16 more Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
> ... 21 more Caused by: javax.jdo.JDOFatalInternalException: Error
> creating transactional connection factory NestedThrowables:
> java.lang.reflect.InvocationTargetException at
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606) at
> javax.jdo.JDOHelper.run(JDOHelper.java:1965) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
> at
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
> at
> org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
> at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:610)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
> at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
> ... 26 more Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
> at
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
> at
> org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
> at
> org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
> at
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
> at
> org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
> at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
> at
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
> ... 55 more Caused by: java.lang.NoClassDefFoundError: Could not
> initialize class org.apache.derby.jdbc.EmbeddedDriver at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
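I found the answer to my problem.
After installing CDH 5.3 and enabling Spark through Cloudera Manager, follow these steps to enable Hive access:
- Make sure Hive works from the Hive CLI and over JDBC through HiveServer2 (it should work by default).
- Copy hive-site.xml to your SPARK_HOME/conf folder.
- Add the Hive libraries to the Spark classpath: edit the SPARK_HOME/bin/compute-classpath.sh file and add the following: CLASSPATH="$CLASSPATH:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hive/lib/*" (a CDH-specific example; use the location of your own Hive libraries).
- Restart the Spark cluster so that everything takes effect.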
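The copy step above can be scripted. This is only a minimal sketch: the function name `install_hive_site` and both directory arguments are assumptions made for illustration, not anything CDH or Spark provides, so pass in the Hive configuration directory and SPARK_HOME of your own installation.

```python
import os
import shutil


def install_hive_site(hive_conf_dir, spark_home):
    """Copy hive-site.xml from hive_conf_dir into <spark_home>/conf.

    Both arguments are placeholders for paths on your own cluster.
    Returns the path of the copied file.
    """
    source = os.path.join(hive_conf_dir, "hive-site.xml")
    if not os.path.isfile(source):
        raise FileNotFoundError(source)
    target_dir = os.path.join(spark_home, "conf")
    # Create the conf folder if it does not exist yet.
    os.makedirs(target_dir, exist_ok=True)
    # copy2 keeps the file's timestamps and returns the destination path.
    return shutil.copy2(source, target_dir)
```

It fails loudly if hive-site.xml is missing instead of silently leaving Spark without a metastore configuration.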
I ran into this issue and found that the real problem was the Hive libraries conflicting with the Spark libraries. If you look at the log above:
15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is
already registered. Ensure you dont have multiple JAR versions of the
same plugin in the classpath. The URL
"file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar.
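This is not a harmless warning. It is the heart of the problem. I already had the Hive jars in my CLASSPATH. I removed them and started Spark, and everything worked. So try that first. See https://issues.apache.org/jira/browse/HIVE-9198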
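To see which jars clash, you can group jar file names by artifact and list every artifact that shows up in more than one version. This is a rough sketch, not a Spark or Hive tool: the helper name `find_duplicate_jars` and the file-name pattern are my own assumptions, and the sample paths are the ones from the warnings in the log.

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<artifact>-<version>.jar", where the version starts
# with a digit, e.g. "datanucleus-core-3.2.10.jar" or "hive-exec-1.1.0-cdh5.4.1.jar".
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_duplicate_jars(jar_paths):
    """Return {artifact: sorted versions} for artifacts found in several versions."""
    versions = defaultdict(set)
    for path in jar_paths:
        filename = path.replace("\\", "/").rsplit("/", 1)[-1]
        match = JAR_PATTERN.match(filename)
        if match:  # names that do not fit the pattern are skipped
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Jar paths taken from the DataNucleus warnings in the log above.
parcel_jars = "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/"
log_jars = [
    parcel_jars + "datanucleus-core-3.2.10.jar",
    parcel_jars + "datanucleus-core-3.2.2.jar",
    parcel_jars + "datanucleus-api-jdo-3.2.6.jar",
    parcel_jars + "datanucleus-api-jdo-3.2.1.jar",
    parcel_jars + "datanucleus-rdbms-3.2.1.jar",
    parcel_jars + "datanucleus-rdbms-3.2.9.jar",
]
for name, found in sorted(find_duplicate_jars(log_jars).items()):
    print(name, found)
```

On a real machine you would feed it the entries of your actual classpath instead of the hard-coded list.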
If you get an error like Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
I ended up with a hive-site.xml like the one below
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>blah</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.scratch.dir.permission</name>
<value>733</value>
<description>blah</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/tmp/<yourusername></value>
<description>blah</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/tmp/<yourusername>/operation_logs</value>
<description>blah</description>
</property>
I removed everything else from hive-site.xml. See java.net.URISyntaxException when starting HIVE
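If you would rather generate that file than edit it by hand, something like the following should do. Treat it as a sketch: `build_hive_site` is a made-up helper name, "alice" is a placeholder user name, and the `<configuration>` root element is the usual wrapper for Hadoop-style config files, which the snippet above does not show. The `<description>` entries are left out because they carry no information.

```python
import xml.etree.ElementTree as ET


def build_hive_site(username):
    """Build a minimal hive-site.xml (as a string) with per-user /tmp paths."""
    scratch = "/tmp/" + username
    settings = [
        ("hive.metastore.schema.verification", "false"),
        ("hive.exec.scratchdir", scratch),
        ("hive.exec.local.scratchdir", scratch),
        ("hive.downloaded.resources.dir", scratch),
        ("hive.scratch.dir.permission", "733"),
        ("hive.querylog.location", scratch),
        ("hive.server2.logging.operation.log.location", scratch + "/operation_logs"),
    ]
    root = ET.Element("configuration")
    for name, value in settings:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# "alice" is a placeholder; substitute your own user name.
print(build_hive_site("alice"))
```

Replacing the `${system:...}` variables with literal paths is what avoids the URI error; the exact directory is up to you.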
You can also set the Hive configuration when running spark-submit. This works fine for me on CDH 5.4.5:
spark-submit \
--class com.xxx.main.TagHive \
--master yarn-client \
--name HiveTest \
--num-executors 3 \
--driver-memory 500m \
--executor-memory 500m \
--executor-cores 1 \
--conf "spark.executor.extraClassPath=/etc/hive/conf:/opt/cloudera/parcels/CDH/lib/hive/lib/*.jar" \
--files ./log4j-spark.properties \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \
dmp-tag-etl-0.0.1-SNAPSHOT-jar-with-dependencies.jar