Creating a HiveContext in a Spark application on Oozie

We are running a Spark application through Oozie. In the application we create a HiveContext. Everything works fine when the application is run with spark-submit, but when it is run through Oozie we get the following exception:
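
For reference, this is roughly what the driver code looks like (a minimal sketch against the Spark 1.x API; apart from the HiveContext constructor itself, the names here are illustrative):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object Integration {
  def main(args: Array[String]): Unit = {
    // args: nameServiceName, hdfsPath, appName (the <arg> elements in the workflow below)
    val conf = new SparkConf().setAppName(args(2))
    val sc = new SparkContext(conf)
    // Constructing the HiveContext is what kicks off the metastore
    // initialization that fails in the stack trace:
    val hiveContext = new HiveContext(sc)
    hiveContext.sql("SHOW DATABASES").show()
    sc.stop()
  }
}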

2018-01-16 17:45:24,762 [Driver] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2018-01-16 17:45:24,795 [Driver] INFO  org.apache.hadoop.hive.metastore.ObjectStore  - ObjectStore, initialize called
2018-01-16 17:45:24,994 [Driver] INFO  DataNucleus.Persistence  - Property datanucleus.cache.level2 unknown - will be ignored
2018-01-16 17:45:24,995 [Driver] INFO  DataNucleus.Persistence  - Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2018-01-16 17:45:25,439 [Driver] WARN  org.apache.hadoop.hive.metastore.HiveMetaStore  - Retrying creating default database after error: Error creating transactional connection factory
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:781)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:326)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:195)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at javax.jdo.JDOHelper.run(JDOHelper.java:1965)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:418)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:447)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:342)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:298)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:60)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:69)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:682)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:660)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:709)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:508)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6319)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:207)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1560)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3324)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3343)
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3568)
    at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:230)
    at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:214)
    at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:337)
    at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:298)
    at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:273)
    at org.apache.spark.sql.hive.client.ClientWrapper.client(ClientWrapper.scala:272)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState.apply(ClientWrapper.scala:288)
    at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1(ClientWrapper.scala:239)
    at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:238)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:281)
    at org.apache.spark.sql.hive.client.ClientWrapper.runHive(ClientWrapper.scala:488)
    at org.apache.spark.sql.hive.client.ClientWrapper.runSqlHive(ClientWrapper.scala:478)
    at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:443)
    at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:272)
    at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:271)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:271)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at com.frontline.shopfloor.integration.Integration$.main(Integration.scala:41)
    at com.frontline.shopfloor.integration.Integration.main(Integration.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon.run(ApplicationMaster.scala:552)
NestedThrowablesStackTrace:
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
    at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
    at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:281)
    at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:239)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:292)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
    at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
    at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1069)
    at org.datanucleus.NucleusContext.initialise(NucleusContext.java:359)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:768)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:326)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:195)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at javax.jdo.JDOHelper.run(JDOHelper.java:1965)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:418)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:447)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:342)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:298)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:60)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:69)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:682)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:660)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:709)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:508)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6319)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:207)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1560)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3324)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3343)
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3568)
    at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:230)
    at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:214)
    at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:337)
    at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:298)
    at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:273)
    at org.apache.spark.sql.hive.client.ClientWrapper.client(ClientWrapper.scala:272)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState.apply(ClientWrapper.scala:288)
    at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1(ClientWrapper.scala:239)
    at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:238)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:281)
    at org.apache.spark.sql.hive.client.ClientWrapper.runHive(ClientWrapper.scala:488)
    at org.apache.spark.sql.hive.client.ClientWrapper.runSqlHive(ClientWrapper.scala:478)
    at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:443)
    at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:272)
    at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:271)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:271)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at com.frontline.shopfloor.integration.Integration$.main(Integration.scala:41)
    at com.frontline.shopfloor.integration.Integration.main(Integration.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon.run(ApplicationMaster.scala:552)
Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.collect.MapMaker.makeComputingMap(Lcom/google/common/base/Function;)Ljava/util/concurrent/ConcurrentMap; from class com.jolbox.bonecp.BoneCPDataSource
    at com.jolbox.bonecp.BoneCPDataSource.<init>(BoneCPDataSource.java:64)
    at org.datanucleus.store.rdbms.datasource.BoneCPDataSourceFactory.makePooledDataSource(BoneCPDataSourceFactory.java:73)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:217)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:110)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82)
    ... 86 more

The Oozie workflow:

<workflow-app name="Integration" xmlns="uri:oozie:workflow:0.5"> 
<start to="IntegrationLayer"/>
    <action name="IntegrationLayer">
       <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>                  
            <master>${sparkMaster}</master>
            <mode>cluster</mode>
            <name>${appName}</name>
            <class>...</class>
            <jar>...</jar>
            <spark-opts>--executor-memory 3G --driver-memory 3G --conf "spark.yarn.historyServer.address=http://${nameServiceName}:18088" --conf "spark.eventLog.dir=hdfs://${nameServiceName}/user/spark/applicationHistory" --conf "spark.eventLog.enabled=true" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties"</spark-opts>
            <arg>${nameServiceName}</arg>
            <arg>${hdfsPath}</arg>
            <arg>${appName}</arg>
        </spark>
        <ok to="end-node"/>
        <error to="fail-node"/>
    </action>  
    <kill name="fail-node">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end-node"/>
</workflow-app>

We are using Cloudera 5.8. Do we need to configure anything else to run the application on Oozie?

You should add hive-site.xml to the Oozie action.
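
One way to do this (a sketch, not the only option; the HDFS path below is hypothetical, so point it at wherever your hive-site.xml actually lives) is to ship the file through --files in <spark-opts>, so it lands in the working directory of the driver and executors where Spark looks for it:

<!-- prepend to the existing <spark-opts>, keeping your other options unchanged -->
<spark-opts>--files ${nameNode}/user/oozie/workflows/integration/hive-site.xml --executor-memory 3G --driver-memory 3G</spark-opts>

Placing hive-site.xml in the workflow's lib/ directory next to the application jar should also make it visible to the action.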

Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.collect.MapMaker.makeComputingMap(Lcom/google/common/base/Function;)Ljava/util/concurrent/ConcurrentMap; from class com.jolbox.bonecp.BoneCPDataSource

This tells you that the call to makeComputingMap failed. As of Guava 15.0, makeComputingMap is no longer a public method, and a newer Guava on the classpath is shadowing the older version that Spark expects. Check your Spark classpath and either keep Guava 14.0.1 as the first entry or replace the newer jar with the older version.

http://google.github.io/guava/releases/14.0.1/api/docs/com/google/common/collect/MapMaker.html

http://google.github.io/guava/releases/15.0/api/docs/com/google/common/collect/MapMaker.html
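
If the newer Guava comes from the cluster-provided classpath and cannot simply be removed, one option (a sketch; it assumes your application jar bundles Guava 14.0.1, and both properties are standard Oozie / Spark-on-YARN settings) is to make user-supplied jars take precedence for both the Oozie launcher and the Spark containers:

<!-- inside the <spark> element, before <master> -->
<configuration>
    <property>
        <name>oozie.launcher.mapreduce.user.classpath.first</name>
        <value>true</value>
    </property>
</configuration>

and add to <spark-opts>:

--conf "spark.yarn.user.classpath.first=true"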

Thanks, Ravi