Spark: no such field METASTORE_CLIENT_FACTORY_CLASS
I am trying to query a Hive table with Spark from Java. My Hive tables are in an EMR 5.12 cluster; the Spark version is 2.2.1 and Hive is 2.3.2.
When I ssh into the cluster and use spark-shell, I can query the Hive tables without any problem.
But when I try to run the query from my custom jar, I get the following exception:
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1067)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState.apply(SparkSession.scala:142)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState.apply(SparkSession.scala:141)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:141)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:138)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:637)
at package.Session.executeQuery(Session.java:48)
at com.etl.cli.ETL.main(ETL.java:16)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon.run(ApplicationMaster.scala:635)
Caused by: org.apache.spark.sql.AnalysisException: java.lang.NoSuchFieldError: METASTORE_CLIENT_FACTORY_CLASS;
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1064)
... 16 more
Caused by: java.lang.NoSuchFieldError: METASTORE_CLIENT_FACTORY_CLASS
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClientFactory(Hive.java:3011)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3006)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3042)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1235)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:175)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:167)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:191)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:362)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:266)
at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists.apply$mcZ$sp(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists.apply(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists.apply(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
... 25 more
My pom looks like this:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.2.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.2.1</version>
</dependency>
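Side note (a build-setup assumption, not something the error itself proves): since EMR already ships the Spark jars on the cluster, these dependencies are commonly declared with `provided` scope so the application jar does not bundle its own copies, for example:

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.2.1</version>
    <!-- provided: compiled against, but supplied by the EMR cluster at runtime -->
    <scope>provided</scope>
</dependency>
```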
Instantiating the session:
SparkConf conf = new SparkConf();
conf.set("spark.sql.warehouse.dir", "/user/hive/warehouse");
conf.set("hive.metastore.client.factory.class",
        "org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore.Client.Factory");
SparkSession sparkSession = SparkSession.builder()
        .appName("Spark Application")
        .master("yarn-cluster")
        .config(conf)
        .enableHiveSupport()
        .getOrCreate();
Do you have any idea what is going wrong?
Thanks
After a few days of debugging, I realized that another dependency in my pom was causing this error. When I removed the hive-jdbc client, the application worked fine:
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.3.3</version>
</dependency>
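If the JDBC driver is still needed at compile time, a possible alternative (a sketch I have not verified on this cluster) is to keep it off the runtime classpath with `provided` scope, so its stock Hive 2.3.x classes cannot shadow the cluster's Hive jars, which presumably carry the EMR patch that defines METASTORE_CLIENT_FACTORY_CLASS:

```xml
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.3.3</version>
    <!-- provided: keep out of the bundled application jar;
         the cluster's own Hive classes are used at runtime -->
    <scope>provided</scope>
</dependency>
```

Running `mvn dependency:tree -Dincludes=org.apache.hive` is a quick way to confirm which Hive artifacts actually end up on the classpath.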