Spark SQL query `SHOW VIEWS IN` through Hive metastore fails with `missing 'FUNCTIONS' at 'IN'`

I have Spark (2.4.4) running with a Hive metastore. When I access it through JDBC/ODBC with a query such as

SHOW VIEWS IN space1

I get the following error:

[2020-03-18T10:54:57,722][DEBUG][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.execution.SparkSqlParser][][] Parsing command: SHOW VIEWS IN `space1` 
[2020-03-18T10:54:57,733][ERROR][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation][][] Error executing query, currentState RUNNING,  
org.apache.spark.sql.catalyst.parser.ParseException: 
missing 'FUNCTIONS' at 'IN'(line 1, pos 11)

== SQL ==
SHOW VIEWS IN `space1`
-----------^^^

    at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$$anon.run(SparkExecuteStatementOperation.scala:175) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$$anon.run(SparkExecuteStatementOperation.scala:171) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_201]
    at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_201]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:?]
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon.run(SparkExecuteStatementOperation.scala:185) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
[2020-03-18T10:54:57,765][ERROR][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation][][] Error running hive query:  
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.catalyst.parser.ParseException: 
missing 'FUNCTIONS' at 'IN'(line 1, pos 11)

== SQL ==
SHOW VIEWS IN `space1`
-----------^^^

    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:269) ~[spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$$anon.run(SparkExecuteStatementOperation.scala:175) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$$anon.run(SparkExecuteStatementOperation.scala:171) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_201]
    at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_201]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:?]
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon.run(SparkExecuteStatementOperation.scala:185) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]

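I get this error, for example, when I connect Tableau to my Spark instance. I can also trigger it explicitly by running the query from a SQL tool connected via JDBC.

Any ideas?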

Note that a query such as

SELECT * FROM `employer` WHERE `Name` IN ('John','Alex');

completes without any problem!

Someone else has run into this problem before, but got no response: https://community.powerbi.com/t5/Desktop/Spark-connector-issue/td-p/952481

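The SHOW VIEWS command is only available in Spark 3, which is why you see this error on 2.4.4. The 2.4 parser has no SHOW VIEWS rule, so it apparently reads VIEWS as the optional qualifier of SHOW FUNCTIONS and then expects the FUNCTIONS keyword at the position where IN appears.

See: https://issues.apache.org/jira/browse/SPARK-31113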
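
If you control the client code and cannot upgrade yet, one possible workaround is to choose the statement based on the Spark version. The following is only a sketch under stated assumptions: the helper names `views_statement` and `list_views` are hypothetical, the version string is assumed to look like `2.4.4`, and the fallback relies on SHOW TABLES listing views together with tables, so its result is a superset that still needs filtering. `list_views` assumes a PySpark session whose `spark.catalog.listTables` entries report `tableType == "VIEW"` for persistent views.

```python
def parse_major(version):
    """Return the major component of a version string such as '2.4.4'."""
    return int(version.split(".")[0])


def views_statement(spark_version, database):
    """Pick a statement that the parser of the given Spark version accepts.

    Hypothetical helper: Spark 3 understands SHOW VIEWS, while on 2.x we
    fall back to SHOW TABLES, which returns views together with tables.
    """
    # Backtick-quote the database name, escaping embedded backticks.
    quoted = "`" + database.replace("`", "``") + "`"
    if parse_major(spark_version) >= 3:
        return "SHOW VIEWS IN " + quoted
    return "SHOW TABLES IN " + quoted


def list_views(spark, database):
    """Return the names of persistent views in a database.

    Assumes `spark` is a SparkSession; uses the catalog API instead of SQL,
    so it does not depend on the SHOW VIEWS grammar at all.
    """
    tables = spark.catalog.listTables(database)
    return [t.name for t in tables if t.tableType == "VIEW"]
```

For example, `views_statement("2.4.4", "space1")` yields a SHOW TABLES statement that Spark 2.4.4 can parse. This does not help when a tool such as Tableau generates the SHOW VIEWS statement itself; in that case upgrading to Spark 3 is probably the only real fix.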