用于连接到 Spark Thrift 服务器的 ODBC 配置
ODBC configuration to connect to Spark Thrift Server
这个问题好像重复了,其实我看到过几个与此相关的问题,但错误不完全一样,所以我想问问有没有人有线索。
我已经设置了 Spark Thrift 服务器 运行 默认设置。 Spark 版本是 2.1,运行在 YARN (Hadoop 2.7.3)
事实是我无法设置 Simba 配置单元 ODBC 驱动程序或 Microsoft 驱动程序,因此 ODBC 设置中的测试无法成功。
这是我用于 Microsoft Hive ODBC 驱动程序的配置:
当我点击“测试”按钮时,显示的错误消息如下:
在 Spark Thrift 服务器日志中看到以下内容:
17/09/15 17:31:36 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V1
17/09/15 17:31:36 INFO SessionState: Created local directory: /tmp/00abf145-2928-4995-81f2-fea578280c42_resources
17/09/15 17:31:36 INFO SessionState: Created HDFS directory: /tmp/hive/test/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SessionState: Created local directory: /tmp/vagrant/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SessionState: Created HDFS directory: /tmp/hive/test/00abf145-2928-4995-81f2-fea578280c42/_tmp_space.db
17/09/15 17:31:36 INFO HiveSessionImpl: Operation log session directory is created: /tmp/vagrant/operation_logs/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SparkExecuteStatementOperation: Running query 'set -v' with 82d7f9a6-f2a6-4ebd-93bb-5c8da1611f84
17/09/15 17:31:36 INFO SparkSqlParser: Parsing command: set -v
17/09/15 17:31:36 INFO SparkExecuteStatementOperation: Result Schema: StructType(StructField(key,StringType,false), StructField(value,StringType,false), StructField(meaning,StringType,false))
如果我使用 JDBC 驱动程序通过 Beeline 连接(工作正常),这些是日志:
17/09/15 17:04:24 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test
17/09/15 17:04:24 INFO SessionState: Created local directory: /tmp/c0681d6f-cc0f-40ae-970d-e3ea366aa414_resources
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SessionState: Created local directory: /tmp/vagrant/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test/c0681d6f-cc0f-40ae-970d-e3ea366aa414/_tmp_space.db
17/09/15 17:04:24 INFO HiveSessionImpl: Operation log session directory is created: /tmp/vagrant/operation_logs/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SparkSqlParser: Parsing command: use default
17/09/15 17:04:25 INFO HiveMetaStore: 1: get_database: default
17/09/15 17:04:25 INFO audit: ugi=vagrant ip=unknown-ip-addr cmd=get_database: default
17/09/15 17:04:25 INFO HiveMetaStore: 1: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/09/15 17:04:25 INFO ObjectStore: ObjectStore, initialize called
17/09/15 17:04:25 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
17/09/15 17:04:25 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
17/09/15 17:04:25 INFO ObjectStore: Initialized ObjectStore
好吧,我通过安装 Microsoft Spark ODBC 驱动程序而不是 Hive 驱动程序成功连接。
看起来问题与驱动程序在发现它不是基于某个服务器 属性 的 Hive2 服务器时拒绝连接到 Spark Thrift 服务器有关。我怀疑 Hive2 和 Spark thrift 服务器之间在线路级别存在实际差异,因为后者是前者的端口,在协议级别(Thrift)没有变化,但无论如何,解决方案是转移到这个驱动程序并配置它与 Hive2 一样:
这个问题好像重复了,其实我看到过几个与此相关的问题,但错误不完全一样,所以我想问问有没有人有线索。
我已经设置了 Spark Thrift 服务器 运行 默认设置。 Spark 版本是 2.1,运行在 YARN (Hadoop 2.7.3)
事实是我无法设置 Simba 配置单元 ODBC 驱动程序或 Microsoft 驱动程序,因此 ODBC 设置中的测试无法成功。
这是我用于 Microsoft Hive ODBC 驱动程序的配置:
当我点击“测试”按钮时,显示的错误消息如下:
在 Spark Thrift 服务器日志中看到以下内容:
17/09/15 17:31:36 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V1
17/09/15 17:31:36 INFO SessionState: Created local directory: /tmp/00abf145-2928-4995-81f2-fea578280c42_resources
17/09/15 17:31:36 INFO SessionState: Created HDFS directory: /tmp/hive/test/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SessionState: Created local directory: /tmp/vagrant/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SessionState: Created HDFS directory: /tmp/hive/test/00abf145-2928-4995-81f2-fea578280c42/_tmp_space.db
17/09/15 17:31:36 INFO HiveSessionImpl: Operation log session directory is created: /tmp/vagrant/operation_logs/00abf145-2928-4995-81f2-fea578280c42
17/09/15 17:31:36 INFO SparkExecuteStatementOperation: Running query 'set -v' with 82d7f9a6-f2a6-4ebd-93bb-5c8da1611f84
17/09/15 17:31:36 INFO SparkSqlParser: Parsing command: set -v
17/09/15 17:31:36 INFO SparkExecuteStatementOperation: Result Schema: StructType(StructField(key,StringType,false), StructField(value,StringType,false), StructField(meaning,StringType,false))
如果我使用 JDBC 驱动程序通过 Beeline 连接(工作正常),这些是日志:
17/09/15 17:04:24 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test
17/09/15 17:04:24 INFO SessionState: Created local directory: /tmp/c0681d6f-cc0f-40ae-970d-e3ea366aa414_resources
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SessionState: Created local directory: /tmp/vagrant/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SessionState: Created HDFS directory: /tmp/hive/test/c0681d6f-cc0f-40ae-970d-e3ea366aa414/_tmp_space.db
17/09/15 17:04:24 INFO HiveSessionImpl: Operation log session directory is created: /tmp/vagrant/operation_logs/c0681d6f-cc0f-40ae-970d-e3ea366aa414
17/09/15 17:04:24 INFO SparkSqlParser: Parsing command: use default
17/09/15 17:04:25 INFO HiveMetaStore: 1: get_database: default
17/09/15 17:04:25 INFO audit: ugi=vagrant ip=unknown-ip-addr cmd=get_database: default
17/09/15 17:04:25 INFO HiveMetaStore: 1: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/09/15 17:04:25 INFO ObjectStore: ObjectStore, initialize called
17/09/15 17:04:25 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
17/09/15 17:04:25 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
17/09/15 17:04:25 INFO ObjectStore: Initialized ObjectStore
好吧,我通过安装 Microsoft Spark ODBC 驱动程序而不是 Hive 驱动程序成功连接。 看起来问题与驱动程序在发现它不是基于某个服务器 属性 的 Hive2 服务器时拒绝连接到 Spark Thrift 服务器有关。我怀疑 Hive2 和 Spark thrift 服务器之间在线路级别存在实际差异,因为后者是前者的端口,在协议级别(Thrift)没有变化,但无论如何,解决方案是转移到这个驱动程序并配置它与 Hive2 一样: