EMR cluster with external MySQL as Hive metastore

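I am trying to set up an EMR cluster that uses an external MySQL database as the Hive metastore. I created a MySQL database named "metastore" on an EC2 instance and used the following hive-site.xml: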
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://10.10.xxx.xxx:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>xxxxxx</value>
    <description>Password to use against metastore database</description>
  </property>
</configuration>

Cluster creation fails with the following error (taken from the stderr log file):

org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
* schemaTool failed
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
schemaTool failed *
/mnt/var/lib/hadoop/steps/s-xxxxxxxxx/./hive-script:617: Error executing cmd: /usr/share/aws/emr/scripts/hive-script "--install-hive" "--base-path" "s3://us-west-2.elasticmapreduce/libs/hive" "--hive-versions
Command exiting with ret '1'

Please help.

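The cause was an AWS security group issue: the MySQL port was not reachable from the cluster. I resolved it by allowing access to the MySQL port (3306) in the security group.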
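
To catch this kind of problem before launching the cluster, it may help to test whether the MySQL port is reachable from a machine subject to the same security group rules as the EMR nodes. The following is only a minimal sketch, assuming Python 3 is available on that machine. `can_reach` is a hypothetical helper name, not part of any AWS or Hive tooling, and it checks TCP reachability only, not MySQL credentials or schema state.

```python
import socket


def can_reach(host: str, port: int = 3306, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    A False result is consistent with a blocked security group rule, but it can
    also mean that MySQL is not running or is bound to a different interface.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Example call (replace the placeholder with the metastore host's private IP):
# print(can_reach("10.10.xxx.xxx", 3306))
```

If the check returns False from the cluster side but True on the MySQL host itself, the security group (or a host firewall) is the likely culprit, which matches the fix described above.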