Can I use S3 as Hive storage outside the Amazon EMR environment?
I have a managed service running on an EC2 machine. That machine runs Hive, and I plan to use S3 as my Hive storage (instead of HDFS). Is this possible?
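Yes, it is possible. You need to update the configuration files on the EC2 instance with your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY so that the instance can access your S3 bucket.
A detailed how-to is available here: http://blog.mustardgrain.com/2010/09/30/using-hive-with-existing-files-on-s3/
Some choice bits: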
Now, let’s change our configuration a bit so that we can access the S3
bucket with all our data. First, we need to include the following
configuration. This can be done via HIVE_OPTS, configuration files
($HIVE_HOME/conf/hive-site.xml), or via Hive CLI’s SET command.
Here are the configuration parameters:
Name                        Value
fs.s3n.awsAccessKeyId       Your S3 access key
fs.s3n.awsSecretAccessKey   Your S3 secret access key
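As a rough illustration of the SET route mentioned above, the two parameters could be set at the start of a Hive CLI session as sketched below. The key values are placeholders, not real credentials. This assumes a Hadoop build that still ships the s3n connector; newer Hadoop releases use the s3a connector instead, where the corresponding properties are fs.s3a.access.key and fs.s3a.secret.key.

```sql
-- Placeholder values: substitute your own credentials.
-- These settings last only for the current Hive session.
SET fs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY_ID;
SET fs.s3n.awsSecretAccessKey=YOUR_SECRET_ACCESS_KEY;
```

Putting the same two properties in $HIVE_HOME/conf/hive-site.xml makes them persistent, at the cost of storing the secret key on disk.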
And:
Whether you prefer the term veneer, façade, wrapper, or whatever, we need to
tell Hive where to find our data and the format of the files. Let’s
create a Hive table definition that references the data in S3:
CREATE EXTERNAL TABLE mydata (key STRING, value INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '='
LOCATION 's3n://mys3bucket/';
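Once the external table exists, it can be queried like any other Hive table. The statements below are a minimal sketch, assuming the mydata table defined above and that the objects in the bucket hold lines of the form foo=42 (a string key, an '=' delimiter, and an integer value); the sample line is an illustration, not data from the original post.

```sql
-- Peek at a few rows to confirm Hive can read the S3 objects.
SELECT key, value
FROM mydata
LIMIT 10;

-- A simple aggregation over the same S3-backed data.
SELECT key, SUM(value) AS total
FROM mydata
GROUP BY key;
```

Because the table is EXTERNAL, dropping it removes only the Hive metadata; the files in the bucket are left in place.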