Copying Data from an AWS Bucket to a Ceph Bucket
I have a Ceph object storage bucket and an AWS bucket. I want to copy data from the AWS bucket to the Ceph bucket without copying the data through any intermediate or local system. Is there a way to do this, given that each bucket needs its own endpoint and its own keys?

I have added the details of both buckets in hdfs-client.xml and am now using Hadoop distcp to move data from one bucket to the other, roughly as sketched below. This was helpful.
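For reference, a minimal distcp invocation of this kind could look like the following; the bucket names and paths are placeholders, not the actual buckets from the question:

hadoop distcp s3a://my-aws-bucket/input s3a://my-ceph-bucket/backup

Because each s3a:// URI resolves its own per-bucket endpoint and credentials (see the answer below), the copy runs inside the distcp map tasks and the data never needs to be staged on the submitting machine.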
Configuring different S3 buckets with Per-Bucket Configuration
Different S3 buckets can be accessed with different S3A client configurations. This allows for different endpoints, data read and write strategies, as well as login details.
All fs.s3a options other than a small set of unmodifiable values (currently fs.s3a.impl) can be set on a per bucket basis.
The bucket specific option is set by replacing the fs.s3a. prefix on an option with fs.s3a.bucket.BUCKETNAME., where BUCKETNAME is the name of the bucket.
When connecting to a bucket, all options explicitly set will override the base fs.s3a. values.
As an example, a configuration could have a base configuration to use the IAM role information available when deployed in Amazon EC2.
<property>
<name>fs.s3a.aws.credentials.provider</name>
<value>com.amazonaws.auth.InstanceProfileCredentialsProvider</value>
</property>
This will become the default authentication mechanism for S3A buckets.
A bucket s3a://nightly/ used for nightly data can then be given a session key:
<property>
<name>fs.s3a.bucket.nightly.access.key</name>
<value>AKAACCESSKEY-2</value>
</property>
<property>
<name>fs.s3a.bucket.nightly.secret.key</name>
<value>SESSIONSECRETKEY</value>
</property>
<property>
<name>fs.s3a.bucket.nightly.session.token</name>
<value>Short-lived-session-token</value>
</property>
<property>
<name>fs.s3a.bucket.nightly.aws.credentials.provider</name>
<value>org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider</value>
</property>
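Once these per-bucket settings are in place, any S3A access to the bucket picks them up automatically; for example, a simple listing such as the following (a hypothetical check, not part of the quoted documentation) would use the temporary session credentials:

hadoop fs -ls s3a://nightly/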
Finally, the public s3a://landsat-pds/ bucket can be accessed anonymously:
<property>
<name>fs.s3a.bucket.landsat-pds.aws.credentials.provider</name>
<value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>
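Applied to the question above, the same mechanism can give the Ceph bucket its own endpoint and keys while the AWS bucket keeps the account defaults. The following is only a sketch: the bucket name cephbucket, the endpoint URL, and the key values are placeholders, and whether path-style access is needed depends on the Ceph RGW setup:

<property>
<name>fs.s3a.bucket.cephbucket.endpoint</name>
<value>http://ceph-rgw.example.com:7480</value>
</property>
<property>
<name>fs.s3a.bucket.cephbucket.access.key</name>
<value>CEPHACCESSKEY</value>
</property>
<property>
<name>fs.s3a.bucket.cephbucket.secret.key</name>
<value>CEPHSECRETKEY</value>
</property>
<property>
<name>fs.s3a.bucket.cephbucket.path.style.access</name>
<value>true</value>
</property>

With this in place, distcp can address s3a://cephbucket/ and a path on the AWS bucket in a single command, since the endpoint and credentials are chosen per bucket rather than per client.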