AWS Java SDK v2: Upload a directory to S3

I want to upload a directory to S3 using the AWS Java SDK v2.

For example, how would I implement the following?

fun uploadDirectory(bucket: String, prefix: String, directory: Path)

I would like the contents of directory to be copied to s3://bucket/prefix/ on S3.
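
To make the expected layout concrete, here is a small sketch of the mapping I have in mind. The helper name s3KeyFor and the sample paths are hypothetical and only for illustration; it uses nothing from the SDK, just java.nio.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper: maps a local file to the S3 key it should receive.
fun s3KeyFor(prefix: String, directory: Path, file: Path): String {
    // relativize() gives the path of `file` relative to `directory`;
    // joining its segments with "/" keeps keys the same on Windows and Unix.
    val relative = directory.relativize(file).joinToString("/")
    return "${prefix.trimEnd('/')}/$relative"
}

fun main() {
    val directory = Paths.get("data")
    println(s3KeyFor("prefix", directory, directory.resolve("a.txt")))                 // prints prefix/a.txt
    println(s3KeyFor("prefix", directory, directory.resolve("sub").resolve("b.txt")))  // prints prefix/sub/b.txt
}
```

So data/sub/b.txt would end up at s3://bucket/prefix/sub/b.txt.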

The v2 SDK documentation has an example for uploading a single object, but there doesn't seem to be an equivalent to this Upload a Directory example from v1.

TransferManager and some other high-level libraries are not yet available in v2, so you will have to use v1. From the migration guide:

High-level libraries, such as the Amazon S3 Transfer Manager and the Amazon SQS Client-side Buffering, are not yet available in version 2.x. See the AWS SDK for Java 2.x changelog for a complete list of libraries.

If your application depends on these libraries, see Using both SDKs side-by-side to learn how to configure your pom.xml to use both 1.x and 2.x. Refer to the AWS SDK for Java 2.x changelog for updates about these libraries.
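
For reference, a minimal sketch of what the v1 route could look like. It assumes the v1 artifact com.amazonaws:aws-java-sdk-s3 is on the classpath and that credentials and region come from the default provider chains; the function name uploadDirectoryV1 is just an illustrative choice, not something from the SDK.

```kotlin
import com.amazonaws.services.s3.transfer.TransferManagerBuilder
import java.io.File

// Sketch only: uploads `directory` (including subdirectories) under `prefix`
// using the v1 TransferManager. The function name is hypothetical.
fun uploadDirectoryV1(bucket: String, prefix: String, directory: File) {
    // Builds a TransferManager backed by a default AmazonS3 client.
    val transferManager = TransferManagerBuilder.standard().build()
    try {
        // The last argument asks TransferManager to recurse into subdirectories.
        val upload = transferManager.uploadDirectory(bucket, prefix, directory, true)
        // Blocks until every file has been transferred, or throws on failure.
        upload.waitForCompletion()
    } finally {
        // Releases the TransferManager's threads and its underlying client.
        transferManager.shutdownNow()
    }
}
```

A call such as uploadDirectoryV1("bucket", "prefix", File("data")) would then block until the transfer finishes.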

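You can implement it with the following strategy:

  1. Walk the directory using Files.walk, identifying all files.
  2. Upload the files asynchronously using the SDK via S3AsyncClient.putObject.
  3. Combine all the upload tasks using CompletableFuture.allOf and wait for completion.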

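This strategy relies on the async client's default thread pool, which uses 50 threads. It has worked fine for me on directories containing thousands of files.

Here s3Prefix is the prefix added to each object uploaded to the bucket; it is the equivalent of the target directory.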

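import java.nio.file.Files
import java.nio.file.Path
import java.util.concurrent.CompletableFuture
import kotlin.io.path.isDirectory
import kotlin.io.path.isRegularFile
import kotlin.streams.asSequence
import software.amazon.awssdk.services.s3.S3AsyncClient
import software.amazon.awssdk.services.s3.model.PutObjectRequest
import software.amazon.awssdk.services.s3.model.PutObjectResponse

// Assumption: the snippet does not show how the client is created;
// S3AsyncClient.create() uses the default region and credential providers.
private val s3AsyncClient: S3AsyncClient = S3AsyncClient.create()

fun uploadDirectory(s3Bucket: String, s3Prefix: String, directory: Path) {
    require(directory.isDirectory()) { "Not a directory: $directory" }

    // Start one asynchronous upload per regular file while the stream is open.
    Files.walk(directory).use { stream ->
        stream.asSequence()
            .filter { it.isRegularFile() }
            .map { path ->
                putObject(
                    s3Bucket = s3Bucket,
                    // Join path segments with "/" so keys are the same on Windows.
                    s3Key = "$s3Prefix/${directory.relativize(path).joinToString("/")}",
                    path = path
                )
            }
            .toList().toTypedArray()
    }.let { CompletableFuture.allOf(*it) }.join() // Block until every upload finishes.
}

// Uploads a single file and returns the future for its response.
private fun putObject(s3Bucket: String, s3Key: String, path: Path)
    : CompletableFuture<PutObjectResponse> {
    val request = PutObjectRequest.builder()
        .bucket(s3Bucket)
        .key(s3Key)
        .build()

    return s3AsyncClient.putObject(request, path)
}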
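
One thing worth knowing about the final join(): the future returned by CompletableFuture.allOf completes exceptionally if any of the uploads fails, so join() throws a CompletionException. The sketch below shows one way to find out which uploads failed. It uses plain futures as stand-ins for the uploads, makes no S3 calls, and the names fakeUpload and awaitAll are hypothetical.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.CompletionException

// Stand-in for an upload; a real future would come from S3AsyncClient.putObject.
fun fakeUpload(key: String, fail: Boolean): CompletableFuture<String> =
    CompletableFuture.supplyAsync {
        if (fail) throw IllegalStateException("upload failed: $key")
        key
    }

// Waits for every future to finish and returns the keys of those that failed.
fun awaitAll(uploads: Map<String, CompletableFuture<String>>): List<String> {
    try {
        CompletableFuture.allOf(*uploads.values.toTypedArray()).join()
    } catch (e: CompletionException) {
        // At least one upload failed; allOf still waits for all of them,
        // so each future can be inspected individually below.
    }
    return uploads.filterValues { it.isCompletedExceptionally }.keys.toList()
}

fun main() {
    val uploads = linkedMapOf(
        "prefix/a.txt" to fakeUpload("prefix/a.txt", fail = false),
        "prefix/b.txt" to fakeUpload("prefix/b.txt", fail = true)
    )
    println(awaitAll(uploads)) // prints [prefix/b.txt]
}
```

In uploadDirectory the same idea would mean keeping the key alongside each future instead of only collecting the futures into an array.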