How can I read an AWS S3 file with Java?

Question

I am trying to read a file from AWS S3 in my Java code:

  File file = new File("s3n://mybucket/myfile.txt");
  FileInputStream fileInput = new FileInputStream(file);

Then I get an error:

java.io.FileNotFoundException: s3n:/mybucket/myfile.txt (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)

Is there a way to open/read a file from AWS S3? Thanks a lot!

Answer 1

Java's 'File' class knows nothing about S3; it only resolves paths on the local file system. Here's an example of reading a file from the AWS documentation:

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();
// Process the objectData stream.
objectData.close(); // Always close the stream to release the underlying HTTP connection.
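
The question starts from a path like s3n://mybucket/myfile.txt, while getObject expects a separate bucket name and key. As a rough sketch, assuming the path is a well-formed URI with no characters that need escaping, the JDK's java.net.URI can split it. S3Location is a hypothetical helper name, not part of the AWS SDK.

```java
import java.net.URI;

/** Hypothetical helper, not part of the AWS SDK: splits an S3-style URI into bucket and key. */
final class S3Location {
    final String bucket;
    final String key;

    private S3Location(String bucket, String key) {
        this.bucket = bucket;
        this.key = key;
    }

    static S3Location parse(String uri) {
        URI parsed = URI.create(uri);
        String scheme = parsed.getScheme();
        // Accept the common S3 scheme spellings, including the Hadoop-style s3n and s3a.
        boolean isS3 = "s3".equals(scheme) || "s3n".equals(scheme) || "s3a".equals(scheme);
        if (!isS3) {
            throw new IllegalArgumentException("Not an S3 URI: " + uri);
        }
        String bucket = parsed.getAuthority(); // the part between "//" and the next "/"
        String path = parsed.getPath();        // starts with "/" when a key is present
        if (bucket == null || path == null || path.length() <= 1) {
            throw new IllegalArgumentException("Missing bucket or key: " + uri);
        }
        return new S3Location(bucket, path.substring(1)); // drop the leading "/"
    }

    public static void main(String[] args) {
        S3Location location = S3Location.parse("s3n://mybucket/myfile.txt");
        System.out.println(location.bucket + " / " + location.key);
    }
}
```

The resulting bucket and key can then be passed to new GetObjectRequest(bucketName, key) as in the snippet above.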

Answer 2

As of 2019, there is a slightly more streamlined way to read a file from S3:

private final AmazonS3 amazonS3Client = AmazonS3ClientBuilder.standard().build();

private Collection<String> loadFileFromS3() {
    try (final S3Object s3Object = amazonS3Client.getObject(BUCKET_NAME, FILE_NAME);
         final InputStreamReader streamReader =
                 new InputStreamReader(s3Object.getObjectContent(), StandardCharsets.UTF_8);
         final BufferedReader reader = new BufferedReader(streamReader)) {
        // toSet() drops duplicate lines and does not preserve their order
        return reader.lines().collect(Collectors.toSet());
    } catch (final IOException e) {
        log.error(e.getMessage(), e);
        return Collections.emptySet();
    }
}
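
Collecting into a Set discards duplicate lines and their order, which may or may not be what you want. If the order matters, a variant that collects into a List could look like the sketch below. It uses only the JDK, with a ByteArrayInputStream standing in for s3Object.getObjectContent(); LineReading and readLines are illustrative names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

final class LineReading {
    /** Reads all lines as UTF-8, keeping duplicates and the original order. */
    static List<String> readLines(InputStream in) throws IOException {
        // Closing the reader also closes the wrapped input stream.
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the S3 object content; a real call would pass getObjectContent().
        InputStream fake = new ByteArrayInputStream("b\na\nb\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(fake));
    }
}
```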

Answer 3

If the file's content is a string, you can use getObjectAsString. Otherwise, you can use IOUtils.toByteArray on getObjectContent() to read the file's content into a byte array.

Obviously, these are best used on small-ish S3 objects that easily fit into memory.

private final AmazonS3 amazonS3Client = AmazonS3ClientBuilder.standard().build();

private String loadStringFromS3() {
    try {
        return amazonS3Client.getObjectAsString(BUCKET_NAME, FILE_NAME);
    } catch (final AmazonClientException e) {
        // getObjectAsString throws unchecked SDK exceptions, not IOException
        log.error(e.getMessage(), e);
        return null;
    }
}

private byte[] loadDataFromS3() {
    // try-with-resources closes the S3Object, so no finally block is needed
    try (final S3Object s3Object = amazonS3Client.getObject(BUCKET_NAME, FILE_NAME)) {
        return IOUtils.toByteArray(s3Object.getObjectContent());
    } catch (final IOException e) {
        log.error(e.getMessage(), e);
        return null;
    }
}
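
Since these helpers load the whole object into memory, it can be worth guarding against unexpectedly large objects. The sketch below is one possible approach using only the JDK: it reads a stream into a byte array and gives up once a caller-chosen limit is exceeded. BoundedRead, readAtMost and the limit are illustrative, not part of the AWS SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BoundedRead {
    /** Reads the whole stream, failing with IOException if it holds more than maxBytes. */
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            if (out.size() + read > maxBytes) {
                throw new IOException("Content exceeds limit of " + maxBytes + " bytes");
            }
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for s3Object.getObjectContent().
        InputStream fake = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        System.out.println(readAtMost(fake, 10).length);
    }
}
```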

Answer 4

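The steps to read an S3 file in Java can be:

1. Create an AmazonS3Client.
2. Create an S3Object using the bucket name and key.
3. Create a buffered reader from the S3Object and read the file line by line.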

1 >>>

    BasicAWSCredentials awsCreds = new BasicAWSCredentials("accessKey", "secretKey");
    AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
            .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
            .withRegion("region_name_here").build();

2 >>>

   S3Object object = s3Client.getObject(new GetObjectRequest("bucketName", "key"));

3 >>>

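   // Assumes the object is UTF-8 text; try-with-resources closes the reader and the underlying S3 stream
   try (BufferedReader reader = new BufferedReader(
           new InputStreamReader(object.getObjectContent(), StandardCharsets.UTF_8))) {
       String s;
       while ((s = reader.readLine()) != null) {
           System.out.println(s);
           // your business logic here
       }
   }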

Thanks.

Answer 5

Here is my solution. I am using Spring Boot 2.4.3.

Create an Amazon S3 client:

AmazonS3 amazonS3Client = AmazonS3ClientBuilder
                .standard()
                .withRegion("your-region")
                .withCredentials(
                        new AWSStaticCredentialsProvider(
                            new BasicAWSCredentials("your-access-key", "your-secret-access-key")))
                .build();

Create a TransferManager client:

TransferManager transferManagerClient = TransferManagerBuilder.standard()
                .withS3Client(amazonS3Client)
                .build();

Create a temporary file at /tmp/{your-s3-key} so that we have somewhere to put the downloaded file.

File file = new File(System.getProperty("java.io.tmpdir"), "your-s3-key");

// Create the parent directories first; a key containing '/' maps to subdirectories
file.getParentFile().mkdirs();

try {
    file.createNewFile(); // Create the temporary file
} catch (IOException e) {
    e.printStackTrace();
}

Then we download the file from S3 using the transfer manager client:

// Starts an asynchronous download of the S3 object into the temporary file that we created
Download download = transferManagerClient.download(
               new GetObjectRequest("your-s3-bucket-name", "your-s3-key"), file); 

// This line blocks the thread until the download is finished
download.waitForCompletion();

Now the S3 file has been transferred into the temporary file that we created. We can get an InputStream for the temporary file:

InputStream input = new DataInputStream(new FileInputStream(file));

Once we have finished reading from the stream and closed it, the temporary file is no longer needed, so we delete it.

file.delete();
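
The create, use and delete steps for the temporary file can also be kept together so that the file is removed even when processing fails. Below is a minimal JDK-only sketch of that pattern, with a ByteArrayInputStream standing in for the downloaded data; TempFileSpool, PathHandler and withTempCopy are hypothetical names, and the TransferManager download itself is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TempFileSpool {
    /** Callback that works on the temporary file while it exists. */
    interface PathHandler<T> {
        T handle(Path file) throws IOException;
    }

    /** Copies the source into a fresh temp file, runs the handler, then always deletes the file. */
    static <T> T withTempCopy(InputStream source, PathHandler<T> handler) throws IOException {
        Path tempFile = Files.createTempFile("s3-download-", ".tmp");
        try {
            // createTempFile already created the file, so the copy must be allowed to replace it.
            Files.copy(source, tempFile, StandardCopyOption.REPLACE_EXISTING);
            return handler.handle(tempFile);
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream fake = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        String text = TempFileSpool.<String>withTempCopy(
                fake, file -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        System.out.println(text);
    }
}
```

Any stream opened on the temporary file should be closed inside the handler, because the file is deleted as soon as the handler returns.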

Answer 6

We can also use software.amazon.awssdk:s3 (the AWS SDK for Java 2.x):

    // Assuming the credentials are read from environment variables, so no hardcoding here

    S3Client client = S3Client.builder()
            .region(regionSelected)
            .build();

    GetObjectRequest getObjectRequest = GetObjectRequest.builder()
            .bucket(bucketName)
            .key(fileName)
            .build();

    // Read the stream exactly once; a second readAllBytes() call would return an empty array
    byte[] content;
    try (ResponseInputStream<GetObjectResponse> responseInputStream = client.getObject(getObjectRequest)) {
        content = responseInputStream.readAllBytes();
    }

    InputStream stream = new ByteArrayInputStream(content);

    System.out.println("Content: " + new String(content, StandardCharsets.UTF_8));
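
One thing to watch for when the same content is needed both as a stream and as a string: an InputStream can only be consumed once. The JDK-only sketch below (Java 9+ for readAllBytes) illustrates this with a ByteArrayInputStream standing in for the S3 response stream; ReadOnceDemo and readTwice are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ReadOnceDemo {
    /** Returns the number of bytes delivered by two consecutive readAllBytes() calls. */
    static int[] readTwice(InputStream in) throws IOException {
        int first = in.readAllBytes().length;  // consumes the entire stream
        int second = in.readAllBytes().length; // nothing is left to read
        return new int[] {first, second};
    }

    public static void main(String[] args) throws IOException {
        InputStream fake = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        int[] sizes = readTwice(fake);
        System.out.println(sizes[0] + " then " + sizes[1]); // prints "5 then 0"
    }
}
```

Reading into a byte[] once and building both the ByteArrayInputStream and the String from that array avoids the problem.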

Tags: java, amazon-s3, amazon-web-services