Why do we need to call setReadLimit(int) in the AWS S3 Java client?

I am using the AWS S3 library for Java.

Here is my code, which uploads a file to S3 through the high-level API (TransferManager).

        ClientConfiguration configuration = new ClientConfiguration();
        configuration.setUseGzip(true);
        configuration.setConnectionTTL(1000 * 60 * 60);
        AmazonS3Client amazonS3Client = new AmazonS3Client(configuration);
        TransferManager transferManager = new TransferManager(amazonS3Client);

        ObjectMetadata objectMetadata = new ObjectMetadata();
        objectMetadata.setContentLength(message.getBodyLength());
        objectMetadata.setContentType("image/jpg");

        transferManager.getConfiguration().setMultipartUploadThreshold(1024 * 10);

        PutObjectRequest request = new PutObjectRequest("test", "/image/test", inputStream, objectMetadata);
        request.getRequestClientOptions().setReadLimit(1024 * 10);
        request.setSdkClientExecutionTimeout(1000 * 60 * 60);

        Upload upload = transferManager.upload(request);
        upload.waitForCompletion();

I am trying to upload a large file. It usually works, but sometimes I get the error below. I have set readLimit to 1024 * 10.

2019-04-05 06:41:05,679 ERROR [com.demo.AwsS3TransferThread] (Aws-S3-upload) Error in saving File[media/image/osc/54/54ec3f2f-a938-473c-94b7-a55f39aac4a6.png] on S3[demo-test]: com.amazonaws.ResetException: Failed to reset the request input stream;  If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.resetRequestInputStream(AmazonHttpClient.java:1221)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1042)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:948)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:661)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:635)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:618)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access0(AmazonHttpClient.java:586)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:573)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:445)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4041)
    at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3041)
    at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3026)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadPartsInSeries(UploadCallable.java:255)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInParts(UploadCallable.java:189)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:121)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)

What does readLimit do, and how is it useful? What should I do to avoid this exception?

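After a week of research, I found that if the file you upload is smaller than 48 GB, you can set readLimit to 5.01 MB.

The reason is that AWS splits the file into multiple parts, and each part is 5 MB (unless you have changed the minimum part size). Per the AWS specification, only the last part may be smaller than 5 MB. If a part fails and has to be retried, the stream must be reset to the start of that part, so readLimit has to cover at least one full part. I set readLimit to about 5 MB, and that solved the problem.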
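The 48 GB and 5 MB figures appear to come from the S3 multipart limits: an upload can have at most 10,000 parts, so with 5 MB parts the largest file is roughly 48.8 GB. The sketch below works through that arithmetic. It assumes that TransferManager chooses the part size as the larger of the configured minimum and the content length divided by 10,000 (rounded up), which is my understanding of SDK v1 and not something confirmed above. The class and method names are hypothetical, and the extra byte of headroom is my own choice.

```java
final class ReadLimitEstimate {
    static final long MB = 1024L * 1024L;
    static final long MAX_PARTS = 10_000L;      // S3 limit on parts per multipart upload
    static final long MIN_PART_SIZE = 5 * MB;   // assumed default minimum part size

    // Assumed part-size rule: max(minimum part size, ceil(contentLength / MAX_PARTS)).
    static long partSize(long contentLength, long minPartSize) {
        long evenSplit = (contentLength + MAX_PARTS - 1) / MAX_PARTS; // ceiling division
        return Math.max(minPartSize, evenSplit);
    }

    // readLimit must cover one whole part; one extra byte is added as headroom.
    // Math.toIntExact throws ArithmeticException if the value does not fit in an int.
    static int readLimitFor(long contentLength, long minPartSize) {
        return Math.toIntExact(partSize(contentLength, minPartSize) + 1);
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * MB;
        // Largest file that still fits in 10,000 parts of 5 MB each.
        System.out.println("max bytes at 5 MB parts: " + MIN_PART_SIZE * MAX_PARTS);
        System.out.println("readLimit for 1 GiB:   " + readLimitFor(oneGiB, MIN_PART_SIZE));
        System.out.println("readLimit for 100 GiB: " + readLimitFor(100 * oneGiB, MIN_PART_SIZE));
    }
}
```

Under these assumptions the value would be passed as request.getRequestClientOptions().setReadLimit(readLimitFor(length, minPartSize)). It also suggests why 1024 * 10 was too small in the question: a 10 KB limit cannot cover a 5 MB part, so any retry of a part fails to reset the stream.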

Purpose of the InputStream readlimit (from the Javadoc of InputStream.mark(int readlimit)):

Marks the current position in this input stream. A subsequent call to the reset method repositions this stream at the last marked position so that subsequent reads re-read the same bytes. The readlimit argument tells this input stream to allow that many bytes to be read before the mark position gets invalidated. The general contract of mark is that, if the method markSupported returns true, the stream somehow remembers all the bytes read after the call to mark and stands ready to supply those same bytes again if and whenever the method reset is called. However, the stream is not required to remember any data at all if more than readlimit bytes are read from the stream before reset is called.
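To make the mark/reset contract concrete, here is a minimal sketch using only the JDK. It is not the SDK's code; it only illustrates the kind of failure behind "Failed to reset the request input stream". The class name and the deliberately tiny 4-byte buffer are my own choices: with the default 8 KB buffer a BufferedInputStream often still remembers the data, so a reset past the readlimit may happen to succeed.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class MarkResetDemo {

    // Marks the stream with the given readlimit, reads bytesToRead bytes,
    // then reports whether reset() back to the mark still works.
    static boolean canResetAfter(int readLimit, int bytesToRead) {
        // A 4-byte internal buffer makes the readlimit the deciding factor.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(new byte[64]), 4);
        try {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        try {
            in.reset();   // succeeds only while the mark is still valid
            return true;
        } catch (IOException e) {
            return false; // the mark was invalidated
        }
    }

    public static void main(String[] args) {
        System.out.println("readlimit 16, read 10 bytes: " + canResetAfter(16, 10));
        System.out.println("readlimit 4,  read 10 bytes: " + canResetAfter(4, 10));
    }
}
```

The first call stays within the readlimit, so the stream can rewind. The second reads past it, and reset() throws. A retried part upload is in the same position when more bytes were already sent than the readLimit allows.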