AWS SDK V2 S3 fetch object is not fetching objects more than 1000

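I am using AWS SDK version 2.16.78, but ListObjectsRequest does not fetch more than 1000 objects.

I did go through the documentation, but could not find how to set the continuation token. I am using the code snippet below: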

 try {
        ListObjectsRequest listObjects = ListObjectsRequest
                .builder()
                .bucket(bucketName)
                .build();

        ListObjectsResponse res = s3.listObjects(listObjects);
        List<S3Object> objects = res.contents();

        for (ListIterator iterVals = objects.listIterator(); iterVals.hasNext(); ) {
            S3Object myValue = (S3Object) iterVals.next();
            System.out.print("\n The name of the key is " + myValue.key());
         }

    } catch (S3Exception e) {
        System.err.println(e.awsErrorDetails().errorMessage());
        System.exit(1);
    }

The above code fetches only 1000 S3 objects.

As you noted, AWS will only return up to 1000 of the objects in a bucket:

Returns some or all (up to 1,000) of the objects in a bucket.

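Amazon S3 lists objects in alphabetical order (more precisely, in UTF-8 binary order of the key names). You can take advantage of this fact and provide a marker, the key after which the next request should start listing, if appropriate: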

try {

  ListObjectsRequest listObjectsRequest = ListObjectsRequest
    .builder()
    .bucket(bucketName)
    .build();

  ListObjectsResponse listObjectsResponse = null;
  String lastKey = null;

  do {
    if (listObjectsResponse != null) {
      // Resume the listing after the last key seen on the previous page
      listObjectsRequest = listObjectsRequest.toBuilder()
        .marker(lastKey)
        .build();
    }

    listObjectsResponse = s3.listObjects(listObjectsRequest);

    List<S3Object> objects = listObjectsResponse.contents();

    // Iterate over results
    for (S3Object myValue : objects) {
      String key = myValue.key();
      System.out.print("\n The name of the key is " + key);
      // Update the value of the last key processed
      lastKey = key;
    }
  } while (listObjectsResponse.isTruncated());
} catch (S3Exception e) {
  System.err.println(e.awsErrorDetails().errorMessage());
  System.exit(1);
}
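To see why feeding the last key back as the marker is enough to walk the whole bucket, here is a minimal stand-alone sketch that mimics the paging logic with an in-memory sorted set. It makes no AWS calls; `listPage`, `listAll`, `sampleBucket` and the page size are hypothetical names chosen for illustration, and Java's `String` ordering is only assumed to match S3's ordering for plain ASCII keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class MarkerPagingDemo {

    // Hypothetical stand-in for one list call: returns at most pageSize keys
    // that sort strictly after the marker (or from the start when marker is null).
    public static List<String> listPage(NavigableSet<String> bucket, String marker, int pageSize) {
        NavigableSet<String> remaining = (marker == null) ? bucket : bucket.tailSet(marker, false);
        List<String> page = new ArrayList<>();
        for (String key : remaining) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    // Collects every key by passing the last key of each page as the next marker.
    public static List<String> listAll(NavigableSet<String> bucket, int pageSize) {
        List<String> all = new ArrayList<>();
        String marker = null;
        List<String> page;
        do {
            page = listPage(bucket, marker, pageSize);
            all.addAll(page);
            if (!page.isEmpty()) {
                marker = page.get(page.size() - 1);
            }
        } while (page.size() == pageSize); // a full page means more keys may remain
        return all;
    }

    // Builds a sorted set of made-up keys such as file-00042.txt.
    public static NavigableSet<String> sampleBucket(int count) {
        NavigableSet<String> bucket = new TreeSet<>();
        for (int i = 0; i < count; i++) {
            bucket.add(String.format("file-%05d.txt", i));
        }
        return bucket;
    }

    public static void main(String[] args) {
        List<String> keys = listAll(sampleBucket(2500), 1000);
        System.out.println("Listed " + keys.size() + " keys");
    }
}
```

Because every page is sorted and the next page starts strictly after the marker, no key is skipped or returned twice. The real service reports whether more keys remain through isTruncated() rather than through the page size used in this sketch.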

Something very similar can be achieved with the startAfter method of ListObjectsV2Request, the request type for version 2 of the list-objects API.
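A minimal sketch of that startAfter variant could look as follows. It assumes the AWS SDK for Java 2.x is on the classpath, that default credentials and region are configured, and that the bucket name arrives as the first command-line argument; the class name `StartAfterListing` is made up for illustration.

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Response;
import software.amazon.awssdk.services.s3.model.S3Exception;
import software.amazon.awssdk.services.s3.model.S3Object;

public class StartAfterListing {

    public static void main(String[] args) {
        String bucketName = args[0]; // assumption: bucket name passed on the command line

        try (S3Client s3 = S3Client.create()) {
            String lastKey = null;
            ListObjectsV2Response response;

            do {
                // startAfter is left unset (null) on the first request,
                // then points at the last key already processed.
                ListObjectsV2Request request = ListObjectsV2Request.builder()
                        .bucket(bucketName)
                        .startAfter(lastKey)
                        .build();

                response = s3.listObjectsV2(request);

                for (S3Object object : response.contents()) {
                    System.out.println("The name of the key is " + object.key());
                    lastKey = object.key();
                }
            } while (Boolean.TRUE.equals(response.isTruncated()));
        } catch (S3Exception e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
    }
}
```

The loop has the same shape as the marker-based one; only the request type and the builder method differ.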

With v2, you can also use ListObjectsV2Response and continuation tokens. Something like:

try {

  ListObjectsV2Request listObjectsRequest = ListObjectsV2Request
    .builder()
    .bucket(bucketName)
    .build();

  ListObjectsV2Response listObjectsResponse = null;
  String nextContinuationToken = null;

  do {
    if (listObjectsResponse != null) {
      // Ask for the page that follows the one just processed
      listObjectsRequest = listObjectsRequest.toBuilder()
        .continuationToken(nextContinuationToken)
        .build();
    }

    listObjectsResponse = s3.listObjectsV2(listObjectsRequest);
    nextContinuationToken = listObjectsResponse.nextContinuationToken();

    List<S3Object> objects = listObjectsResponse.contents();

    // Iterate over results
    for (S3Object myValue : objects) {
      String key = myValue.key();
      System.out.print("\n The name of the key is " + key);
    }
  } while (listObjectsResponse.isTruncated());
} catch (S3Exception e) {
  System.err.println(e.awsErrorDetails().errorMessage());
  System.exit(1);
}

Finally, you can use the listObjectsV2Paginator method to iterate over the results, much like listNextBatchOfObjects was used in v1 of the SDK. See, for instance, this related v1 code and these two related SO questions.

All the mappings between the operations of the v1 and v2 versions of the API are documented here.
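As a rough sketch of the paginator approach, the snippet below lets the SDK request the following pages on demand while the code simply iterates over keys. It assumes the AWS SDK for Java 2.x, default credentials and region, and a bucket name supplied as the first command-line argument; the class name `PaginatorListing` is hypothetical.

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.S3Exception;
import software.amazon.awssdk.services.s3.model.S3Object;
import software.amazon.awssdk.services.s3.paginators.ListObjectsV2Iterable;

public class PaginatorListing {

    public static void main(String[] args) {
        String bucketName = args[0]; // assumption: bucket name passed on the command line

        try (S3Client s3 = S3Client.create()) {
            ListObjectsV2Request request = ListObjectsV2Request.builder()
                    .bucket(bucketName)
                    .build();

            // No request is sent here; pages are fetched lazily during iteration.
            ListObjectsV2Iterable pages = s3.listObjectsV2Paginator(request);

            // contents() flattens all pages into a single stream of S3Object entries.
            for (S3Object object : pages.contents()) {
                System.out.println("The name of the key is " + object.key());
            }
        } catch (S3Exception e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
    }
}
```

With this variant there is no continuation token or isTruncated() check to manage by hand, which removes the most error-prone part of the manual loops.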