How to load data into a DynamoDB table with Global and Local Secondary Indexes?

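I created some tables in DynamoDB using the AWS console and defined some global and local secondary indexes on them.

Now the question is how to load data into the table using the AWS SDK for Java. I went through the DynamoDB Developer Guide (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html), but I could not find how to load data when the table has a global secondary index.

The code to load data into a table is: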

  Table table = dynamoDB.getTable(tableName);
  Item item = new Item().withPrimaryKey("Name", "Amazon DynamoDB")
                        .withString("Category", "Amazon Web Services")
                        .withNumber("Threads", 2)
                        .withNumber("Messages", 4)
                        .withNumber("Views", 1000);
  table.putItem(item);

Now suppose that in my table I have defined Views as a Global Secondary Index. Will the same code work, or is there a different way to handle this use case?

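You cannot write to GSIs directly, even though they are defined on the table.

Instead, DynamoDB automatically propagates item inserts, updates, and deletes to the GSIs, according to the ProjectionType set on each GSI. As long as you defined the GSIs in the CreateTable operation, or added them in an UpdateTable operation, the GSIs will reflect the items you put into the base table, according to the ProjectionType.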

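If your table has secondary indexes (local or global), they are created and maintained automatically when you insert data into the table itself. DynamoDB manages the indexes for you, keeping them in sync with whatever is in your table. So the putItem code in the question works as is.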
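To make this concrete, here is a minimal sketch using the same Document API as the question (AWS SDK for Java 1.x on the classpath). The table name "Forum" and the index name "ViewsIndex" are assumptions for illustration, as is the idea that Views is the partition key of that GSI. The write goes to the base table only; the index is used purely for reading.

```java
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Index;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec;
import com.amazonaws.services.dynamodbv2.document.utils.NameMap;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;

public class GsiLoadExample {

    // Same item as in the question; nothing GSI-specific is needed here.
    static Item buildItem() {
        return new Item().withPrimaryKey("Name", "Amazon DynamoDB")
                .withString("Category", "Amazon Web Services")
                .withNumber("Threads", 2)
                .withNumber("Messages", 4)
                .withNumber("Views", 1000);
    }

    // Key condition on the (assumed) index partition key. "#v" is a
    // placeholder for the attribute name, in case "Views" clashes with
    // a DynamoDB reserved word.
    static QuerySpec buildViewsQuery(int views) {
        return new QuerySpec()
                .withKeyConditionExpression("#v = :v")
                .withNameMap(new NameMap().with("#v", "Views"))
                .withValueMap(new ValueMap().withNumber(":v", views));
    }

    public static void main(String[] args) {
        DynamoDB dynamoDB = new DynamoDB(AmazonDynamoDBClientBuilder.standard().build());

        Table table = dynamoDB.getTable("Forum");   // hypothetical table name
        table.putItem(buildItem());                 // write to the base table only

        Index index = table.getIndex("ViewsIndex"); // hypothetical GSI name
        for (Item found : index.query(buildViewsQuery(1000))) {
            // Only attributes projected into the index are returned.
            System.out.println(found.toJSONPretty());
        }
    }
}
```

Keep in mind that reads from a GSI are eventually consistent, so a query issued immediately after the put may not see the new item yet.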

However, a separate matter that you should be aware of is how to load data into DynamoDB tables properly:

There are times when you load data from other data sources into DynamoDB. Typically, DynamoDB partitions your table data on multiple servers. When uploading data to a table, you get better performance if you upload data to all the allocated servers simultaneously. For example, suppose you want to upload user messages to a DynamoDB table. You might design a table that uses a hash and range type primary key in which UserID is the hash attribute and the MessageID is the range attribute.

For more details, see Distribute Write Activity During Data Upload.
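As I read that guide, the practical advice is to avoid uploading all items for one hash key before moving on to the next, and to interleave hash keys instead so that writes spread across partitions. A minimal sketch of such an ordering in plain Java follows; the Message class and the interleaveByUser helper are hypothetical names made up for this illustration, and the actual putItem calls are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InterleavedUpload {

    // Hypothetical record: UserID is the hash attribute, MessageID the range attribute.
    public static final class Message {
        final String userId;
        final int messageId;

        public Message(String userId, int messageId) {
            this.userId = userId;
            this.messageId = messageId;
        }

        @Override
        public String toString() {
            return userId + "#" + messageId;
        }
    }

    // Reorders messages round-robin by user: one message per user per pass,
    // so consecutive writes target different hash key values.
    public static List<Message> interleaveByUser(List<Message> messages) {
        Map<String, Deque<Message>> byUser = new LinkedHashMap<>();
        for (Message m : messages) {
            byUser.computeIfAbsent(m.userId, k -> new ArrayDeque<>()).add(m);
        }

        List<Message> ordered = new ArrayList<>(messages.size());
        boolean tookAny = true;
        while (tookAny) {
            tookAny = false;
            for (Deque<Message> queue : byUser.values()) {
                Message next = queue.poll();
                if (next != null) {
                    ordered.add(next);
                    tookAny = true;
                }
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Message> sortedByUser = Arrays.asList(
                new Message("U1", 1), new Message("U1", 2), new Message("U1", 3),
                new Message("U2", 1), new Message("U2", 2),
                new Message("U3", 1));
        // prints [U1#1, U2#1, U3#1, U1#2, U2#2, U1#3]
        System.out.println(interleaveByUser(sortedByUser));
    }
}
```

You would then call table.putItem for each message in the returned order, or hand the list to several writer threads.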

Here is some additional information about secondary indexes to complement what you already know:

For efficient access to data in a table, Amazon DynamoDB creates and maintains indexes for the primary key attributes. This allows applications to quickly retrieve data by specifying primary key values. However, many applications might benefit from having one or more secondary (or alternate) keys available, to allow efficient access to data with attributes other than the primary key. To address this, you can create one or more secondary indexes on a table, and issue Query or Scan requests against these indexes.

A secondary index is a data structure that contains a subset of attributes from a table, along with an alternate key to support Query operations. With a secondary index, queries are no longer restricted to the table primary key; you can also retrieve the data using the alternate key defined by the secondary index. A table can have multiple secondary indexes, which gives your applications access to many different query patterns.

The data in a secondary index consists of attributes that are projected, or copied, from the table into the index. When you create a secondary index, you define the alternate key for the index, along with any other attributes that you want to be projected in the index. DynamoDB copies these attributes into the index, along with the primary key attributes from the table. You can then query or scan the index just as you would query or scan a table.

The section Improving Data Access with Secondary Indexes in DynamoDB should give useful details on how to define GSIs properly (including concepts such as projection).
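To illustrate what a projection looks like in code, here is a rough sketch that builds a GSI definition with the model classes of the AWS SDK for Java 1.x. The index name "ViewsIndex", the choice of Views as the index hash key, the INCLUDE projection of Category, and the throughput values are all assumptions picked for this example, not something taken from the question's actual table.

```java
import com.amazonaws.services.dynamodbv2.model.GlobalSecondaryIndex;
import com.amazonaws.services.dynamodbv2.model.KeySchemaElement;
import com.amazonaws.services.dynamodbv2.model.KeyType;
import com.amazonaws.services.dynamodbv2.model.Projection;
import com.amazonaws.services.dynamodbv2.model.ProjectionType;
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;

public class ViewsIndexDefinition {

    // Builds the definition only; no request is sent to DynamoDB here.
    static GlobalSecondaryIndex viewsIndex() {
        return new GlobalSecondaryIndex()
                .withIndexName("ViewsIndex")                       // hypothetical name
                .withKeySchema(new KeySchemaElement("Views", KeyType.HASH))
                .withProjection(new Projection()
                        // INCLUDE copies the listed non-key attributes into the
                        // index, in addition to the table and index key attributes.
                        .withProjectionType(ProjectionType.INCLUDE)
                        .withNonKeyAttributes("Category"))
                .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L)); // example values
    }

    public static void main(String[] args) {
        System.out.println(viewsIndex());
    }
}
```

A definition like this would be passed to CreateTableRequest.withGlobalSecondaryIndexes(...), together with an AttributeDefinition declaring Views as a number. With this projection, a query on the index returns Name, Views, and Category, but not Threads or Messages.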