Amazon Dynamodb:使用 Merkle 树的反熵副本同步

Amazon Dynamo: Anti-Entropy Replica Synchronisation using Merkel Trees

上下文: 我正在尝试实施 Amazon Dynamo 的副本同步,它使用默克尔树来检测副本之间的分歧。

如论文网络版所述here

Dynamo uses Merkle trees for anti-entropy as follows: Each node maintains a separate Merkle tree for each key range (the set of keys covered by a virtual node) it hosts. This allows nodes to compare whether the keys within a key range are up-to-date. In this scheme, two nodes exchange the root of the Merkle tree corresponding to the key ranges that they host in common. Subsequently, using the tree traversal scheme described above the nodes determine if they have any differences and perform the appropriate synchronization action.

我不明白“适当的同步操作”是什么意思。 通过使用适当的遍历,假设我们发现两个默克尔树的根节点之间存在差异——我们如何知道哪个是正确的?我们是否总是根据其逻辑时钟时间戳取较更新的?

您链接的 Dynamo 论文(请注意,这 与今天的 DynamoDB 产品不同,因此我删除了您在问题中添加的 dynamodb 标签),解释了如何两个版本之间的协调已经完成,使用矢量时钟(这不像 "time stamp" 那么简单):

Dynamo uses vector clocks in order to capture causality between different versions of the same object. A vector clock is effectively a list of (node, counter) pairs. One vector clock is associated with every version of every object. One can determine whether two versions of an object are on parallel branches or have a causal ordering, by examine their vector clocks. If the counters on the first object’s clock are less-than-or-equal to all of the nodes in the second clock, then the first is an ancestor of the second and can be forgotten. Otherwise, the two changes are considered to be in conflict and require reconciliation.

换句话说,通常Dynamo可以根据vector clocks判断出哪个值是新的(论文中详细解释了这是什么意思),但也有可能存在冲突——想象一下相同的item 由两个不同且不通信的副本同时且不同地读取和修改。在这种情况下,两个 版本的数据都将保存在数据库中并返回给客户端,客户端需要使用一些特定于客户端的逻辑来协调它们。例如,在亚马逊的购物车示例中,如果一次修改是将产品 A 添加到购物车,而第二次修改是将产品 B 添加到购物车,则对帐将是将产品 A 和 B 添加到购物车。