How is etcd a highly available system, even though it uses Raft, which is a CP algorithm?

This is from the Kubernetes documentation:

Consistent and highly-available key value store used as Kubernetes' backing store for all cluster data.

Does Kubernetes have a separate internal mechanism that makes etcd more available? Or does etcd use a modified version of Raft that gives it this superpower?

When it comes to the details of etcd, it is best to rely on the official etcd documentation:

etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. It gracefully handles leader elections during network partitions and can tolerate machine failure, even in the leader node.

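Note that this description does not claim high availability. As for fault tolerance, the etcd documentation has a very good paragraph on the topic: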

An etcd cluster operates so long as a member quorum can be established. If quorum is lost through transient network failures (e.g., partitions), etcd automatically and safely resumes once the network recovers and restores quorum; Raft enforces cluster consistency. For power loss, etcd persists the Raft log to disk; etcd replays the log to the point of failure and resumes cluster participation. For permanent hardware failure, the node may be removed from the cluster through runtime reconfiguration.

It is recommended to have an odd number of members in a cluster. An odd-size cluster tolerates the same number of failures as an even-size cluster but with fewer nodes.
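To make the quorum arithmetic in the quote concrete, here is a minimal sketch. It assumes the usual Raft majority rule (quorum is more than half of the members); the function names `quorum`, `fault_tolerance`, and `can_make_progress` are my own for illustration and are not part of any etcd API.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a quorum can still be formed."""
    return members - quorum(members)


def can_make_progress(members: int, reachable: int) -> bool:
    """True if a group of `reachable` members can still form a quorum.

    This models one side of a network partition: the side without a
    majority stops accepting writes, which is the "C over A" trade-off.
    """
    return reachable >= quorum(members)


if __name__ == "__main__":
    print("members  quorum  tolerated failures")
    for n in range(1, 8):
        print(f"{n:>7}  {quorum(n):>6}  {fault_tolerance(n):>18}")
```

With these definitions, a 3-member and a 4-member cluster both tolerate one failure, which is why the odd size is recommended. In a 5-member cluster split 3/2 by a partition, only the 3-member side can keep committing writes; the 2-member side becomes unavailable for writes until the partition heals.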

You can also find a very good article on understanding etcd:

Etcd is a strongly consistent system. It provides Linearizable reads and writes, and Serializable isolation for transactions. Expressed more specifically, in terms of the PACELC theorem, an extension of the ideas expressed in the CAP theorem, it is a CP/EC system. It optimizes for consistency over latency in normal situations and consistency over availability in the case of a partition.

Also take a look at this diagram: