Akka cluster with one master node, worker nodes and non-cluster client nodes

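So I am building an Akka cluster with 2.6.6. I am setting up one master node, which will act as the seed node, and worker nodes that can dynamically leave or join the cluster. I also have "client" nodes that should communicate with the master node (possibly a router), but not with the worker nodes directly.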

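The problem is that sometimes, if too many workers go away because they were shut down without properly leaving the cluster, the split brain downing provider deselects the master node as leader and therefore downs it as well. Since the "client" nodes are currently part of the cluster too, they are also downed in that case, which should not happen.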

Is there a way to pin the leader to the master node while still auto-downing worker nodes, but without downing the client nodes?

Edit:

Maybe more structured, this is what I want to accomplish:

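Assuming you are using the Split Brain Resolver that is now open source (formerly part of the Lightbend commercial offering), the static-quorum strategy seems a good fit.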

The decision can be based on nodes with a configured role instead of all nodes in the cluster. This can be useful when some types of nodes are more valuable than others. You might, for example, have some nodes responsible for persistent data and some nodes with stateless worker services. Then it probably more important to keep as many persistent data nodes as possible even though it means shutting down more worker nodes.

There is another use of the role as well. By defining a role for a few (e.g. 7) stable nodes in the cluster and using that in the configuration of static-quorum you will be able to dynamically add and remove other nodes without this role and still have good decisions of what nodes to keep running and what nodes to shut down in the case of network partitions. The advantage of this approach compared to keep-majority (described below) is that you do not risk splitting the cluster into two separate clusters, i.e. a split brain. You must still obey the rule of not starting too many nodes with this role as described above. It also suffers the risk of shutting down all nodes if there is a failure when there are not enough nodes with this role remaining in the cluster, as described above.

This can be done with the following in application.conf:

akka.cluster.split-brain-resolver.active-strategy=static-quorum

akka.cluster.split-brain-resolver.static-quorum {
  # one leader node at a time
  quorum-size = 1
  role = "leader"
}

akka.cluster.roles = [ ${AKKA_CLUSTER_ROLE} ]

You would then specify the cluster role for each instance via the AKKA_CLUSTER_ROLE environment variable (set it to leader on your leader node, and to worker or client as appropriate).
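
One caveat with the required substitution `${AKKA_CLUSTER_ROLE}`: HOCON refuses to resolve the config when the variable is unset. Below is a minimal sketch of a fallback, assuming you would rather default to the worker role. The `cluster-role` key is a hypothetical helper name for illustration, not an Akka setting.

```hocon
# Hypothetical helper key (not an Akka setting); defaults to "worker".
cluster-role = "worker"
# Optional substitution: replaces the default only when the variable is set.
cluster-role = ${?AKKA_CLUSTER_ROLE}

akka.cluster.roles = [ ${cluster-role} ]
```

With this in place, only the master has to export `AKKA_CLUSTER_ROLE=leader`, and client nodes would export `AKKA_CLUSTER_ROLE=client`.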

Since the nodes need to agree on the SBR strategy, the best you can do is have the client nodes die if the leader goes away.
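
To make that agreement concrete, here is a sketch of keeping the SBR settings in one shared file that every node includes. The file name `cluster-common.conf` is an assumption for illustration. The `downing-provider-class` line is what enables the resolver that ships with Akka 2.6.6 and later.

```hocon
# cluster-common.conf (hypothetical file name), identical on leader, worker and client nodes
akka.cluster.downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"

akka.cluster.split-brain-resolver {
  active-strategy = static-quorum
  static-quorum {
    # A side of a partition survives only if it still sees
    # at least this many members with the given role.
    quorum-size = 1
    role = "leader"
  }
}
```

Each node's application.conf would then begin with `include "cluster-common.conf"`. Under these settings, the side of a partition that cannot see a member with the `leader` role is downed, which is why clients that are cluster members go down along with the leader.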

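Finally, I'd take this opportunity to point out that having client nodes join the Akka cluster is perhaps a design decision worth revisiting: it strikes me as being on the way to becoming a distributed monolith. I'd seriously consider having the clients interact with the cluster via HTTP or a message queue.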