Detecting killed nodes with Akka Cluster

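I am working on a project that uses Akka Cluster 2.4.8.

Is there a way to detect crashed nodes (e.g. computer failure, kill -9, etc.) with Akka Cluster?

I currently have a 3-node environment that uses the static-quorum split brain resolution strategy.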

akka.cluster.split-brain-resolver {
    active-strategy = static-quorum
    stable-after = 5s

    static-quorum {
        quorum-size = 2
        role = ""
    }
}
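As a rough illustration of what this configuration is expected to do, here is a simplified model in Python. It is not Akka's implementation, and the function name is hypothetical. Once membership has been stable for stable-after, a side that still sees at least quorum-size members downs the unreachable ones, while a side that sees fewer downs itself. The role filter is ignored here because role is empty.

```python
def static_quorum_decision(reachable: int, quorum_size: int) -> str:
    """Return which side a static-quorum style resolver would down.

    `reachable` counts the nodes this side can still see, itself included.
    Simplified model for illustration only (hypothetical helper).
    """
    if reachable >= quorum_size:
        # This side holds the quorum: it removes the nodes it cannot reach.
        return "down-unreachable"
    # This side is in the minority: it removes itself.
    return "down-reachable"


# Three-node cluster from the question, quorum-size = 2, one node killed.
print(static_quorum_decision(reachable=2, quorum_size=2))  # the two survivors
print(static_quorum_decision(reachable=1, quorum_size=2))  # an isolated node
```

Under this model the two surviving nodes should down the killed one, which is the behaviour the question expects.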

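I expected that when I kill an instance, the remaining cluster members would mark it as DOWN. However, it stays UNREACHABLE (see below). Is there a way to achieve this?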

"clusterStatus": {
    "members": [
        {
            "uniqueAddress": {
                "address": {
                    "protocol": "akka.tcp",
                    "system": "test-actor-system",
                    "host": "test-out-00",
                    "port": 2552
                },
                "uid": 1998600863
            },
            "upNumber": 1,
            "status": "Up",
            "roles": []
        },
        {
            "uniqueAddress": {
                "address": {
                    "protocol": "akka.tcp",
                    "system": "test-actor-system",
                    "host": "test-out-01",
                    "port": 2552
                },
                "uid": 1371217592
            },
            "upNumber": 3,
            "status": "Up",
            "roles": []
        },
        {
            "uniqueAddress": {
                "address": {
                    "protocol": "akka.tcp",
                    "system": "test-actor-system",
                    "host": "test-out-02",
                    "port": 2552
                },
                "uid": -796176254
            },
            "upNumber": 2,
            "status": "Up",
            "roles": []
        }
    ],
    "unreachable": [
        {
            "uniqueAddress": {
                "address": {
                    "protocol": "akka.tcp",
                    "system": "test-actor-system",
                    "host": "test-out-01",
                    "port": 2552
                },
                "uid": 1371217592
            },
            "upNumber": 3,
            "status": "Up",
            "roles": []
        }
    ]
}
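A status document like the one above can also be checked programmatically. The following is a minimal Python sketch, assuming the status is served as a JSON object with a top-level "clusterStatus" key (an assumption, since only a fragment is shown). The function name and the trimmed sample data are hypothetical. It lists members that are reported unreachable while their status is still "Up".

```python
import json


def unreachable_still_up(status_json: str) -> list:
    """Return addresses of members listed as unreachable but still 'Up'."""
    status = json.loads(status_json)["clusterStatus"]
    stuck = []
    for member in status.get("unreachable", []):
        if member["status"] == "Up":
            addr = member["uniqueAddress"]["address"]
            stuck.append(
                f'{addr["protocol"]}://{addr["system"]}@{addr["host"]}:{addr["port"]}'
            )
    return stuck


# Trimmed sample based on the output in the question (members list omitted).
SAMPLE = """
{
  "clusterStatus": {
    "members": [],
    "unreachable": [
      {
        "uniqueAddress": {
          "address": {
            "protocol": "akka.tcp",
            "system": "test-actor-system",
            "host": "test-out-01",
            "port": 2552
          },
          "uid": 1371217592
        },
        "upNumber": 3,
        "status": "Up",
        "roles": []
      }
    ]
  }
}
"""

print(unreachable_still_up(SAMPLE))
# ['akka.tcp://test-actor-system@test-out-01:2552']
```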

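To be honest, I have not used Akka 2.4+. I started my project with Akka 2.3.12 and am still using it. At that time Akka did not provide a split brain resolver, and the only recommendation was to set: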

# put to off in order to not have split brain
auto-down-unreachable-after = off

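This means that, to avoid split brain, you have to remove UNREACHABLE nodes manually. In 2.3.12, the akka-microkernel (later deprecated, see http://doc.akka.io/docs/akka/2.4.1/project/migration-guide-2.3.x-2.4.x.html#Microkernel_is_Deprecated) lets you issue a command that tells the cluster to mark the problematic node as DOWN: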

bin/akka-cluster localhost 9999 down akka.tcp://MySystem@darkstar:2552
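If several nodes have to be downed this way, the command line can be assembled by a small script. This Python sketch only builds the argument list in the same order as the command above. The helper name is hypothetical, and the JMX host and port are the values from the example, not defaults I can vouch for.

```python
def down_command(jmx_host: str, jmx_port: int, node_address: str) -> list:
    """Build the argv for the akka-cluster script's `down` operation."""
    return ["bin/akka-cluster", jmx_host, str(jmx_port), "down", node_address]


cmd = down_command("localhost", 9999, "akka.tcp://MySystem@darkstar:2552")
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run on a machine where the script is installed.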

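So the user has to take some action to handle the split brain and remove the node from the cluster, and this is probably still the case in your version (Akka Cluster 2.4.8).

The Split Brain Resolver is a commercial Akka feature that requires a Lightbend subscription.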

Note

This is a feature of the Typesafe Reactive Platform that is exclusively available for Typesafe Project Success Subscription customers.

To use the Split Brain Resolver feature you must install Typesafe Reactive Platform.

If you are not a Reactive Platform subscriber, your split brain configuration is most likely being ignored.

The full documentation is at http://doc.akka.io/docs/akka/rp-15v09p02/scala/split-brain-resolver.html