Primary election does not complete after the primary is killed in a MongoDB cluster

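I am trying to test a failover scenario for a MongoDB cluster. When I stop the primary, I do not see any new primary being elected in the logs of my Java code, and read/write operations do not go through. Instead I get the following message: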

No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=REPLICA_SET, connectionMode=MULTIPLE, serverDescriptions=[ServerDescription{address=mongo1:30001, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused (Connection refused)}}, ServerDescription{address=mongo2:30002, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=3215664, setName='rs0', canonicalAddress=mongo2:30002, hosts=[mongo1:30001], passives=[mongo2:30002, mongo3:30003], arbiters=[], primary='null', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Fri Mar 26 02:08:27 CET 2021, lastUpdateTimeNanos=91832460163658}, ServerDescription{address=mongo3:30003, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=3283858, setName='rs0', canonicalAddress=mongo3:30003, hosts=[mongo1:30001], passives=[mongo2:30002, mongo3:30003], arbiters=[], primary='null', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Fri Mar 26 02:08:27 CET 2021, lastUpdateTimeNanos=91832459878686}]}. Waiting for 30000 ms before timing out

I am using the following configuration:

var cfg = {
    "_id": "rs0",
    "protocolVersion": 1,
    "version": 1,
    "members": [
        {
            "_id": 0,
            "host": "mongo1:30001",
            "priority": 4
        },
        {
            "_id": 1,
            "host": "mongo2:30002",
            "priority": 3
        },
        {
            "_id": 2,
            "host": "mongo3:30003",
            "priority": 2,
        }
    ]
};
rs.initiate(cfg, { force: true });
rs.secondaryOk();
db.getMongo().setReadPref('primary');

rs.isMaster() returns this:

{
    "hosts" : [
        "mongo1:30001"
    ],
    "passives" : [
        "mongo2:30002",
        "mongo3:30003"
    ],
    "setName" : "rs0",
    "setVersion" : 1,
    "ismaster" : true,
    "secondary" : false,
    "primary" : "mongo1:30001",
    "me" : "mongo1:30001",
    "electionId" : ObjectId("7fffffff0000000000000017"),
    "lastWrite" : {
        "opTime" : {
            "ts" : Timestamp(1616719738, 1),
            "t" : NumberLong(23)
        },
        "lastWriteDate" : ISODate("2021-03-26T00:48:58Z"),
        "majorityOpTime" : {
            "ts" : Timestamp(1616719738, 1),
            "t" : NumberLong(23)
        },
        "majorityWriteDate" : ISODate("2021-03-26T00:48:58Z")
    },
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 100000,
    "localTime" : ISODate("2021-03-26T00:49:08.019Z"),
    "logicalSessionTimeoutMinutes" : 30,
    "connectionId" : 28,
    "minWireVersion" : 0,
    "maxWireVersion" : 8,
    "readOnly" : false,
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1616719738, 1),
        "signature" : {
            "hash" : BinData(0,"/+QXGSyYY+M/OXbZ1UixjrDOVz4="),
            "keyId" : NumberLong("6942620613131370499")
        }
    },
    "operationTime" : Timestamp(1616719738, 1)
}

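Here I see that the hosts list contains the primary and the passives list contains the secondaries. I do not know under what circumstances all nodes are treated as hosts in a cluster setup, so that passives would be empty. The only related information I found is that a secondary's priority must not be 0, otherwise it will not be considered a candidate in a primary election.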

        "mongo1:30001"
    ],
    "passives" : [
        "mongo2:30002",
        "mongo3:30003"
    ],...

From the docs:

isMaster.passives

An array of strings in the format of "[hostname]:[port]" listing all members of the replica set which have a members[n].priority of 0.

This field only appears if there is at least one member with a members[n].priority of 0.
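
To make the quoted definition concrete, here is a minimal sketch in plain JavaScript of how members would be split between `hosts` and `passives` based on priority alone. The function name `classifyMembers` and the sample config are hypothetical, and the logic is a simplification rather than the server's actual implementation (it ignores hidden members and arbiters).

```javascript
// Hypothetical helper: splits replica set members the way the quoted
// docs describe, using only each member's priority.
// Simplification: hidden members and arbiters are not handled here.
function classifyMembers(config) {
    const hosts = [];
    const passives = [];
    for (const member of config.members) {
        // A missing priority is treated as the default of 1.
        const priority = member.priority === undefined ? 1 : member.priority;
        if (priority === 0) {
            passives.push(member.host);
        } else {
            hosts.push(member.host);
        }
    }
    return { hosts, passives };
}

// Sample config consistent with the isMaster output in the question.
// Assumption: mongo2 and mongo3 ended up with priority 0.
const observed = {
    _id: "rs0",
    members: [
        { _id: 0, host: "mongo1:30001", priority: 4 },
        { _id: 1, host: "mongo2:30002", priority: 0 },
        { _id: 2, host: "mongo3:30003", priority: 0 }
    ]
};

console.log(classifyMembers(observed));
```

Under that assumption the split matches the `hosts` and `passives` arrays reported by rs.isMaster() above, whereas the priorities 4, 3, 2 from the intended config would put all three members in `hosts`.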

These nodes were somehow set to priority 0, so they will never attempt to become primary.
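
One way to confirm this would be to inspect the live configuration returned by `rs.conf()` and list the members whose priority is 0. The sketch below is plain JavaScript; `findNonElectable` and `withPriorities` are hypothetical helper names, and the sample object only stands in for real `rs.conf()` output, which is not shown in the question.

```javascript
// Hypothetical helper: hosts of members that can never become primary
// because their priority is 0.
function findNonElectable(config) {
    return config.members
        .filter((m) => m.priority === 0)
        .map((m) => m.host);
}

// Hypothetical helper: returns a copy of the config with new priorities,
// keyed by host. Members that are not listed keep their current priority.
function withPriorities(config, prioritiesByHost) {
    return {
        ...config,
        members: config.members.map((m) =>
            m.host in prioritiesByHost
                ? { ...m, priority: prioritiesByHost[m.host] }
                : { ...m }
        )
    };
}

// Stand-in for rs.conf() output (assumed values, not taken from the cluster).
const current = {
    _id: "rs0",
    members: [
        { _id: 0, host: "mongo1:30001", priority: 4 },
        { _id: 1, host: "mongo2:30002", priority: 0 },
        { _id: 2, host: "mongo3:30003", priority: 0 }
    ]
};

// Restore the priorities the question intended to set.
const fixed = withPriorities(current, {
    "mongo2:30002": 3,
    "mongo3:30003": 2
});

console.log(findNonElectable(current));
console.log(findNonElectable(fixed));

// In the mongo shell, connected to the primary, the same idea would be:
//   rs.reconfig(withPriorities(rs.conf(), { "mongo2:30002": 3, "mongo3:30003": 2 }));
```

If `findNonElectable(rs.conf())` lists mongo2 and mongo3 on the real cluster, the configuration that was actually applied differs from the `cfg` object in the question, and a reconfig along these lines should make both secondaries eligible for election.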