Mongos instance can't communicate with the database

I have a sharded cluster with 2 config servers, 2 shards with 2 replicas each, and 2 mongos instances, all running on separate VMs.

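After configuring all of this, I finally tried to interact with the empty database by running a simple show dbs query from a mongos instance, but after about 1 minute it threw the following error: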

uncaught exception: Error: listDatabases failed:{
        "ok" : 0,
        "errmsg" : "Could not find host matching read preference { mode: \"primary\" } for set rep",
        "code" : 133,
        "codeName" : "FailedToSatisfyReadPreference",
        "operationTime" : Timestamp(1648722327, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1648722327, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

Everything seems to be configured correctly. When I run sh.status() from a mongos instance, it recognizes the shards and replicas like this:

sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("62421dd6b5f9640f309faca0")
  }
  shards:
        {  "_id" : "rep",  "host" : "rep/192.168.86.136:26000,192.168.86.141:26001",  "state" : 1 }
        {  "_id" : "repb",  "host" : "repb/192.168.86.142:26002,192.168.86.143:26003",  "state" : 1 }
  active mongoses:
        "4.4.8" : 2
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  5
        Last reported error:  Empty host component parsing HostAndPort from ""
        Time of Reported error:  Thu Mar 31 2022 11:06:39 GMT+0100 (WEST)
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                rep     919
                                repb    105
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "testdb",  "primary" : "rep",  "partitioned" : false,  "version" : {  "uuid" : UUID("2e584dcd-25ea-4ba4-805c-b40928e26511"),  "lastMod" : 1 } }

It may be a firewall issue.

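Every node in the cluster (config servers, shard members, and mongos instances) must be able to reach every other node on its configured port.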
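One way to test reachability is a plain TCP connection attempt to each host and port. The following is a minimal Node.js sketch, not part of the original answer: the helper names parseShardHost, checkPort, and checkShards are hypothetical, and the input is assumed to be the "host" strings printed by sh.status(). A successful connection only shows that the port is open from the machine where the script runs, so it would have to be run on each VM.

```javascript
// Node.js sketch (assumption: Node is installed on the VM being tested).
const net = require("net");

// Split a shard host string such as "rep/hostA:26000,hostB:26001"
// into the replica set name and its host/port pairs.
function parseShardHost(shardHost) {
  const slash = shardHost.indexOf("/");
  if (slash < 0) throw new Error(`Missing replica set name in "${shardHost}"`);
  const setName = shardHost.slice(0, slash);
  const members = shardHost.slice(slash + 1).split(",").map((entry) => {
    const colon = entry.lastIndexOf(":");
    if (colon < 0) throw new Error(`Missing port in "${entry}"`);
    const host = entry.slice(0, colon);
    const port = Number(entry.slice(colon + 1));
    if (!host) throw new Error(`Empty host component in "${entry}"`);
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`Invalid port in "${entry}"`);
    }
    return { host, port };
  });
  return { setName, members };
}

// Resolve to true if a TCP connection opens within timeoutMs, otherwise false.
function checkPort(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (reachable) => {
      socket.destroy();
      resolve(reachable);
    };
    socket.setTimeout(timeoutMs);
    socket.on("connect", () => finish(true));
    socket.on("timeout", () => finish(false));
    socket.on("error", () => finish(false));
  });
}

// Check every member of every shard, one connection at a time.
async function checkShards(shardHosts, timeoutMs = 3000) {
  const results = [];
  for (const shardHost of shardHosts) {
    const { setName, members } = parseShardHost(shardHost);
    for (const { host, port } of members) {
      const reachable = await checkPort(host, port, timeoutMs);
      results.push({ setName, host, port, reachable });
    }
  }
  return results;
}
```

With the shard strings from the question, a call could look like checkShards(["rep/192.168.86.136:26000,192.168.86.141:26001", "repb/192.168.86.142:26002,192.168.86.143:26003"]).then(console.table). Any row with reachable set to false points at a firewall rule, a wrong bind address, or a process that is not running.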

Try this script to check every member of each replica set:

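// Password of the user that is currently authenticated in this shell session
const MONGO_PASSWORD = '*******'
const AUTH_SOURCE = 'admin'

// Name of the currently authenticated user (run this from an authenticated mongos connection)
const user = db.runCommand({ connectionStatus: 1 }).authInfo.authenticatedUsers.shift().user;
// Shard map as reported by mongos: replica set name -> "setName/host:port,..."
const map = db.adminCommand("getShardMap").map;

for (let rs of Object.keys(map)) {
   let uri = map[rs].split("/");
   let connectionString = `mongodb://${user}:${MONGO_PASSWORD}@${uri[1]}/admin?replicaSet=${uri[0]}&authSource=${AUTH_SOURCE}`;
   // Connect directly to the replica set, bypassing mongos
   let replicaSet = new Mongo(connectionString).getDB("admin");
   for (let member of replicaSet.adminCommand({ replSetGetStatus: 1 }).members) {
      // Skip members not listed in hello().hosts (hidden, passive, and arbiter members)
      if (!replicaSet.hello().hosts.includes(member.name)) continue;
      printjsononeline({ replicaSet: rs, host: member.name, stateStr: member.stateStr, health: member.health });

      // Report any member that is down or in a state other than PRIMARY/SECONDARY
      if (member.health != 1 || !["PRIMARY", "SECONDARY"].includes(member.stateStr))
         print(`Member state of ${member.name} is '${member.stateStr}'`);
   }
}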
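The FailedToSatisfyReadPreference error in the question says that mongos could not find a PRIMARY for set rep, so the key thing to look for in the script's output is whether each replica set has exactly one healthy PRIMARY. As a rough sketch, that check can be written as a pure function over the members array returned by replSetGetStatus. The function name summarizeReplicaSet is hypothetical and not part of the original answer; it only relies on the name, health, and stateStr fields already used above.

```javascript
// Summarize the "members" array returned by replSetGetStatus.
function summarizeReplicaSet(members) {
  // A usable primary is a member that is up and reports the PRIMARY state.
  const primaries = members.filter(
    (m) => m.health === 1 && m.stateStr === "PRIMARY"
  );
  // Anything that is down or in another state is reported as a problem.
  const problems = members.filter(
    (m) => m.health !== 1 || !["PRIMARY", "SECONDARY"].includes(m.stateStr)
  );
  return {
    hasPrimary: primaries.length === 1,
    primary: primaries.length === 1 ? primaries[0].name : null,
    problems: problems.map(
      (m) => `${m.name} is ${m.stateStr} (health ${m.health})`
    ),
  };
}
```

When connected directly to a shard member in the shell, it could be called as printjson(summarizeReplicaSet(db.adminCommand({ replSetGetStatus: 1 }).members)). Note that a replica set with only two voting members needs both of them up to elect a primary, so a single unreachable member is enough to produce hasPrimary: false.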

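It turned out I had misconfigured the replica sets, so all I had to do was recreate the volumes of all the VMs and reconfigure everything from scratch. Now it works correctly.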