Redis Sentinel cannot fail over to the slave

I want to deploy a simple master-slave Redis setup with two servers, 192.168.0.101 and 192.168.0.103, where 101 is the master.

This is the sentinel.conf on the 103 server:

port 26379

bind 192.168.0.103 127.0.0.1

sentinel myid 49f552d5540fdcb8aa60be25208c56b689d3c0b0
sentinel monitor mymaster 192.168.0.101 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 900000

sentinel auth-pass mymaster arsenal

sentinel config-epoch mymaster 0

# Generated by CONFIG REWRITE
dir "/etc/redis"
sentinel leader-epoch mymaster 3
sentinel known-slave mymaster 192.168.0.103 6379

sentinel current-epoch 3

My redis.conf on the 103 server:

bind 127.0.0.1 ::1
protected-mode yes
port 6379
tcp-backlog 511
timeout 0
daemonize yes
supervised no

dbfilename dump.rdb
dir /var/lib/redis
slaveof device1 6379
masterauth arsenal
slave-serve-stale-data yes
slave-read-only yes

slave-priority 100
requirepass arsenal

slave-lazy-flush no
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no

activerehashing yes

aof-rewrite-incremental-fsync yes

I start Sentinel on 192.168.0.103 with redis-server sentinel.conf --sentinel:

7951:X 14 Mar 14:19:48.479 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
7951:X 14 Mar 14:19:48.479 # Sentinel ID is 49f552d5540fdcb8aa60be25208c56b689d3c0b0
7951:X 14 Mar 14:19:48.479 # +monitor master mymaster 192.168.0.101 6379 quorum 2

7951:X 14 Mar 14:20:48.480 # +sdown slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.101 6379
7951:X 14 Mar 14:21:11.577 # +sdown master mymaster 192.168.0.101 6379

My Sentinel call looks like this:

sentinel = Sentinel([('device3', 26379)], password='arsenal')

sentinel.discover_master('mymaster')

MasterNotFoundError: No master found for 'mymaster'

The problem is that after I stop the redis-server service on 101, Sentinel does not promote the 103 server to master.

Does anyone know why? Thanks.

  1. In your configuration, sentinel monitor mymaster 192.168.0.101 6379 2 sets the quorum to 2, which means a failover can start only when two or more Sentinels agree that the master is down.

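  2. See the Redis Sentinel documentation: a robust deployment needs at least three Sentinels. With only one Sentinel and a quorum of 2, it can neither reach the quorum nor win enough votes to be elected leader and start the failover.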
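
To make the two points above concrete, here is a minimal Python sketch of the vote arithmetic as I understand it from the Sentinel documentation. The function names (majority, can_fail_over) and the simplified model are my own assumptions for illustration and are not part of Redis or redis-py. It treats every "agreeing" Sentinel as one that is up, sees the master as down, and can vote.

```python
def majority(num_sentinels: int) -> int:
    """Smallest number of votes that is more than half of all known Sentinels."""
    return num_sentinels // 2 + 1


def can_fail_over(num_sentinels: int, agreeing: int, quorum: int) -> bool:
    """Simplified model (an assumption, not Redis code).

    num_sentinels: Sentinels configured to monitor the master
    agreeing:      Sentinels that are up and consider the master down
    quorum:        last argument of `sentinel monitor`
    """
    # Step 1: the master is flagged objectively down only if at least
    # `quorum` Sentinels agree.
    objectively_down = agreeing >= quorum
    # Step 2: one Sentinel must be elected leader, which needs a majority
    # of all known Sentinels and never fewer votes than the quorum.
    votes_needed = max(quorum, majority(num_sentinels))
    leader_elected = agreeing >= votes_needed
    return objectively_down and leader_elected


# The setup from the question: one Sentinel, quorum 2.
print(can_fail_over(num_sentinels=1, agreeing=1, quorum=2))
# Three Sentinels, quorum 2, one Sentinel lost together with the master.
print(can_fail_over(num_sentinels=3, agreeing=2, quorum=2))
```

Under this model the first call prints False and the second prints True, which is why adding Sentinels on at least three separate hosts (rather than lowering the quorum alone) is the usual recommendation.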