pg_ctl promote not suspending replication quickly

I have a two-node PostgreSQL cluster. One node is the primary (192.168.50.3) and the other is the standby (192.168.50.4). My recovery.conf on 192.168.50.4 looks like this:

standby_mode          = 'on'
primary_conninfo      = 'host=192.168.50.3 port=5432 user=myuser password=<password_here> sslmode=require sslcompression=0'
trigger_file = '/tmp/make_master'
recovery_target_timeline = 'latest'

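Now I run pg_ctl promote on the standby (192.168.50.4). As soon as the command succeeds, I delete some data on the primary (192.168.50.3), and the deleted data disappears from the standby (192.168.50.4) as well.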

Does pg_ctl promote take some time to actually stop replication? How can I make sure that replication has really stopped?

Logs from /var/log/messages on 192.168.50.4:

May 10 06:17:45 cluster-node6 sudo: myuser : TTY=pts/0 ; PWD=/home/myuser ; USER=postgres ; COMMAND=/usr/pgsql-11/bin/pg_ctl promote --pgdata=/var/lib/pgsql/11/data
May 10 06:17:45 cluster-node6 sudo: pam_unix(sudo:session): session opened for user postgres by csadmin(uid=0)
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  received promote request
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  received promote request
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC FATAL:  terminating walreceiver process due to administrator command
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC FATAL:  terminating walreceiver process due to administrator command
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  redo done at 0/891BFB8
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  redo done at 0/891BFB8
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  last completed transaction was at log time 2019-05-10 06:17:45.550363+00
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  last completed transaction was at log time 2019-05-10 06:17:45.550363+00
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  selected new timeline ID: 2
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  selected new timeline ID: 2
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  archive recovery complete
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  archive recovery complete
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  database system is ready to accept connections
May 10 06:17:45 cluster-node6 postmaster: 2019-05-10 06:17:45 UTC LOG:  database system is ready to accept connections
May 10 06:17:45 cluster-node6 sudo: pam_unix(sudo:session): session closed for user postgres

Promotion is asynchronous. pg_ctl promote sends a signal to the postmaster, which then performs the sequence you see in the log.

So it is normal for replication to continue for a short while after pg_ctl promote has successfully sent the signal.

If you need to make sure that promotion has completed, keep calling the function pg_is_in_recovery() until it returns FALSE.
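
The polling described above could be sketched roughly as follows. This is only an illustration, not part of the original setup: the helper names, the default user and database, and the timeout values are all assumptions, and it presumes that psql is on the PATH and can authenticate without a prompt (for example through ~/.pgpass).

```python
import subprocess
import time


def pg_is_in_recovery(host, port=5432, user="postgres", dbname="postgres"):
    """Ask the server via psql whether it is still in recovery.

    The connection defaults are placeholders (assumptions); adjust them
    to match your cluster.
    """
    result = subprocess.run(
        ["psql", "-h", host, "-p", str(port), "-U", user, "-d", dbname,
         "-At", "-c", "SELECT pg_is_in_recovery()"],
        capture_output=True, text=True, check=True,
    )
    # With -A (unaligned) and -t (tuples only), a boolean prints as "t" or "f".
    return result.stdout.strip() == "t"


def wait_until_promoted(check, timeout=60.0, interval=0.5):
    """Poll check() until it returns False.

    Returns True once the server has left recovery, or False if the
    timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

After running pg_ctl promote, something like wait_until_promoted(lambda: pg_is_in_recovery("192.168.50.4")) would block until the standby reports that it is no longer in recovery. Only after that returns True should changes on the old primary be expected to stay out of the promoted node.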

From PostgreSQL v12 on, you can call my function pg_promote() to promote a standby. By default it waits until promotion has completed.
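
On v12 or later, a call to pg_promote() could be wrapped like this. Again, this is only a sketch: the helper names and connection defaults are made up for illustration, it assumes psql can connect without prompting, and the connecting role must be allowed to execute pg_promote() (by default that means a superuser). The two arguments are the function's wait and wait_seconds parameters, which default to true and 60.

```python
import subprocess


def build_promote_command(host, port=5432, user="postgres", dbname="postgres",
                          wait=True, wait_seconds=60):
    """Build the psql command line that calls pg_promote(wait, wait_seconds).

    Connection defaults are placeholders (assumptions), not values from
    the original setup.
    """
    sql = "SELECT pg_promote({}, {})".format(
        "true" if wait else "false", int(wait_seconds))
    return ["psql", "-h", host, "-p", str(port), "-U", user, "-d", dbname,
            "-At", "-c", sql]


def promote(host, **kwargs):
    """Run pg_promote() on the standby and return its boolean result."""
    result = subprocess.run(build_promote_command(host, **kwargs),
                            capture_output=True, text=True, check=True)
    # pg_promote() returns true when promotion succeeded within wait_seconds.
    return result.stdout.strip() == "t"
```

With wait left at true, promote("192.168.50.4") returns only after the server has finished promotion or the wait time has run out, so no separate polling loop is needed.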