How to prevent PostgreSQL from removing WAL files when it starts after a hard shutdown?

# select version();
> PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 8.3.0-3ubuntu1) 8.3.0, 64-bit

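My archive_command has been broken for a while. As expected, pg kept the WAL files because they had not been archived. Then I killed the pg process and started it again. I noticed that all the WAL files that were ready to be archived had been removed. There are no WAL files in the Google bucket either.

pg log after the restart: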

2020-04-22 14:27:23.702 UTC [7] LOG:  database system was interrupted; last known up at 2020-04-22 14:27:08 UTC
2020-04-22 14:27:24.819 UTC [7] LOG:  database system was not properly shut down; automatic recovery in progress
2020-04-22 14:27:24.848 UTC [7] LOG:  redo starts at 4D/BCEF6BA8
2020-04-22 14:27:24.848 UTC [7] LOG:  invalid record length at 4D/BCEFF0C0: wanted 24, got 0
2020-04-22 14:27:24.848 UTC [7] LOG:  redo done at 4D/BCEFF050
2020-04-22 14:27:25.286 UTC [1] LOG:  database system is ready to accept connections

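I repeated the scenario with the configuration parameter log_min_messages=DEBUG5 and saw that pg removes the old WAL files, ignoring the fact that they are still waiting to be archived.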

2020-04-23 14:55:42.819 UTC [6] LOG:  redo starts at 0/22000098
2020-04-23 14:55:50.138 UTC [6] LOG:  redo done at 0/22193FB0
2020-04-23 14:55:50.138 UTC [6] DEBUG:  resetting unlogged relations: cleanup 0 init 1
2020-04-23 14:55:50.266 UTC [6] DEBUG:  performing replication slot checkpoint
2020-04-23 14:55:50.336 UTC [6] DEBUG:  attempting to remove WAL segments older than log file 000000000000000000000021
2020-04-23 14:55:50.349 UTC [6] DEBUG:  recycled write-ahead log file "000000010000000000000015"
2020-04-23 14:55:50.365 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000012"
2020-04-23 14:55:50.372 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000001B"
2020-04-23 14:55:50.382 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000001E"
2020-04-23 14:55:50.390 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000013"
2020-04-23 14:55:50.402 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000014"
2020-04-23 14:55:50.412 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000001D"
2020-04-23 14:55:50.424 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000001C"
2020-04-23 14:55:50.433 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000000F"
2020-04-23 14:55:50.442 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000001F"
2020-04-23 14:55:50.455 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000001A"
2020-04-23 14:55:50.471 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000020"
2020-04-23 14:55:50.480 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000018"
2020-04-23 14:55:50.489 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000011"
2020-04-23 14:55:50.502 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000016"
2020-04-23 14:55:50.518 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000017"
2020-04-23 14:55:50.529 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000010"
2020-04-23 14:55:50.536 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000019"
2020-04-23 14:55:50.547 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000021"
2020-04-23 14:55:50.559 UTC [6] DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 1
2020-04-23 14:55:50.559 UTC [6] DEBUG:  MultiXact member stop limit is now 4294914944 based on MultiXact 1
2020-04-23 14:55:50.566 UTC [6] DEBUG:  shmem_exit(0): 1 before_shmem_exit callbacks to make
2020-04-23 14:55:50.566 UTC [6] DEBUG:  shmem_exit(0): 4 on_shmem_exit callbacks to make
2020-04-23 14:55:50.566 UTC [6] DEBUG:  proc_exit(0): 2 callbacks to make
2020-04-23 14:55:50.566 UTC [6] DEBUG:  exit(0)
2020-04-23 14:55:50.566 UTC [6] DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
2020-04-23 14:55:50.566 UTC [6] DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
2020-04-23 14:55:50.566 UTC [6] DEBUG:  proc_exit(-1): 0 callbacks to make
2020-04-23 14:55:50.571 UTC [1] DEBUG:  reaping dead processes
2020-04-23 14:55:50.572 UTC [10] DEBUG:  autovacuum launcher started
2020-04-23 14:55:50.572 UTC [1] DEBUG:  starting background worker process "logical replication launcher"
2020-04-23 14:55:50.572 UTC [10] DEBUG:  InitPostgres
2020-04-23 14:55:50.572 UTC [10] DEBUG:  my backend ID is 1
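
To compare what was lost against what actually reached the archive, the segment names can be pulled out of a DEBUG log like the one above. This is only a minimal sketch: the function name `dropped_segments` and the embedded sample are illustrative, and the pattern assumes the message wording shown in this log (PostgreSQL 11).

```python
import re

# Matches the DEBUG lines written when a segment is removed or recycled.
# Assumption: message text as it appears in the PostgreSQL 11 log above.
_WAL_LINE = re.compile(r'(?:removing|recycled) write-ahead log file "([0-9A-F]{24})"')


def dropped_segments(log_text):
    """Return the sorted, de-duplicated WAL segment names reported as removed or recycled."""
    return sorted({m.group(1) for m in _WAL_LINE.finditer(log_text)})


# Three lines copied from the log above, used as a small sample.
sample = '''
2020-04-23 14:55:50.349 UTC [6] DEBUG:  recycled write-ahead log file "000000010000000000000015"
2020-04-23 14:55:50.365 UTC [6] DEBUG:  removing write-ahead log file "000000010000000000000012"
2020-04-23 14:55:50.372 UTC [6] DEBUG:  removing write-ahead log file "00000001000000000000001B"
'''

for name in dropped_segments(sample):
    print(name)
```

Any segment printed here that is also missing from the bucket was dropped without being archived.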

Is there any way to prevent pg from removing WAL files that have not been archived?

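This looks like the bug "[BUG] non archived WAL removed during production crash recovery", which was reported within the last few weeks: https://www.postgresql.org/message-id/20200331172229.40ee00dc%40firost

According to the discussion on the PG mailing list, a patch is being worked on but is not available yet. It could be released in May at the earliest.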
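
Until a fix is released, one possible stopgap (my own suggestion, not something proposed in the mailing-list thread) is to copy the segments that are still waiting for archiving to a safe place after the crash and before starting the server again. PostgreSQL marks such segments with a `<segment>.ready` file under `pg_wal/archive_status`. The sketch below assumes the server is stopped; the function names and any paths passed to them are hypothetical.

```python
import shutil
from pathlib import Path


def pending_segments(wal_dir):
    """List WAL file names that have a .ready marker, i.e. are still waiting to be archived."""
    status_dir = Path(wal_dir) / "archive_status"
    return sorted(p.name[: -len(".ready")] for p in status_dir.glob("*.ready"))


def copy_pending(wal_dir, dest_dir):
    """Copy every pending WAL file from wal_dir to dest_dir and return the names copied."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in pending_segments(wal_dir):
        src = Path(wal_dir) / name
        if src.is_file():  # skip markers whose segment file is already gone
            shutil.copy2(src, dest / name)
            copied.append(name)
    return copied
```

For example, `copy_pending("/var/lib/postgresql/data/pg_wal", "/backup/unarchived_wal")` (both paths are placeholders) would be run while the server is down, before it is started again. The saved files can then be pushed to the archive by hand once archive_command works again.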