PostgreSQL FATAL: required WAL directory "pg_wal" does not exist

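I am trying to set up a PostgreSQL database (version 10) in a high-availability configuration, and I also want to use custom directories for the data, archive, and WAL files.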

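After initializing the database with the command below, I changed the data directory path (PGDATA) in the systemd unit file and tried to restart the postgresql service, but unfortunately it failed.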

/usr/bin/initdb -D /data/dbdata/testdata --locale en_GB.UTF-8 --waldir=/data/wals/testwals

The postgresql systemd unit file is as follows:

 cat /lib/systemd/system/postgresql.service
[Unit]
Description=PostgreSQL database server
After=network.target

[Service]
Type=notify

User=postgres
Group=postgres

Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
Environment=PG_OOM_ADJUST_VALUE=0
Environment=PGDATA=/data/dbdata/testdata
ExecStartPre=/usr/libexec/postgresql-check-db-dir %N
ExecStart=/usr/bin/postmaster -D ${PGDATA}
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT

# No artificial start/stop timeout (rhbz#1525477, pgrpms#2786).
TimeoutSec=0

[Install]
WantedBy=multi-user.target
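
A side note on the unit file: changes made directly to the packaged file under /lib/systemd/system can be overwritten by a package update. A minimal sketch of the same PGDATA change written as a systemd drop-in follows. The helper name write_pgdata_dropin is hypothetical, and the drop-in directory in the usage comment assumes the unit is named postgresql.service as shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a drop-in that overrides PGDATA for a unit.
# $1 = drop-in directory, $2 = data directory path
write_pgdata_dropin() {
    local dropin_dir="$1"
    local pgdata="$2"

    mkdir -p "$dropin_dir"
    # A later Environment= assignment for the same variable overrides the earlier one.
    printf '[Service]\nEnvironment=PGDATA=%s\n' "$pgdata" > "$dropin_dir/override.conf"
}

# Example usage (as root), followed by a reload so systemd picks up the change:
#   write_pgdata_dropin /etc/systemd/system/postgresql.service.d /data/dbdata/testdata
#   systemctl daemon-reload && systemctl restart postgresql
```

The function only writes a file, so it can be tried against a scratch directory before touching /etc.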

The journalctl log output is as follows:

: Starting PostgreSQL database server...
[54185]: 2021-06-17 14:40:58.953 UTC [54185] LOG:  listening on IPv4 address "0.0.0.0", port 5432
[54185]: 2021-06-17 14:40:58.953 UTC [54185] LOG:  listening on IPv6 address "::", port 5432
[54185]: 2021-06-17 14:40:58.955 UTC [54185] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
[54185]: 2021-06-17 14:40:58.957 UTC [54185] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
[54185]: 2021-06-17 14:40:58.969 UTC [54185] LOG:  redirecting log output to logging collector process
[54185]: 2021-06-17 14:40:58.969 UTC [54185] HINT:  Future log output will appear in directory "log".
: postgresql.service: Main process exited, code=exited, status=1/FAILURE
: postgresql.service: Killing process 54187 (postmaster) with signal SIGKILL.
: postgresql.service: Failed with result 'exit-code'.
: Failed to start PostgreSQL database server.

From the PostgreSQL log file:

# cat /data/dbdata/testdata/log/postgresql-Thu.log
**2021-06-17 14:40:58.972 UTC [54188] FATAL:  required WAL directory "pg_wal" does not exist**
2021-06-17 14:40:58.973 UTC [54185] LOG:  startup process (PID 54188) exited with exit code 1
2021-06-17 14:40:58.973 UTC [54185] LOG:  aborting startup due to startup process failure
2021-06-17 14:40:58.975 UTC [54185] LOG:  database system is shut down

However, I can see that pg_wal exists as a symbolic link and points to the WAL directory that was used during database initialization:

[testdata]# ls -la pg_wal
lrwxrwxrwx. 1 postgres postgres 19 Jun 17 07:10 pg_wal -> /data/wals/kongwals
[root@ip-10-0-1-120 kongdata]# ls -la /data/wals/kongwals
total 16384
drwx------. 3 postgres postgres       60 Jun 17 07:10 .
drwx------. 3 postgres postgres       22 Jun 17 07:10 ..
-rwx------. 1 postgres postgres 16777216 Jun 17 13:23 000000010000000000000001
drwx------. 2 postgres postgres        6 Jun 17 07:10 archive_status

[testdata]# ls -la /data/wals
total 0
drwx------. 3 postgres postgres 22 Jun 17 07:10 .
drwx------. 6 postgres postgres 59 Jun 16 13:22 ..
drwx------. 3 postgres postgres 60 Jun 17 07:10 kongwals
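
The manual ls checks above can be sketched as a small script that reports where pg_wal points and whether it resolves to a directory. The function name check_pg_wal is hypothetical. Note that a check run from an interactive root shell can pass even when the server process is denied access, because the shell does not run in the same SELinux domain as the service.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report the target of pg_wal and whether it is a directory.
# $1 = data directory (PGDATA)
check_pg_wal() {
    local pgdata="$1"
    local wal="$pgdata/pg_wal"

    if [ -L "$wal" ]; then
        # readlink -f resolves the symlink chain to a canonical path.
        echo "symlink -> $(readlink -f "$wal")"
    fi

    # [ -d ] follows symlinks, so it fails for a dangling or unreachable target.
    if [ -d "$wal" ]; then
        echo "ok: pg_wal resolves to a directory"
    else
        echo "fail: pg_wal does not resolve to a directory"
        return 1
    fi
}

# Example usage:
#   check_pg_wal /data/dbdata/testdata
```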

Note:

/data/wals is a mount point. Do I need to set the SELinux context for this mount point?

Yes. As I suspected, it was an SELinux context issue. I solved it by setting the same context on the WAL directory as on the data directory:

[testdata]# ls -Z /data/dbdata
system_u:object_r:postgresql_db_t:s0 kongdata

[testdata]# ls -Z /data/wals
unconfined_u:object_r:unlabeled_t:s0 kongwals
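
As far as I can tell from the server source, PostgreSQL raises this FATAL whenever stat() on pg_wal fails or the result is not a directory, so an access denial from SELinux produces the same "does not exist" message as a missing directory. One way to confirm the denial is to search the audit log, sketched below. The function name recent_denials is hypothetical, and the default command name "postmaster" is an assumption taken from the ExecStart line of the unit file; the audit records on a given system may show a different name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list recent SELinux AVC denials for a command name.
# $1 = command name to match (assumed default: postmaster)
recent_denials() {
    local comm="${1:-postmaster}"
    # -m avc     : only AVC (access vector cache) messages
    # -ts recent : start 10 minutes ago
    # -c         : match the comm field of the record
    ausearch -m avc -ts recent -c "$comm"
}

# Example usage (as root, with auditd running):
#   recent_denials
#   recent_denials postgres
```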

[testdata]# chcon -Rv -u system_u -t postgresql_db_t /data/wals
changing security context of '/data/wals/testwals/archive_status'
changing security context of '/data/wals/testwals/000000010000000000000001'
changing security context of '/data/wals/testwals'
changing security context of '/data/wals'

[kongdata]# restorecon -R -v /data/wals

[kongdata]# ls -Z /data/wals
system_u:object_r:postgresql_db_t:s0 kongwals
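
One caveat: labels set with chcon are not recorded in the SELinux policy, so a full filesystem relabel can revert them. A minimal sketch of a persistent variant follows: it registers a file-context rule with semanage and then applies it with restorecon. The function name label_wal_dir and its dry-run argument are hypothetical; semanage is assumed to be installed (on RHEL-family systems it typically comes from the policycoreutils Python utilities package).

```shell
#!/usr/bin/env bash
# Hypothetical helper: persistently label a directory tree as postgresql_db_t.
# $1 = directory, $2 = optional command prefix; pass "echo" for a dry run.
label_wal_dir() {
    local dir="$1"
    local run="${2:-}"

    # Record the rule in the local policy so relabels keep the type.
    $run semanage fcontext -a -t postgresql_db_t "${dir}(/.*)?"
    # Apply the recorded rule to the existing files.
    $run restorecon -Rv "$dir"
}

# Example usage:
#   label_wal_dir /data/wals echo    # print the commands only
#   label_wal_dir /data/wals         # run them (as root)
```

The dry run makes it easy to review the exact commands before changing any labels.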

[testdata]# service postgresql restart
Redirecting to /bin/systemctl restart postgresql.service

[testdata]# service postgresql status
Redirecting to /bin/systemctl status postgresql.service
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2021-06-17 16:11:35 UTC; 7s ago
  Process: 54463 ExecStartPre=/usr/libexec/postgresql-check-db-dir postgresql (code=exited, status=0/SUCCESS)
 Main PID: 54465 (postmaster)
    Tasks: 8 (limit: 4625)
   Memory: 16.9M
   CGroup: /system.slice/postgresql.service
           ├─54465 /usr/bin/postmaster -D /data/dbdata/testdata
           ├─54467 postgres: logger process
           ├─54469 postgres: checkpointer process
           ├─54470 postgres: writer process
           ├─54471 postgres: wal writer process
           ├─54472 postgres: autovacuum launcher process
           ├─54473 postgres: stats collector process
           └─54474 postgres: bgworker: logical replication launcher

Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal systemd[1]: Starting PostgreSQL database server...
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.799 UTC [54465] LOG:  listening on IPv4>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.799 UTC [54465] LOG:  listening on IPv6>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.800 UTC [54465] LOG:  listening on Unix>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.803 UTC [54465] LOG:  listening on Unix>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.813 UTC [54465] LOG:  redirecting log o>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.813 UTC [54465] HINT:  Future log outpu>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal systemd[1]: Started PostgreSQL database server.
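
Beyond the systemd status, the server can be probed directly with pg_isready, which exits with status 0 when the server is accepting connections. This is a minimal sketch; the function name server_ready is hypothetical, and it assumes pg_isready is on the PATH and that its default socket location matches one of the sockets the server listens on.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the server accepts connections on the port.
# $1 = port (default 5432, as in the log output above)
server_ready() {
    local port="${1:-5432}"
    # -q suppresses the status message; only the exit status is used.
    pg_isready -q -p "$port"
}

# Example usage:
#   server_ready && echo "PostgreSQL is accepting connections"
```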