PostgreSQL FATAL: required WAL directory "pg_wal" does not exist
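I am trying to set up a PostgreSQL database, version 10, with a high-availability setup, and I also want to use custom directories for the data, archive, and WAL directories.
After initializing the database with the command below, I changed the data directory path (PGDATA) in the systemd unit file and tried to restart the postgresql service, but it failed to start.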
/usr/bin/initdb -D /data/dbdata/testdata --locale en_GB.UTF-8 --waldir=/data/wals/testwals
The postgresql systemd unit file is as follows:
cat /lib/systemd/system/postgresql.service
[Unit]
Description=PostgreSQL database server
After=network.target
[Service]
Type=notify
User=postgres
Group=postgres
Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
Environment=PG_OOM_ADJUST_VALUE=0
Environment=PGDATA=/data/dbdata/testdata
ExecStartPre=/usr/libexec/postgresql-check-db-dir %N
ExecStart=/usr/bin/postmaster -D ${PGDATA}
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT
# No artificial start/stop timeout (rhbz#1525477, pgrpms#2786).
TimeoutSec=0
[Install]
WantedBy=multi-user.target
The journalctl log output is as follows:
: Starting PostgreSQL database server...
[54185]: 2021-06-17 14:40:58.953 UTC [54185] LOG: listening on IPv4 address "0.0.0.0", port 5432
[54185]: 2021-06-17 14:40:58.953 UTC [54185] LOG: listening on IPv6 address "::", port 5432
[54185]: 2021-06-17 14:40:58.955 UTC [54185] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
[54185]: 2021-06-17 14:40:58.957 UTC [54185] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
[54185]: 2021-06-17 14:40:58.969 UTC [54185] LOG: redirecting log output to logging collector process
[54185]: 2021-06-17 14:40:58.969 UTC [54185] HINT: Future log output will appear in directory "log".
: postgresql.service: Main process exited, code=exited, status=1/FAILURE
: postgresql.service: Killing process 54187 (postmaster) with signal SIGKILL.
: postgresql.service: Failed with result 'exit-code'.
: Failed to start PostgreSQL database server.
From the postgresql log file:
# cat /data/dbdata/testdata/log/postgresql-Thu.log
**2021-06-17 14:40:58.972 UTC [54188] FATAL: required WAL directory "pg_wal" does not exist**
2021-06-17 14:40:58.973 UTC [54185] LOG: startup process (PID 54188) exited with exit code 1
2021-06-17 14:40:58.973 UTC [54185] LOG: aborting startup due to startup process failure
2021-06-17 14:40:58.975 UTC [54185] LOG: database system is shut down
However, I can see that pg_wal exists as a symbolic link and points to the WAL directory used during database initialization:
[testdata]# ls -la pg_wal
lrwxrwxrwx. 1 postgres postgres 19 Jun 17 07:10 pg_wal -> /data/wals/kongwals
[root@ip-10-0-1-120 kongdata]# ls -la /data/wals/kongwals
total 16384
drwx------. 3 postgres postgres 60 Jun 17 07:10 .
drwx------. 3 postgres postgres 22 Jun 17 07:10 ..
-rwx------. 1 postgres postgres 16777216 Jun 17 13:23 000000010000000000000001
drwx------. 2 postgres postgres 6 Jun 17 07:10 archive_status
[testdata]# ls -la /data/wals
total 0
drwx------. 3 postgres postgres 22 Jun 17 07:10 .
drwx------. 6 postgres postgres 59 Jun 16 13:22 ..
drwx------. 3 postgres postgres 60 Jun 17 07:10 kongwals
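The listings above show the link and its target as seen from a root shell. A small check like the following can confirm that the link really resolves to a directory. This is only a sketch: the function name `check_wal_link` is hypothetical, and it tests ordinary file access only, so a link that resolves here can still be refused to the confined postgres service by SELinux.

```shell
#!/bin/sh
# Hypothetical helper: report whether <datadir>/pg_wal resolves to a directory.
check_wal_link() {
    datadir=$1
    link="$datadir/pg_wal"
    # Neither a file/directory nor a (possibly dangling) symlink: nothing there
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        echo "missing: $link"
        return 1
    fi
    # readlink -f follows the symlink chain to its final target
    target=$(readlink -f "$link")
    if [ -d "$target" ]; then
        echo "ok: $link -> $target"
        return 0
    fi
    echo "broken: $link -> $target"
    return 1
}
```

For example, `check_wal_link /data/dbdata/testdata` could be run both as root and as the postgres user to compare the results.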
Note:
/data/wals is a mount point. Do I need to set the SELinux context for this mount point?
Yes. As I suspected, it was an SELinux context issue. I solved it by giving /data/wals the same context as the data directory:
[testdata]# ls -Z /data/dbdata
system_u:object_r:postgresql_db_t:s0 kongdata
[testdata]# ls -Z /data/wals
unconfined_u:object_r:unlabeled_t:s0 kongwals
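An SELinux context has the form user:role:type:level, and the type field is the one that differs between the two listings above. As a rough illustration, the comparison can be scripted; the helper name `context_type` is hypothetical and the two context strings are copied from the `ls -Z` output.

```shell
#!/bin/sh
# Hypothetical helper: print the type field of an SELinux context string.
context_type() {
    # A context has the form user:role:type:level; the type is field 3
    printf '%s\n' "$1" | cut -d: -f3
}

# Context strings taken from the ls -Z output in the post
data_ctx='system_u:object_r:postgresql_db_t:s0'
wal_ctx='unconfined_u:object_r:unlabeled_t:s0'

if [ "$(context_type "$data_ctx")" != "$(context_type "$wal_ctx")" ]; then
    echo "type mismatch: $(context_type "$data_ctx") vs $(context_type "$wal_ctx")"
fi
# prints: type mismatch: postgresql_db_t vs unlabeled_t
```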
[testdata]# chcon -Rv -u system_u -t postgresql_db_t /data/wals
changing security context of '/data/wals/testwals/archive_status'
changing security context of '/data/wals/testwals/000000010000000000000001'
changing security context of '/data/wals/testwals'
changing security context of '/data/wals'
[kongdata]# restorecon -R -v /data/wals
[kongdata]# ls -Z /data/wals
system_u:object_r:postgresql_db_t:s0 kongwals
[testdata]# service postgresql restart
Redirecting to /bin/systemctl restart postgresql.service
[testdata]# service postgresql status
Redirecting to /bin/systemctl status postgresql.service
● postgresql.service - PostgreSQL database server
Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2021-06-17 16:11:35 UTC; 7s ago
Process: 54463 ExecStartPre=/usr/libexec/postgresql-check-db-dir postgresql (code=exited, status=0/SUCCESS)
Main PID: 54465 (postmaster)
Tasks: 8 (limit: 4625)
Memory: 16.9M
CGroup: /system.slice/postgresql.service
├─54465 /usr/bin/postmaster -D /data/dbdata/testdata
├─54467 postgres: logger process
├─54469 postgres: checkpointer process
├─54470 postgres: writer process
├─54471 postgres: wal writer process
├─54472 postgres: autovacuum launcher process
├─54473 postgres: stats collector process
└─54474 postgres: bgworker: logical replication launcher
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal systemd[1]: Starting PostgreSQL database server...
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.799 UTC [54465] LOG: listening on IPv4>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.799 UTC [54465] LOG: listening on IPv6>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.800 UTC [54465] LOG: listening on Unix>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.803 UTC [54465] LOG: listening on Unix>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.813 UTC [54465] LOG: redirecting log o>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal postmaster[54465]: 2021-06-17 16:11:35.813 UTC [54465] HINT: Future log outpu>
Jun 17 16:11:35 ip-10-0-1-120.ap-south-1.compute.internal systemd[1]: Started PostgreSQL database server.
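Labels set with chcon can be lost if the filesystem is relabeled later, so it may be worth recording the rule in the local policy as well. The following is a minimal sketch, assuming the `semanage` tool is installed; `WAL_DIR`, `DRY_RUN`, `run` and `label_wal_dir` are illustrative names that do not come from the original post. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: record a persistent SELinux file-context rule for a custom WAL path.
# WAL_DIR and DRY_RUN are illustrative variables, not part of the original post.
WAL_DIR=${WAL_DIR:-/data/wals}
DRY_RUN=${DRY_RUN:-1}   # default: only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

label_wal_dir() {
    # Add a rule to the local policy so the label survives a relabel
    run semanage fcontext -a -t postgresql_db_t "${WAL_DIR}(/.*)?"
    # Apply the recorded rule to the files already on disk
    run restorecon -Rv "$WAL_DIR"
}

label_wal_dir
```

Setting `DRY_RUN=0` and running the script as root would execute the two commands instead of printing them.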