When does PostgreSQL need to write to the file "pg_logical/replorigin_checkpoint.tmp"?

Postgres (version 10.10) crashed on my machine (I could not connect to the database). I checked the logs and saw:

2019-10-11 15:46:41.262 UTC [30233] postgres_prod@syntax_prod LOG:  could not receive data from client: Connection reset by peer
2019-10-11 17:41:06.104 UTC [2001] PANIC:  could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device
2019-10-11 17:41:06.364 UTC [1999] LOG:  checkpointer process (PID 2001) was terminated by signal 6: Aborted
2019-10-11 17:41:06.364 UTC [1999] LOG:  terminating any other active server processes
2019-10-11 17:41:06.364 UTC [1326] postgres_prod@syntax_prod WARNING:  terminating connection because of crash of another server process
2019-10-11 17:41:06.364 UTC [1326] postgres_prod@syntax_prod DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
...

I think the problem is this line:

PANIC: could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device

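But I have 77 gigabytes left on my machine (I just restarted Postgres and everything has worked fine so far), so I don't really understand the PANIC message. I figured that knowing more about postgres having to write to the file "pg_logical/replorigin_checkpoint.tmp" may help me to understand what went wrong, so I am looking for information about it.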

But I have 77 gigabytes left on my machine

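Probably you did not have that much free at the time of the error, or the free space is on a different partition from the one Postgres was writing to. There may have been a lot of temporary files that were cleaned up after the error, so having free space now does not mean you had it then. Maybe you can set up a temp tablespace on a different partition, so that temporary files can't run everything else out of space and crash the whole system?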
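To see which partition is actually short on space, it can help to check each relevant directory separately instead of relying on one overall figure. The following is a minimal sketch, not a diagnostic tool: the paths are placeholders supplied on the command line (the question does not say where the data directory lives), and it only shows the current state, not what was free when the PANIC happened.

```python
import os
import shutil
import sys


def free_space_report(paths):
    """Return {path: (device_id, free_bytes)} for each existing path.

    Paths that share a device_id live on the same filesystem and
    therefore compete for the same free space.
    """
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)   # named tuple: total, used, free
        device_id = os.stat(path).st_dev  # identifies the filesystem
        report[path] = (device_id, usage.free)
    return report


if __name__ == "__main__":
    # Hypothetical usage: pass the data directory and any tablespace
    # directories as arguments; defaults to the current directory.
    targets = sys.argv[1:] or ["."]
    for path, (device_id, free) in free_space_report(targets).items():
        print(f"{path}: device {device_id}, {free / 1024**3:.1f} GiB free")
```

If the data directory and the directory holding most of the free space report different device numbers, the 77 GB may simply not be available to Postgres.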

I figured that knowing more about postgres having to write to the file "pg_logical/replorigin_checkpoint.tmp" may help me to understand what went wrong.

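I'm pretty sure it won't. But it is part of checkpointing the progress of logical replication. It creates a new file, then atomically renames it over the old one.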
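The write-then-rename pattern can be sketched in Python as follows. This is not PostgreSQL's code, only an illustration of the general technique; the function name `write_atomically` and the way the `.tmp` name is derived are assumptions made for the example.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Replace `path` with `data` without ever exposing a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        # On a full disk, this write (or the fsync below) raises
        # OSError with errno ENOSPC; the old file is still intact.
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # The rename is atomic on POSIX filesystems: readers see either
    # the old file or the complete new one.
    os.replace(tmp_path, path)


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    target = os.path.join(directory, "checkpoint")
    write_atomically(target, b"first state")
    write_atomically(target, b"second state")
    with open(target, "rb") as f:
        print(f.read())
```

The point of the pattern is that the failure happens while writing the temporary file, which is why the error message names the `.tmp` file and not the final one.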

I ran into the same problem today. This command seems to have solved it for me:

docker run --rm -v /var/run/docker.sock:/var/run/docker.sock:ro -v /var/lib/docker:/var/lib/docker martin/docker-cleanup-volumes