Slow performing postgresql commit statements

Question

In a Django web application with a postgresql backend (and an nginx reverse proxy in front of a gunicorn application server), I'm seeing dozens of COMMIT messages in postgresql's slow log. See:

2020-02-01 17:56:16.335 UTC [19424] ubuntu@app LOG LOG:  duration: 175.630 ms  statement: COMMIT
2020-02-01 17:56:21.355 UTC [19435] ubuntu@app LOG LOG:  duration: 107.735 ms  statement: COMMIT
2020-02-01 17:57:22.592 UTC [19419] ubuntu@app LOG LOG:  duration: 235.313 ms  statement: COMMIT
2020-02-01 17:57:30.685 UTC [19419] ubuntu@app LOG LOG:  duration: 249.875 ms  statement: COMMIT
2020-02-01 17:57:30.688 UTC [19424] ubuntu@app LOG LOG:  duration: 99.049 ms  statement: COMMIT
2020-02-01 17:57:30.688 UTC [19435] ubuntu@app LOG LOG:  duration: 115.772 ms  statement: COMMIT
2020-02-01 17:57:30.688 UTC [19554] ubuntu@app LOG LOG:  duration: 248.656 ms  statement: COMMIT
2020-02-01 17:58:03.266 UTC [19435] ubuntu@app LOG LOG:  duration: 780.232 ms  statement: COMMIT
2020-02-01 17:58:03.270 UTC [19424] ubuntu@app LOG LOG:  duration: 622.424 ms  statement: COMMIT
2020-02-01 17:58:07.579 UTC [19435] ubuntu@app LOG LOG:  duration: 75.658 ms  statement: COMMIT

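The database in question was migrated from one dedicated server to another just a day ago. In its previous environment, COMMIT never showed up in the slow log. In the new environment, I made a few changes:

1) I set checkpoint_completion_target = 0.7 (lower than the previous checkpoint_completion_target = 0.8)

2) I switched to gevent workers in gunicorn (I previously used sync workers). This also required me to add the following post_fork hook to my gunicorn config, which lets us use psycopg2 with postgresql asynchronously under gevent (source):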

from psycogreen.gevent import patch_psycopg

def post_fork(server, worker):
  from gevent import monkey

  patch_psycopg()
  worker.log.info("Made PostgreSQL Green!")
  monkey.patch_all()

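3) Increased max_connections to 400 (from 300 previously)

4) Increased shared_buffers to 8GB (from 6GB previously)

There were a few other changes as well, where I increased thresholds (which should be a good thing).

Can an expert advise whether the changes I made above could have caused all the COMMIT statements littering my slow log? If not, are there any other guesses?

Most importantly, what mitigations can I take to improve the situation?

vmstat 1 yields: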

ubuntu@main-app:~$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
16  0      0 10031160 263796 11650344    0    0     2    20    5    4 13  1 86  0  0
 2  0      0 10030184 263796 11650308    0    0     0  1832 8598 9433 20  1 78  0  0
 4  1      0 10028432 263796 11650416    0    0    40   436 7229 7887 20  1 79  0  0
 6  0      0 10027076 263796 11650456    0    0     0   764 7836 8833 17  1 81  0  0
 4  0      0 10025904 263796 11650464    0    0     0   588 7947 9754 22  1 76  0  0
 4  0      0 10025988 263796 11650512    0    0     0   652 9727 12033 30  2 68  0  0
 7  0      0 10025572 263796 11650532    0    0     0   816 8296 9984 25  1 73  0  0
 2  1      0 10025680 263796 11650596    0    0     0  1128 8794 11003 23  1 75  0  0
 0  0      0 10025552 263796 11650588    0    0     0   288 7153 8091 20  1 79  0  0
 6  0      0 10025096 263796 11650612    0    0     0   412 9423 12016 25  2 73  0  0
 1  0      0 10025056 263796 11650640    0    0     0   240 9227 11442 32  2 66  0  0
 6  0      0 10025056 263796 11650800    0    0    32  1036 8762 10418 25  2 73  0  0
 2  0      0 10025116 263796 11650828    0    0     0   352 8730 10924 23  2 75  0  0
 6  0      0 10024992 263796 11650940    0    0     0   592 7920 9399 14  1 85  0  0
 3  0      0 10024288 263796 11650952    0    0     0   380 8380 9662 23  1 75  0  0
 4  0      0 10024536 263796 11650896    0    0     0   680 9193 10819 22  1 76  0  0
 1  0      0 10024720 263796 11650776    0    0     0   588 9655 10757 24  2 74  0  0
 5  0      0 10024820 263796 11650700    0    0     0    48 10237 13216 28  2 70  0  0
 1  0      0 10023660 263796 11650716    0    0     0   396 9291 11251 34  2 64  0  0
 6  0      0 10024564 263796 11650744    0    0     0   720 8557 10500 22  1 76  0  0

Answer 1

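There are two possible explanations for a slow commit:

1) The storage is overloaded. This can be caused by
- high I/O volume, or
- lots of small transactions, so that there are too many WAL sync requests.

2) WITH HOLD cursors on larger queries (the result set of such a cursor is materialized when the transaction commits, so the COMMIT itself takes long).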
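The "lots of small transactions" case can be illustrated by counting commits. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, and the table name events and the helper insert_rows are hypothetical. Each COMMIT is the point at which the database has to make the transaction durable, so grouping writes into one transaction reduces the number of sync requests.

```python
import sqlite3


def insert_rows(conn, rows, batch):
    """Insert rows, committing once per row (batch=False) or once overall (batch=True).

    Returns the number of COMMITs issued.
    """
    commits = 0
    for value in rows:
        conn.execute("INSERT INTO events (payload) VALUES (?)", (value,))
        if not batch:
            conn.commit()  # one durable flush request per row
            commits += 1
    if batch:
        conn.commit()  # a single flush request for the whole batch
        commits += 1
    return commits


# sqlite3 in-memory database used purely as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (payload TEXT)")
rows = [f"event-{i}" for i in range(100)]

per_row = insert_rows(conn, rows, batch=False)
batched = insert_rows(conn, rows, batch=True)
total = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(per_row, batched, total)  # prints: 100 1 200
```

In a Django application, the analogous step would be wrapping a group of ORM writes in transaction.atomic() instead of relying on autocommit for each one.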

On Linux, check the I/O wait column (wa) in the output of vmstat 1 to see if the I/O subsystem is overloaded.
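As a rough sketch of that check, the snippet below pulls the wa column out of captured vmstat text. The function name iowait_samples is made up for illustration, and the sample is the first few rows quoted in the question. Keep in mind that the first data row vmstat prints is an average since boot, not a one-second sample.

```python
def iowait_samples(vmstat_output):
    """Return the 'wa' (I/O wait %) value from each data row of vmstat output."""
    wa_index = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "wa" in fields and "us" in fields:
            # Column-name header row: remember where 'wa' sits.
            wa_index = fields.index("wa")
            continue
        if wa_index is None or len(fields) <= wa_index:
            continue
        if not all(f.isdigit() for f in fields):
            continue  # skip the "procs ---memory---" banner line
        samples.append(int(fields[wa_index]))
    return samples


sample = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
16  0      0 10031160 263796 11650344    0    0     2    20    5    4 13  1 86  0  0
 2  0      0 10030184 263796 11650308    0    0     0  1832 8598 9433 20  1 78  0  0
 4  1      0 10028432 263796 11650416    0    0    40   436 7229 7887 20  1 79  0  0
"""

samples = iowait_samples(sample)
print(max(samples))  # prints: 0
```

A wa value that stays well above zero while commits are slow would point at the storage; in the rows quoted here it is 0.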

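About your measures:

- Increasing max_connections or reducing checkpoint_completion_target will have an adverse effect, if any.
- Increasing shared_buffers will help I/O if the problem is read I/O volume.

If the problem is lots of sync requests, and you can afford to lose some committed transactions in the event of a crash, set synchronous_commit = off.

If that is not an option, you can play with commit_delay to reduce the I/O load.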
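Since the application in question is a Django project, one possible way to apply synchronous_commit = off only to the application's own sessions (rather than server-wide in postgresql.conf) is through the connection options. This is a sketch under assumptions: the NAME, USER and HOST values are placeholders, and it relies on Django passing OPTIONS through to psycopg2, which hands the options string to libpq.

```python
# settings.py (fragment). NAME, USER and HOST are placeholder values.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
        "USER": "ubuntu",
        "HOST": "localhost",
        "OPTIONS": {
            # libpq "options" string: "-c name=value" sets a run-time
            # parameter for every session opened with these settings.
            # Trade-off: recently committed transactions can be lost
            # if the server crashes.
            "options": "-c synchronous_commit=off",
        },
    }
}
```

The same parameter can also be changed for a single transaction with SET LOCAL synchronous_commit TO off, which limits the durability trade-off to writes that can tolerate it.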

I have never heard of "gevent workers", so I cannot say anything about them.

django

postgresql

gevent

gunicorn