Are transactions logged to WAL if there is no change in PostgreSQL?

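I am trying to find out whether changes are reflected in the WAL (write-ahead log) files when an UPDATE leaves a row unchanged. To test this, I created a replication slot in PostgreSQL to capture changes. These are the steps I took: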

ALTER SYSTEM SET wal_level TO logical;
$ pg_ctl restart
SELECT pg_create_logical_replication_slot('slotname', 'test_decoding');
CREATE TABLE foo(col1 INTEGER, col2 INTEGER);
ALTER TABLE foo REPLICA IDENTITY FULL;
INSERT INTO foo VALUES(1,2);

Then I ran SELECT * FROM pg_logical_slot_get_changes('slotname', NULL, NULL); in psql (earlier changes omitted).

The output was:

    lsn    | xid |                           data                            
-----------+-----+-----------------------------------------------------------
 0/165B208 | 488 | BEGIN 488
 0/165B208 | 488 | table public.foo: INSERT: col1[integer]:1 col2[integer]:2
 0/165B278 | 488 | COMMIT 488
(3 rows)

Then I ran UPDATE foo SET col2=2 WHERE col1=1; which sets col2 to the value it already had. Running SELECT * FROM pg_logical_slot_get_changes('slotname', NULL, NULL); again gave:

    lsn    | xid |                                                     data                                                      
-----------+-----+---------------------------------------------------------------------------------------------------------------
 0/165B2B0 | 489 | BEGIN 489
 0/165B2B0 | 489 | table public.foo: UPDATE: old-key: col1[integer]:1 col2[integer]:2 new-tuple: col1[integer]:1 col2[integer]:2
 0/165B338 | 489 | COMMIT 489
(3 rows)

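It looks like the UPDATE statement wrote to the WAL even though it had no effect on the table. What confuses me is that the PostgreSQL docs for version 12, which I am using, say the following in the "REPLICA IDENTITY" section: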

REPLICA IDENTITY This form changes the information which is written to the write-ahead log to identify rows which are updated or deleted. This option has no effect except when logical replication is in use. DEFAULT (the default for non-system tables) records the old values of the columns of the primary key, if any. USING INDEX records the old values of the columns covered by the named index, which must be unique, not partial, not deferrable, and include only columns marked NOT NULL. FULL records the old values of all columns in the row. NOTHING records no information about the old row. (This is the default for system tables.) In all cases, no old values are logged unless at least one of the columns that would be logged differs between the old and new versions of the row.

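The last sentence says that the old and new versions of a row must differ for the old values to be logged, but I am seeing the opposite. What am I missing here?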

The replica identity is only one part of the logical replication message/protocol; see Message format:

Update ...

Byte1('K')

Identifies the following TupleData submessage as a key. This field is optional and is only present if the update changed data in any of the column(s) that are part of the REPLICA IDENTITY index.

Byte1('O')

Identifies the following TupleData submessage as an old tuple. This field is optional and is only present if table in which the update happened has REPLICA IDENTITY set to FULL.

The part of the docs you quoted refers to the above. The slot output you showed looks at the replication process as a whole.
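To make the two quoted rules concrete, here is a minimal Python sketch of which optional old-row submessage an Update message would carry. It is only a model of the wording quoted above, not PostgreSQL's actual implementation; the function name `old_row_submessage` and its parameters are hypothetical and exist purely for illustration.

```python
from typing import Mapping, Optional, Sequence


def old_row_submessage(
    replica_identity: str,
    identity_columns: Sequence[str],
    old_row: Mapping[str, object],
    new_row: Mapping[str, object],
) -> Optional[str]:
    """Return 'O', 'K', or None, following the two quoted protocol rules.

    Hypothetical helper: it models the documentation text only and is
    not taken from the PostgreSQL source.
    """
    # Rule for 'O': present if the table has REPLICA IDENTITY FULL.
    # The quoted rule does not depend on whether any value changed.
    if replica_identity == "full":
        return "O"
    # Rule for 'K': present only if the update changed data in a column
    # that is part of the replica identity index.
    if any(old_row[col] != new_row[col] for col in identity_columns):
        return "K"
    # Otherwise no old-row submessage is sent.
    return None


# The case from the question: REPLICA IDENTITY FULL, no value changed.
row = {"col1": 1, "col2": 2}
result = old_row_submessage("full", ["col1", "col2"], row, row)
print(result)  # prints O
```

Under this reading, the no-op UPDATE in the question still carries the old tuple because the table is set to FULL, while a table keyed on col1 alone would carry no old-row data for the same statement.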

The purpose of the replica identity is described under Logical replication:

A published table must have a “replica identity” configured in order to be able to replicate UPDATE and DELETE operations, so that appropriate rows to update or delete can be identified on the subscriber side. By default, this is the primary key, if there is one. Another unique index (with certain additional requirements) can also be set to be the replica identity. If the table does not have any suitable key, then it can be set to replica identity “full”, which means the entire row becomes the key. This, however, is very inefficient and should only be used as a fallback if no other solution is possible. If a replica identity other than “full” is set on the publisher side, a replica identity comprising the same or fewer columns must also be set on the subscriber side. See REPLICA IDENTITY for details on how to set the replica identity. If a table without a replica identity is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. INSERT operations can proceed regardless of any replica identity.