Kafka Connect produces CDC events with null values when reading from MongoDB with Debezium

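I am reading a Kafka topic that contains a large number of CDC events produced by Kafka Connect with Debezium. The data source is a MongoDB collection with a TTL. Some of the CDC events are null, and they appear in between the delete events. What does this actually mean?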

As I understand it, every CDC event should have the CDC event structure, including delete events, so why are there events with a null value?

null,
{
  "after": null,
  "patch": null,
  "source": {
    "version": "0.9.3.Final",
    "connector": "mongodb",
    "name": "test",
    "rs": "rs1",
    "ns": "testestest",
    "sec": 1555060472,
    "ord": 297,
    "h": 1196279425766381600,
    "initsync": false
  },
  "op": "d",
  "ts_ms": 1555060472177
},
null,
{
  "after": null,
  "patch": null,
  "source": {
    "version": "0.9.3.Final",
    "connector": "mongodb",
    "name": "test",
    "rs": "rs1",
    "ns": "testestest",
    "sec": 1555060472,
    "ord": 298,
    "h": -2199232943406075600,
    "initsync": false
  },
  "op": "d",
  "ts_ms": 1555060472177
}

I am using https://debezium.io/docs/connectors/mongodb/ without flattening (unwrapping) any events, with the following configuration:

{
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "live.xxx.xxx:27019",
    "mongodb.name": "testmongodb",
    "collection.whitelist": "testest",
    "tasks.max": 4,
    "snapshot.mode": "never",
    "poll.interval.ms": 15000
}

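These are so-called tombstone events. Debezium writes one after each delete event, with the same key and a null value, so that Kafka log compaction can correctly remove the deleted records. See https://kafka.apache.org/documentation/#compaction: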

Compaction also allows for deletes. A message with a key and a null payload will be treated as a delete from the log. This delete marker will cause any prior message with that key to be removed (as would any new message with that key), but delete markers are special in that they will themselves be cleaned out of the log after a period of time to free up space. The point in time at which deletes are no longer retained is marked as the "delete retention point" in the above diagram.
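To make the quoted behaviour concrete, here is a minimal sketch in plain Python of how a tombstone interacts with compaction, and how a consumer might skip the null values. It is an illustration only, not Kafka's actual implementation: real compaction works on log segments and leaves the active segment alone. The keys `id-1` and `id-2` and the helper names `compact` and `decode` are made up for the example, and the string keys and small JSON values stand in for the real Debezium key and event structures.

```python
import json
from typing import Dict, List, Optional, Tuple

# A record is a (key, value) pair; a value of None stands for a tombstone.
Record = Tuple[str, Optional[str]]


def compact(log: List[Record], drop_tombstones: bool = False) -> List[Record]:
    """Keep only the latest record per key, preserving log order.

    drop_tombstones=True models the state after the delete retention
    point, when the delete markers themselves are cleaned out too.
    """
    latest: Dict[str, int] = {}
    for offset, (key, _value) in enumerate(log):
        latest[key] = offset  # a later offset replaces an earlier one
    kept = [log[offset] for offset in sorted(latest.values())]
    if drop_tombstones:
        kept = [(key, value) for key, value in kept if value is not None]
    return kept


def decode(value: Optional[str]) -> Optional[dict]:
    """Parse a change event, returning None for a tombstone."""
    return None if value is None else json.loads(value)


# Hypothetical topic contents: two inserts, then a delete of id-1
# followed by the tombstone that carries the same key.
log: List[Record] = [
    ("id-1", '{"op": "c"}'),
    ("id-2", '{"op": "c"}'),
    ("id-1", '{"op": "d"}'),
    ("id-1", None),
]

if __name__ == "__main__":
    print(compact(log))
    print(compact(log, drop_tombstones=True))
    # Consumer side: ignore tombstones, handle everything else.
    events = [decode(value) for _key, value in log]
    print([event for event in events if event is not None])
```

After the first pass only the tombstone remains for `id-1`, and once the delete markers are dropped the key disappears from the log entirely. A consumer that only cares about the change events can treat a null value as "nothing to process" instead of as a malformed event.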