Debezium MySQL (MariaDB) - wrong message counts
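I am using Debezium to export a database. I tested this setup earlier on about 1% of the production data and it worked fine, but in the production setup the number of rows in the database does not match the number of messages exported by Debezium.
For example, the table db.large has about 259 million entries, but Debezium exported only about 200 million. For some other tables, Debezium exported more messages than the table actually contains (this is during the initial snapshot only). For a small table with only 542 entries, the counts match.
I see some "Failed to flush" and "Failed to commit offsets" messages in the logs, but not on every offset flush; some flushes succeed. Could these flush/commit failures be the cause of the mismatch?
I am using the MySQL connector in Debezium 1.7.
Here is part of the log showing the mismatch: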
INFO || WorkerSourceTask{id=connector-v1-0} flushing 5722 outstanding messages for offset commit
ERROR || WorkerSourceTask{id=connector-v1-0} Failed to flush, timed out while waiting for producer to flush outstanding 211 messages
ERROR || WorkerSourceTask{id=connector-v1-0} Failed to commit offsets
INFO MySQL|connector_v1|snapshot Exported 201944873 of 259000000 records for table 'db.large' after 10:09:38.853
INFO MySQL|connector_v1|snapshot Exported 202002217 of 259000000 records for table 'db.large' after 10:09:49.062
INFO MySQL|connector_v1|snapshot Exported 202057513 of 259000000 records for table 'db.large' after 10:09:59.281
INFO MySQL|connector_v1|snapshot Exported 202112809 of 259000000 records for table 'db.large' after 10:10:09.488
INFO MySQL|connector_v1|snapshot Exported 202168105 of 259000000 records for table 'db.large' after 10:10:19.669
INFO MySQL|connector_v1|snapshot Exported 202221353 of 259000000 records for table 'db.large' after 10:10:30.152
INFO || WorkerSourceTask{id=connector-v1-0} flushing 5788 outstanding messages for offset commit
INFO MySQL|connector_v1|snapshot Exported 202278697 of 259000000 records for table 'db.large' after 10:10:40.334
ERROR || WorkerSourceTask{id=connector-v1-0} Failed to flush, timed out while waiting for producer to flush outstanding 561 messages
ERROR || WorkerSourceTask{id=connector-v1-0} Failed to commit offsets
INFO MySQL|connector_v1|snapshot Exported 202336041 of 259000000 records for table 'db.large' after 10:10:50.352
INFO MySQL|connector_v1|snapshot Finished exporting 202353026 records for table 'db.large'; total duration '10:10:53.191'
INFO MySQL|connector_v1|snapshot Exporting data from table 'db.small' (2 of 7 tables)
INFO MySQL|connector_v1|snapshot For table 'db.small' using select statement: 'SELECT `field1`, `field2`, `field3` FROM `db`.`small`'
INFO MySQL|connector_v1|snapshot Finished exporting 500 records for table 'db.small'; total duration '00:00:00.021'
INFO MySQL|connector_v1|snapshot Exporting data from table 'db.medium' (3 of 7 tables)
INFO MySQL|connector_v1|snapshot For table 'db.medium' using select statement: 'SELECT `field1`, `field2`, `field3` FROM `db`.`medium`'
INFO MySQL|connector_v1|snapshot Exported 84873 of 14000000 records for table 'db.medium' after 00:00:10.006
INFO MySQL|connector_v1|snapshot Exported 170889 of 14000000 records for table 'db.medium' after 00:00:20.172
INFO MySQL|connector_v1|snapshot Exported 258953 of 14000000 records for table 'db.medium' after 00:00:30.267
INFO MySQL|connector_v1|snapshot Exported 349065 of 14000000 records for table 'db.medium' after 00:00:40.392
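Any ideas?
Thanks
Figured it out: the number of exported messages is actually correct.
The total shown in these log lines (the "of 259000000" part) is not the actual row count; Debezium uses an estimated count there:
https://github.com/debezium/debezium/blob/8d71080a9a8aac875e338964af417dc8de93dfcc/debezium-connector-mysql/src/main/java/io/debezium/connector/mysql/MySqlConnection.java#L427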
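To make the distinction concrete, here is a minimal sketch (not part of Debezium) that separates the two numbers in the snapshot log: the estimated total from the "Exported X of Y" progress lines and the actual count from the "Finished exporting N" line. It assumes the log format shown in the question; the function name summarize_snapshot and the regular expressions are my own and purely illustrative. The "exported" value is the one to compare against SELECT COUNT(*) on the table.

```python
import re

# Progress lines carry a running count and an ESTIMATED total.
PROGRESS = re.compile(r"Exported (\d+) of (\d+) records for table '([^']+)'")
# The completion line carries the number of rows actually read.
FINISHED = re.compile(r"Finished exporting (\d+) records for table '([^']+)'")


def summarize_snapshot(lines):
    """Return {table: {"estimated": int or None, "exported": int or None}}.

    Hypothetical helper: assumes the Debezium 1.7 log format from the question.
    """
    summary = {}
    for line in lines:
        m = PROGRESS.search(line)
        if m:
            entry = summary.setdefault(
                m.group(3), {"estimated": None, "exported": None}
            )
            entry["estimated"] = int(m.group(2))
            continue
        m = FINISHED.search(line)
        if m:
            entry = summary.setdefault(
                m.group(2), {"estimated": None, "exported": None}
            )
            entry["exported"] = int(m.group(1))
    return summary


# Sample lines copied from the log in the question.
sample = [
    "INFO MySQL|connector_v1|snapshot Exported 202336041 of 259000000 records for table 'db.large' after 10:10:50.352",
    "INFO MySQL|connector_v1|snapshot Finished exporting 202353026 records for table 'db.large'; total duration '10:10:53.191'",
    "INFO MySQL|connector_v1|snapshot Finished exporting 500 records for table 'db.small'; total duration '00:00:00.021'",
]

for table, counts in summarize_snapshot(sample).items():
    print(table, counts)
```

For db.large this reports an estimate of 259000000 next to 202353026 rows actually exported. A table that finishes before the first progress line, like db.small here, has no estimate in the log at all.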