Why Cassandra COUNT(*) on a specific partition takes really long on relatively small datasets

Question

I have a table defined as follows:

Keyspace:

CREATE KEYSPACE messages WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

Table:

CREATE TABLE messages.textmessages (
    categoryid int,
    date timestamp,
    messageid timeuuid,
    message text,
    userid int,
    PRIMARY KEY ((categoryid, date), messageid)
) WITH CLUSTERING ORDER BY (messageid ASC);

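The goal is to have a wide-row time-series store, where categoryid and date (the beginning of the day) make up my partition key and messageid provides the clustering. This lets me run queries like: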

SELECT * FROM messages.textmessages WHERE categoryid=2 AND date='2019-05-14 00:00:00.000+0300' AND messageId > maxTimeuuid('2019-05-14 00:00:00.000+0300') AND messageId < minTimeuuid('2019-05-15 00:00:00.000+0300')

to get the messages for a given day; this works well and is fast.
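For illustration, the date part of the partition key (the start of the day) and the two bounds passed to maxTimeuuid/minTimeuuid can be derived from any message timestamp. The following is a minimal Python sketch, assuming a fixed +03:00 offset as in the example query; `day_bucket`, `day_bounds` and `cql_timestamp` are hypothetical helper names, not part of any Cassandra driver.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumption: a fixed +03:00 offset, matching the literal in the example query.
TZ = timezone(timedelta(hours=3))

def day_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its day in TZ (the partition's date value)."""
    local = ts.astimezone(TZ)
    return local.replace(hour=0, minute=0, second=0, microsecond=0)

def day_bounds(ts: datetime) -> Tuple[datetime, datetime]:
    """Return (start of day, start of next day) for the timeuuid range predicates."""
    start = day_bucket(ts)
    return start, start + timedelta(days=1)

def cql_timestamp(ts: datetime) -> str:
    """Format a timezone-aware datetime like '2019-05-14 00:00:00.000+0300'."""
    millis = f"{ts.microsecond // 1000:03d}"
    return ts.strftime("%Y-%m-%d %H:%M:%S.") + millis + ts.strftime("%z")

start, end = day_bounds(datetime(2019, 5, 14, 13, 45, tzinfo=TZ))
print(cql_timestamp(start), cql_timestamp(end))
```

Every message written during the same day maps to the same (categoryid, date) pair, which is what keeps a day's messages together in one partition.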

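Problem

I need to be able to count the messages for a given day by replacing the SELECT * above with SELECT COUNT(*). This takes a very long time even with fewer than 100K entries in the column family; it actually times out in cqlsh.

I have read and understood why COUNT is an expensive operation for a distributed database like Cassandra in Counting keys? Might as well be counting stars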

Question

Why would this query take so long even when:

SELECT COUNT(*) FROM messages.textmessages WHERE categoryid=2 AND date='2019-05-14 00:00:00.000+0300' AND messageId > maxTimeuuid('2019-05-14 00:00:00.000+0300') AND messageId < minTimeuuid('2019-05-15 00:00:00.000+0300')

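The count is on a specific partition with fewer than 100K records
I have only one Cassandra node, on a high-performance MacBook Pro
There are no active writes/reads on the instance; fewer than 20 partitions on a development laptop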

Answer 1

This turned out to be understandable: it was caused by the common pitfall of overlooking the 'everything-is-a-write' concept in Cassandra, which is why tombstones appear.

When executing a scan, within or across a partition, we need to keep the tombstones seen in memory so we can return them to the coordinator, which will use them to make sure other replicas also know about the deleted rows. With workloads that generate a lot of tombstones, this can cause performance problems and even exhaust the server heap.

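Thanks to @JimWartnick for the suggestion about possible tombstone-related latency; this was caused by a large number of tombstones generated by my inserts with NULL fields. I did not expect this to cause tombstones, nor did I expect tombstones to have such a significant impact on query performance, especially on COUNT.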
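To make the effect concrete, here is a deliberately simplified toy model in Python. It is not how Cassandra is implemented (there are no memtables, SSTables or compaction here), and `insert` and `count_rows` are hypothetical names. It only illustrates the idea that a column bound to null is stored as a tombstone, and that a count over the partition has to read past those tombstones as well as the live data.

```python
# Toy model only: real Cassandra storage is far more involved.
TOMBSTONE = object()  # marker standing in for a cell tombstone

def insert(partition, messageid, message, userid):
    """Model an INSERT that binds every column: a None value is stored as a tombstone."""
    partition[messageid] = {
        "message": TOMBSTONE if message is None else message,
        "userid": TOMBSTONE if userid is None else userid,
    }

def count_rows(partition):
    """Model COUNT(*) over one partition: returns (rows counted, tombstones read)."""
    rows = 0
    tombstones = 0
    for cells in partition.values():
        rows += 1  # the row is still live: its primary key was inserted
        tombstones += sum(1 for value in cells.values() if value is TOMBSTONE)
    return rows, tombstones

partition = {}
for i in range(100_000):
    insert(partition, i, "hello", None)  # userid bound to null on every insert

print(count_rows(partition))  # (100000, 100000)
```

In this model, one null column per insert doubles the number of entries the count has to walk through, and every additional null column adds another 100,000.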

Solution

Use default or unset values for fields that have no value, or leave those fields out of inserts/updates altogether
Be aware of Common Problems with Cassandra Tombstones - Alla Babkina
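One way to avoid writing nulls from application code is to build the INSERT from only the columns that actually have a value. The sketch below is an assumption about how that could look, not the author's code: `build_insert` is a hypothetical helper, and the `%s` placeholders assume a driver that accepts that style for non-prepared statements (the DataStax Python driver does).

```python
def build_insert(table, values):
    """Build a parameterized INSERT that lists only the columns with a value.

    Returns (cql, params). Columns whose value is None are left out, so no
    null is written for them. Table and column names are interpolated
    directly, so they must come from trusted code, never from user input.
    """
    present = {col: val for col, val in values.items() if val is not None}
    if not present:
        raise ValueError("no columns to insert")
    columns = ", ".join(present)
    placeholders = ", ".join(["%s"] * len(present))
    cql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return cql, tuple(present.values())

cql, params = build_insert(
    "messages.textmessages",
    {"categoryid": 2, "message": "hi", "userid": None},
)
print(cql)     # INSERT INTO messages.textmessages (categoryid, message) VALUES (%s, %s)
print(params)  # (2, 'hi')
```

With prepared statements the column list is fixed, so the alternative there is to leave the parameter unset rather than bind null; drivers that speak native protocol v4 or later generally expose a way to do this.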

One common misconception is that tombstones only appear when the client issues DELETE statements to Cassandra. Some developers assume that it is safe to choose a way of operations which relies on Cassandra being completely tombstone free. In reality there are many other things causing tombstones apart from issuing DELETE statements. Inserting null values, inserting collections and expiring data using TTL are common sources of tombstones.
