Is Leveled Compaction Strategy still beneficial for reads when rows are write-once?
On the other hand, this DataStax post says that leveled compaction may not be a good option when rows are write-once:
If your rows are always written entirely at once and are never updated, they will naturally always be contained by a single SSTable when using size-tiered compaction. Thus, there's really nothing to gain from leveled compaction.
Also, slide 30 of the talk The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla) says that LCS is best suited for:
Use cases needing very consistent read performance with much higher read to write ratio
Wide-partition data model with limited (or slow-growing) number of total partitions but a lot of updates and deletes, or fully TTL’ed dataset
I understand that if a row is updated or deleted frequently, it can end up spread across multiple SSTables, which hurts read performance. From Leveled Compaction in Apache Cassandra:
Performance can be inconsistent because there are no guarantees as to how many sstables a row may be spread across: in the worst case, we could have columns from a given row in each sstable.
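The quoted worst case can be sketched with a toy model (my own illustration, not Cassandra's actual storage code): a partition that takes new writes before every memtable flush leaves a fragment in every flushed SSTable, so a read of that partition must merge all of them.

```python
# Toy sketch of the quoted worst case: a "hot" partition gets a new
# column before every memtable flush, so every SSTable holds a fragment.
sstables = []
for day in range(5):  # five memtable flushes
    sstables.append({"hot-partition": {f"col{day}": day}})

# A read of that partition must merge a fragment from every SSTable.
fragments = [t for t in sstables if "hot-partition" in t]
print(len(fragments))  # 5
```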
But in the write-once case, does this strategy also offer no benefit when reading all rows for a partition key?
Because, if I understand correctly, with this strategy the rows sharing a partition key tend to end up in the same SSTable, since LCS merges SSTables whose ranges overlap, in contrast to Size-Tiered Compaction, which merges SSTables of similar size.
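My understanding of the two selection rules can be sketched as a toy model (thresholds and structures are illustrative, not Cassandra's real implementation; the 0.5/1.5 bounds mirror STCS's default bucket_low/bucket_high):

```python
# Toy model of how each strategy picks SSTables to compact together.

def stcs_buckets(sstables, low=0.5, high=1.5):
    """Size-tiered: bucket SSTables of similar size; overlap of their
    key ranges plays no role in the choice."""
    buckets = []
    for s in sorted(sstables, key=lambda t: t["size"]):
        for b in buckets:
            avg = sum(t["size"] for t in b) / len(b)
            if low * avg <= s["size"] <= high * avg:
                b.append(s)
                break
        else:
            buckets.append([s])
    return buckets

def lcs_overlapping(sstables, new):
    """Leveled: compact the incoming SSTable with every existing one
    whose token range overlaps it, regardless of size."""
    return [s for s in sstables
            if not (s["last"] < new["first"] or new["last"] < s["first"])]

existing = [{"size": 10,  "first": 0,  "last": 50},
            {"size": 200, "first": 40, "last": 90}]
incoming = {"size": 12, "first": 45, "last": 60}

print(len(lcs_overlapping(existing, incoming)))  # 2: both ranges overlap 45..60
print(len(stcs_buckets(existing + [incoming])))  # 2 buckets: sizes {10, 12} vs {200}
```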
When rows are written strictly once, choosing LeveledCompactionStrategy over SizeTieredCompactionStrategy makes no difference to read performance (it has other effects, e.g. LCS requires more I/O).
Regarding the following comment on the question:
with this strategy the rows with the same partition key tend to be in
the same SSTable, because merges SSTables that overlaps in contrast to
Size Tiered Compaction that merges SSTables with similar size.
When rows with the same partition key are written only once, there is nothing to gain from merging SSTables, because the data was never spread across different SSTables in the first place.
When we talk about updates, that doesn't have to mean updating existing columns of a row. In some cases we insert an entirely new set of clustering columns, with their associated values, for a partition key that already exists.
Here is an example table (note that hyphens are not valid in unquoted CQL identifiers, so the column is named sent_date):
CREATE TABLE tablename (
    emailid text,
    sent_date date,
    column3 text,
    PRIMARY KEY (emailid, sent_date)
);
Now, for a given emailid (say hello@gmail.com), the same partition key may be inserted two or more times with different sent_date values. Since these are inserts against the same partition key (essentially upserts), LeveledCompaction does benefit here.
But suppose the same table had only emailid as its primary key and each row were written exactly once. Then it wouldn't matter how the SSTables are compacted, SizeTieredCompactionStrategy or LeveledCompactionStrategy: there is no advantage, because a row always lives in exactly one SSTable.
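That claim can be sketched with a toy flush model (my own illustration, not Cassandra's code): each distinct partition is written once, so after its memtable is flushed it lives in exactly one SSTable, and a read never has to merge fragments under either strategy.

```python
from collections import defaultdict

sstables = []                 # each SSTable: partition key -> rows
memtable = defaultdict(list)

def write(pk, row, flush_every=3):
    """Buffer a write; flush the memtable to a new SSTable when it fills."""
    memtable[pk].append(row)
    if sum(len(rows) for rows in memtable.values()) >= flush_every:
        sstables.append(dict(memtable))
        memtable.clear()

# Nine distinct partitions, each written exactly once.
for i in range(9):
    write(f"user{i}@example.com", {"column3": f"value{i}"})

def sstables_holding(pk):
    return [t for t in sstables if pk in t]

print(len(sstables))                               # 3 SSTables after 3 flushes
print(len(sstables_holding("user0@example.com")))  # 1 -- the partition is never spread
```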
I think the answer is that when the blog talks about rows, it means Thrift rows, not CQL rows. (I had confused the terminology.)
When we say a Thrift row, we mean a partition (i.e. the group of CQL rows sharing a partition key).
From Does CQL support dynamic columns / wide rows?
+--------------------------------------------------+-----------+
| Thrift term | CQL term |
+--------------------------------------------------+-----------+
| row | partition |
| column | cell |
| [cell name component or value] | column |
| [group of cells with shared component prefixes] | row |
+--------------------------------------------------+-----------+
From Understanding How CQL3 Maps to Cassandra’s Internal Data Structure, using the following schema:
CREATE TABLE tweets (
    user text,
    time timestamp,
    tweet text,
    lat float,
    long float,
    PRIMARY KEY (user, time)
);
(Remember, the partition key is the first column of the primary key, in this case "user".)
The following CQL rows
user | time | lat | long | tweet
--------------+--------------------------+--------+---------+---------------------
softwaredoug | 2013-07-13 08:21:54-0400 | 38.162 | -78.549 | Having chest pain.
softwaredoug | 2013-07-21 12:15:27-0400 | 38.093 | -78.573 | Speedo self shot.
jnbrymn | 2013-06-29 20:53:15-0400 | 38.092 | -78.453 | I like programming.
jnbrymn | 2013-07-14 22:55:45-0400 | 38.073 | -78.659 | Who likes cats?
jnbrymn | 2013-07-24 06:23:54-0400 | 38.073 | -78.647 | My coffee is cold.
are persisted internally in Thrift like this:
RowKey: softwaredoug
=> (column=2013-07-13 08:21:54-0400:, value=, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:lat, value=4218a5e3, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:long, value=c29d1917, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:tweet, value=486176696e67206368657374207061696e2e, timestamp=1374673155373000)
=> (column=2013-07-21 12:15:27-0400:, value=, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:lat, value=42185f3b, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:long, value=c29d2560, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:tweet, value=53706565646f2073656c662073686f742e, timestamp=1374673155407000)
-------------------
RowKey: jnbrymn
=> (column=2013-06-29 20:53:15-0400:, value=, timestamp=1374673155419000)
=> (column=2013-06-29 20:53:15-0400:lat, value=42185e35, timestamp=1374673155419000)
=> (column=2013-06-29 20:53:15-0400:long, value=c29ce7f0, timestamp=1374673155419000)
=> (column=2013-06-29 20:53:15-0400:tweet, value=49206c696b652070726f6772616d6d696e672e, timestamp=1374673155419000)
=> (column=2013-07-14 22:55:45-0400:, value=, timestamp=1374673155434000)
=> (column=2013-07-14 22:55:45-0400:lat, value=42184ac1, timestamp=1374673155434000)
=> (column=2013-07-14 22:55:45-0400:long, value=c29d5168, timestamp=1374673155434000)
=> (column=2013-07-14 22:55:45-0400:tweet, value=57686f206c696b657320636174733f, timestamp=1374673155434000)
=> (column=2013-07-24 06:23:54-0400:, value=, timestamp=1374673155485000)
=> (column=2013-07-24 06:23:54-0400:lat, value=42184ac1, timestamp=1374673155485000)
=> (column=2013-07-24 06:23:54-0400:long, value=c29d4b44, timestamp=1374673155485000)
=> (column=2013-07-24 06:23:54-0400:tweet, value=4d7920636f6666656520697320636f6c642e, timestamp=1374673155485000)
We can clearly see that the two CQL rows with user softwaredoug form a single Thrift row.
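The mapping can be illustrated with a small sketch (hypothetical data structures mirroring the tweets table): grouping the CQL rows by their partition key, user, yields exactly one storage-engine (Thrift) row per partition.

```python
from itertools import groupby

# The CQL rows from the tweets example, as (user, time, tweet) tuples.
cql_rows = [
    ("softwaredoug", "2013-07-13 08:21:54", "Having chest pain."),
    ("softwaredoug", "2013-07-21 12:15:27", "Speedo self shot."),
    ("jnbrymn", "2013-06-29 20:53:15", "I like programming."),
    ("jnbrymn", "2013-07-14 22:55:45", "Who likes cats?"),
    ("jnbrymn", "2013-07-24 06:23:54", "My coffee is cold."),
]

# groupby needs rows contiguous by key, which they already are here.
partitions = {user: list(rows)
              for user, rows in groupby(cql_rows, key=lambda r: r[0])}

print(len(partitions))             # 2 Thrift rows (one per partition key)
print(len(partitions["jnbrymn"]))  # 3 CQL rows inside the jnbrymn partition
```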
The case where a single CQL row corresponds to a single Thrift row (e.g. when the partition key is the entire primary key) is what Deng and Svihla list as an anti-pattern use case for LCS:
Heavy write with all unique partitions
That said, I will mark dilsingi's answer as correct, since I believe he already understood this relationship.