Partition key column in Cassandra

Question

I want to understand exactly what will improve my performance if I go with the following partitioning strategy.

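Say I have a table, songs, and I want to define the artist as the partition key. This table will grow gradually. Today I have 25 artists with 5 songs each (125 rows in total), but over time I foresee 500 artists with 5 songs each (2,500 rows in total). I want to use the artist ID as the partition key because CQL requires the partition key in the WHERE clause, and in my UI the artist ID is the only value I have available for displaying those 5 songs.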

Also, what if I start with 2 Cassandra nodes today, eventually grow to 4 nodes, and later to 10 nodes? Can I continue to have the same partition key as I grow?

Here is my table structure:

ArtistId (partition key)  |  SongId  |  Song
--------------------------------------------
1                         | 1        |  abc
1                         | 2        |  cde
1                         | 3        |  fgh
2                         | 4        |  ijk
2                         | 5        |  lmn
1                         | 6        |  opq
1                         | 7        |  rst

Answer 1

Also, what if I start with 2 cassandra nodes today and eventually grow to 4 nodes and then later 10 nodes. Can I continue to have the same partition key as I grow?

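Yes, you can keep your partition key. Adding nodes changes which node owns which range of partition-key tokens; it does not change the table's schema or the queries you run against it.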

I want to understand exactly what will improve my performance if I decide to go with following strategy for partition

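To clarify: a primary key can be a single column or a composite of several columns. A composite primary key is made up of a partition key and one or more clustering key[s].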

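Since you are talking about the artist as the partition key, that will be your row key, and I assume the song will be your clustering key.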

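The partition key determines how rows are distributed across the nodes; the clustering key determines the order in which rows are stored within a partition.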
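The key layout described above could be written in CQL roughly as follows. This is a minimal sketch, not the asker's actual schema: the table name `songs`, the column names `artist_id`, `song_id`, and `song`, and the column types are assumptions inferred from the table in the question, and it assumes a keyspace has already been selected with `USE`.

```cql
-- Hypothetical schema: names and types are assumed from the question's table.
CREATE TABLE IF NOT EXISTS songs (
    artist_id int,    -- partition key: decides which node(s) store the row
    song_id   int,    -- clustering key: decides row order inside the partition
    song      text,
    PRIMARY KEY ((artist_id), song_id)
) WITH CLUSTERING ORDER BY (song_id ASC);

-- A few of the sample rows from the question.
INSERT INTO songs (artist_id, song_id, song) VALUES (1, 1, 'abc');
INSERT INTO songs (artist_id, song_id, song) VALUES (1, 2, 'cde');
INSERT INTO songs (artist_id, song_id, song) VALUES (2, 4, 'ijk');
```

The inner parentheses in `PRIMARY KEY ((artist_id), song_id)` mark `artist_id` as the partition key; every column listed after it is a clustering column.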

According to the CQL documentation:

all the rows sharing the same partition key (even across table in fact) are stored on the same physical node

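This makes lookups by artist very efficient: a query that specifies the partition key only has to contact the node (or replicas) holding that partition, rather than involving every node in the cluster, so the rows are found faster.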
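To illustrate the difference, here is a hedged sketch of the kinds of queries involved. It assumes a hypothetical table `songs` whose primary key is `((artist_id), song_id)` with an extra `song` text column; those names are illustrative and are not given in the question.

```cql
-- Single-partition read: artist_id is the (assumed) partition key, so the
-- coordinator can route this to the replicas that own that one partition.
SELECT song_id, song FROM songs WHERE artist_id = 1;

-- Narrowing further by the (assumed) clustering column stays inside
-- the same partition.
SELECT song FROM songs WHERE artist_id = 1 AND song_id = 3;

-- No partition key in the WHERE clause: Cassandra rejects this unless
-- ALLOW FILTERING is added, because it may have to scan every partition.
SELECT artist_id, song FROM songs WHERE song_id = 4 ALLOW FILTERING;
```

With a handful of songs per artist the first form reads one small partition, which is the access pattern the UI in the question needs. The last form is shown only for contrast and is generally something to avoid on large tables.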
