Is it a correct pattern to build a composite primary key using wide-column stores?

Question

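HBase and Cassandra are built as wide-column stores and use the concepts of both rows and columns.

A row is composed of a key, similar to the concept of a primary key in an RDBMS, and a value composed of several columns.

This can be represented as follows: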

*******|    Key     |                   Value
-------+------------+-------------+------------------------------------------
Columns|            |     name    |                 value
-------+------------+-------------+------------------------------------------
       |     a      |   title     | "Building a python graphdb in one night"
       |     b      |   body      | "You maybe already know that I am..."
       |     c      | publishedat |              "2015-08-23"
       |     d      |   name      |                database

       |     e      |   start     |                   1
       |     f      |    end      |                   2

            ...          ...                         ...

       |    u       |   title     |     "key/value store key composition"

            ...          ...                         ...

       |    x       |   title     |    "building a graphdb with HappyBase"

            ...          ...                         ...

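Is it correct to build a composite primary key at the application layer, so that colocated rows can be iterated over quickly?

This could be represented as follows: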

*******|           Key            |                 Value
-------+------------+-------------+------------------------------------------
Columns| identifier |  name       |                 value
-------+------------+-------------+------------------------------------------
       |     1      |   title     | "Building a python graphdb in one night"
       |     1      |   body      | "You maybe already know that I am..."
       |     1      | publishedat |              "2015-08-23"
       |     2      |   name      |                database

       |     3      |   start     |                   1
       |     3      |    end      |                   2

            ...          ...                         ...

       |     4      |   title     |     "key/value store key composition"

            ...          ...                         ...

       |     42     |   title     |    "building a graphdb with HappyBase"

            ...          ...                         ...

The name column moves from the Value to the Key, and the Value is left with a single column named value.
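To make the second layout concrete, here is a minimal sketch in plain Python. It is not HBase or Cassandra code: a sorted list stands in for the ordered store, and the helper name `scan` is made up for illustration. It only shows why keying on (identifier, name) keeps rows with the same identifier next to each other.

```python
from bisect import bisect_left

# Toy stand-in for an ordered store: rows kept sorted by the composite key
# (identifier, name). Sample rows are taken from the table above.
rows = sorted(
    [
        ((1, "title"), "Building a python graphdb in one night"),
        ((1, "body"), "You maybe already know that I am..."),
        ((1, "publishedat"), "2015-08-23"),
        ((2, "name"), "database"),
        ((3, "start"), 1),
        ((3, "end"), 2),
    ],
    key=lambda row: row[0],
)
keys = [key for key, _ in rows]


def scan(identifier):
    """Yield (name, value) for every row whose key starts with identifier."""
    # (identifier,) sorts before every (identifier, name), so bisect_left
    # lands on the first row of that identifier, if there is one.
    start = bisect_left(keys, (identifier,))
    for key, value in rows[start:]:
        if key[0] != identifier:
            break  # rows are colocated, so the first mismatch ends the scan
        yield key[1], value


print(dict(scan(3)))
```

The scan touches only the rows of one identifier and stops at the first key that belongs to another one, which is the "fast iteration over colocated rows" I am after.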

Answer 1

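Composite keys are used all the time when designing Cassandra schemas.

In C*, the key is split into two parts: the partition key and the clustering columns.

The partition key is used to hash data to nodes in the cluster. A partition is a bucket of data that can hold a single row or, depending on the clustering columns, many rows. Data within a partition is local to a node and is kept in sorted order by clustering key, which makes data access within a partition fast and efficient and supports range queries on the clustering key.

C* also allows data fields that are not part of the composite key; these are usually not used in queries unless you create a secondary index on them.

The term "wide column" is a bit dated for C*. In the current CQL view of things, data is thought of in more traditional terms as rows in a table, with those rows grouped into partitions for efficient access.

So, to answer your question: yes, in C* it is very common to take what would be considered a data column in an RDBMS and make it part of the composite key.

For more on partition keys and clustering columns, and how they affect the kinds of queries you can run, see "A deep look at the CQL WHERE clause".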
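As a rough illustration of that split, here is a toy model in plain Python. It is not Cassandra code and does not use any driver. The names `node_for`, `insert` and `range_query` are made up, CRC32 is only a stand-in for the real partitioner, and the number of nodes is arbitrary. It uses the question's identifier as the partition key and name as the clustering column.

```python
from bisect import bisect_left, bisect_right, insort
from zlib import crc32

NODES = 3  # arbitrary cluster size for the sketch
# partition key -> {"keys": sorted clustering keys, "values": key -> value}
partitions = {}


def node_for(partition_key):
    """Pick a node by hashing the partition key (CRC32 is only a stand-in)."""
    return crc32(str(partition_key).encode("utf-8")) % NODES


def insert(partition_key, clustering_key, value):
    part = partitions.setdefault(partition_key, {"keys": [], "values": {}})
    if clustering_key not in part["values"]:
        insort(part["keys"], clustering_key)  # keep clustering keys sorted
    part["values"][clustering_key] = value


def range_query(partition_key, low, high):
    """Return rows of one partition with low <= clustering key <= high."""
    part = partitions.get(partition_key)
    if part is None:
        return []
    keys = part["keys"]
    selected = keys[bisect_left(keys, low):bisect_right(keys, high)]
    return [(key, part["values"][key]) for key in selected]


insert(1, "title", "Building a python graphdb in one night")
insert(1, "body", "You maybe already know that I am...")
insert(1, "publishedat", "2015-08-23")
insert(2, "name", "database")

print(node_for(1), range_query(1, "b", "q"))
```

The point of the sketch is that the hash only decides where a partition lives, while the sorted clustering keys inside the partition are what make the range query cheap.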

Answer 2

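Composite keys are very popular in HBase schema design. They also allow fast range scans on the prefix components of the row key. Unlike in Cassandra, the row key is not broken up into parts when the data is stored.

A simple example: http://riteshadval.blogspot.com/2012/03/hbase-composite-row-key-design-doing.html

In HBase, with your example, you would be able to do range scans with the identifier only, and also with identifier + name.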
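Since the row key is a single byte string sorted lexicographically, how you encode the components matters. The sketch below is plain Python with no HBase connection. The helpers `row_key`, `naive_key` and `prefix_scan` are hypothetical, and the 4-byte big-endian identifier is just one possible encoding. It shows both kinds of prefix scan, and why a naive string key such as "4:title" is risky.

```python
import struct


def row_key(identifier, name=""):
    """Fixed-width big-endian identifier followed by the column name."""
    return struct.pack(">I", identifier) + name.encode("utf-8")


def naive_key(identifier, name=""):
    """Variable-width string key, shown only for comparison."""
    return f"{identifier}:{name}".encode("utf-8") if name else str(identifier).encode("utf-8")


def prefix_scan(sorted_keys, prefix):
    """Toy prefix scan: keep the keys that start with the given prefix."""
    return [key for key in sorted_keys if key.startswith(prefix)]


data = [(1, "title"), (1, "body"), (3, "start"), (3, "end"), (4, "title"), (42, "title")]
keys = sorted(row_key(i, n) for i, n in data)
naive = sorted(naive_key(i, n) for i, n in data)

# Scan with the identifier only, then with identifier + name.
print(prefix_scan(keys, row_key(3)))
print(prefix_scan(keys, row_key(3, "start")))

# With the naive encoding, the prefix for identifier 4 also matches 42.
print(prefix_scan(naive, naive_key(4)))
```

With HappyBase, I believe the real equivalent of `prefix_scan` would be `table.scan(row_prefix=...)`, but check that against the library documentation before relying on it.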


hbase

bigtable

cassandra

happybase

wide-column-store