How to model a table to run queries based on a status field that changes over time

Question

Currently we have a table that we query by shipment_id. In the future we will need to query by the status field. Current table:

CREATE TABLE shipment ( 
    shipment_id text,
    tenant_id text,
    actual_arrival_time text,
    actual_dep_time text,
    email_ids set<text>,
    is_deleted boolean,
    modified_by text,
    modified_time timestamp,
    planned_arrival_time text,
    planned_dep_time text,
    route_id text,
    shipment_departure_date text,
    status_code text,
    PRIMARY KEY (shipment_id, tenant_id) 
); 

CREATE INDEX shipment_id_index ON shipment (tenant_id);

Current queries

1) SELECT * FROM shipment WHERE tenant_id=?0 ALLOW FILTERING;

2) SELECT * FROM shipment WHERE shipment_id=?0 and tenant_id=?1;

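Pending/future queries

3) List of shipment IDs for a given status code as of now: SELECT * FROM shipment WHERE tenant_id = 'y' and status_code = 'x';

4) List of shipment IDs for a given status code over the last week

5) List of shipment IDs that were delayed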

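The table above may hold 10-15 different tenants, with one row per (shipment_id, tenant_id) pair. The status_code changes over time as the shipment progresses, through values such as Shipment_started, shipment_progress, shipment_delayed, shipment_delayed_completed and shipment_completed. Each shipment goes through 3-5 statuses during its life cycle, and the current table is updated only when the status of a given shipment_id changes.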

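I need to create a new table to serve the following queries:

3) List of shipments for a given tenant whose status_code = 'x' as of now

4) List of shipments for a given tenant whose status_code = 'x' over the last week

5) List of shipments that were delayed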

Answer 1

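In Cassandra you model tables around your queries, so you can effectively create one table for each query you need to run. Also, ALLOW FILTERING should be used only for development and testing, not in a production application (see the answer here: ).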
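With one table per query, the application has to write each shipment change to every one of those tables. The following is a minimal sketch of building those writes in Python, not part of the original answer. The table names shipment_by_tenant, shipment_by_id and shipment_by_status and the helper build_inserts are hypothetical, and the statements use ? bind markers on the assumption that they are prepared before execution.

```python
# Hypothetical table names, one per query pattern (assumption, not from the answer).
QUERY_TABLES = ("shipment_by_tenant", "shipment_by_id", "shipment_by_status")


def build_inserts(row: dict) -> list:
    """Return one (cql, params) pair per query table for the given row."""
    columns = sorted(row)  # fixed column order so markers and params line up
    column_list = ", ".join(columns)
    markers = ", ".join("?" for _ in columns)
    params = tuple(row[name] for name in columns)
    return [
        (f"INSERT INTO {table} ({column_list}) VALUES ({markers})", params)
        for table in QUERY_TABLES
    ]
```

In a table whose partition key includes status_code, a status change lands in a different partition, so the row stored under the previous status would also need an explicit DELETE.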

So, for each of the cases/queries you mentioned, I suggest the following:

1) SELECT * FROM shipment where tenant_id=?0 ALLOW FILTERING;

This should be solved by the following table:

CREATE TABLE shipment ( 
    tenant_id text,
    shipment_id text,
    actual_arrival_time text,
    actual_dep_time text,
    email_ids set<text>,
    is_deleted boolean,
    modified_by text,
    modified_time timestamp,
    planned_arrival_time text,
    planned_dep_time text,
    route_id text,
    shipment_departure_date text,
    status_code text,
    PRIMARY KEY (tenant_id, shipment_id) 
);

Here tenant_id is the partition key, so if you run the query SELECT * FROM shipment where tenant_id='x'; you no longer need ALLOW FILTERING.

Update: I also added shipment_id to the primary key to handle the cardinality issue in case tenant_id is not unique. The primary key therefore consists of tenant_id and shipment_id, which avoids overwriting records that share the same tenant_id. As per @Himanshu Ahire's comment.

2) SELECT * FROM shipment WHERE shipment_id='x' and tenant_id='y';

This should be solved by the following table:

CREATE TABLE shipment ( 
    shipment_id text,
    tenant_id text,
    actual_arrival_time text,
    actual_dep_time text,
    email_ids set<text>,
    is_deleted boolean,
    modified_by text,
    modified_time timestamp,
    planned_arrival_time text,
    planned_dep_time text,
    route_id text,
    shipment_departure_date text,
    status_code text,
    PRIMARY KEY ((shipment_id, tenant_id)) 
);

Here shipment_id and tenant_id together form a composite partition key.

3) SELECT * FROM shipment WHERE tenant_id = 'y' and status_code = 'x';

4) List of shipment IDs for a given status code over the last week

5) List of shipment IDs that were delayed

These should be solved by the following table:

CREATE TABLE shipment (
    status_code text,
    tenant_id text,
    shipment_id text,
    actual_arrival_time text,
    actual_dep_time text,
    email_ids set<text>,
    is_deleted boolean,
    modified_by text,
    modified_time timestamp,
    planned_arrival_time text,
    planned_dep_time text,
    route_id text,
    shipment_departure_date text,
    PRIMARY KEY ((tenant_id, status_code), actual_arrival_time) 
) WITH CLUSTERING ORDER BY (actual_arrival_time DESC);

Here, too, you should use tenant_id and status_code together as the composite partition key and actual_arrival_time as the clustering column, so that you can easily write queries such as:

3) SELECT * FROM shipment WHERE tenant_id = 'y' and status_code = 'x';

4) SELECT * FROM shipment WHERE tenant_id = 'y' and status_code = 'x' and actual_arrival_time >= 'date of last week';

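5) SELECT * FROM shipment WHERE tenant_id = 'y' and status_code = 'shipment_delayed';

Note that CQL cannot compare two columns with each other in a WHERE clause, so a condition such as actual_arrival_time > planned_arrival_time has to be checked in the application after the rows are read.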
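The comparison between actual_arrival_time and planned_arrival_time for query 5 can be done in application code once the rows have been fetched. The following is a minimal sketch in Python, not part of the original answer. It assumes both columns hold UTC timestamps stored as text in the form 2024-05-01T10:00:00Z and that each row is available as a dict; the names TIME_FORMAT, parse_time and delayed_shipment_ids are hypothetical.

```python
from datetime import datetime

# Assumed text format of the *_arrival_time columns (not stated in the question).
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_time(value):
    """Parse a stored timestamp string; return None when the value is missing."""
    return datetime.strptime(value, TIME_FORMAT) if value else None


def delayed_shipment_ids(rows):
    """Return the IDs of shipments that arrived later than planned."""
    delayed = []
    for row in rows:
        actual = parse_time(row.get("actual_arrival_time"))
        planned = parse_time(row.get("planned_arrival_time"))
        # Shipments without both timestamps cannot be judged and are skipped.
        if actual is not None and planned is not None and actual > planned:
            delayed.append(row["shipment_id"])
    return delayed
```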

Just a note on query 4: you can send last week's date from your application code or by using CQL functions.
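Computing last week's date in application code might look like the sketch below (Python, not part of the original answer). Because actual_arrival_time is a text column, the range comparison is lexicographic, so the sketch assumes the stored values use a fixed-width UTC format such as 2024-05-01T12:00:00Z, which sorts in chronological order. The function name last_week_cutoff is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def last_week_cutoff(now=None):
    """Return the instant seven days before `now` as a fixed-width UTC string.

    `now` should be a timezone-aware datetime; it defaults to the current time.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now.astimezone(timezone.utc) - timedelta(days=7)
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
```

The returned string would then be bound as the actual_arrival_time value in query 4.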


cassandra

cassandra-2.0

cassandra-3.0
