Range query on secondary index in Cassandra
I am using Cassandra 2.1.10.
To be clear up front, I know that secondary indexes are an anti-pattern in Cassandra. But for testing purposes I am trying the following:
CREATE TABLE test_topology1.tt (
a text PRIMARY KEY,
b timestamp
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX idx_tt ON test_topology1.tt (b);
When I run the following query, I get an error.
cqlsh:test_topology1> Select * from tt where b>='2016-04-29 18:00:00' ALLOW FILTERING;
InvalidRequest: code=2200 [Invalid query] message="No secondary indexes on the restricted columns support the provided operators: 'b >= <value>'"
However, this blog says that ALLOW FILTERING can be used to query on secondary indexes.
Cassandra is installed on a Windows machine.
This will get you the result you want. Use b as a clustering column.
CREATE TABLE test_topology1.tt (
a text,
b timestamp,
PRIMARY KEY (a, b)
);
select * from tt where b>='2016-04-29 18:00:00' ALLOW FILTERING;
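As a rough illustration of why a clustering column suits range queries, here is a toy Python model. It is not Cassandra code: `ToyTable` and its method names are hypothetical, invented for this sketch, and timestamps are plain strings that happen to sort chronologically. The idea it shows is that rows inside a partition are kept sorted by b, so a range restricted to one partition is a binary search, while a query that does not restrict a has to visit every partition.

```python
from bisect import bisect_left
from collections import defaultdict


class ToyTable:
    """Toy model of PRIMARY KEY (a, b): partitions keyed by a, rows sorted by b."""

    def __init__(self):
        self.partitions = defaultdict(list)  # a -> sorted list of b values

    def insert(self, a, b):
        rows = self.partitions[a]
        i = bisect_left(rows, b)
        # The same primary key is an upsert, so duplicates are not stored.
        if i == len(rows) or rows[i] != b:
            rows.insert(i, b)

    def range_in_partition(self, a, b_min):
        """Like: WHERE a = ? AND b >= ?  (binary search inside one partition)."""
        rows = self.partitions.get(a, [])
        return [(a, b) for b in rows[bisect_left(rows, b_min):]]

    def range_all_partitions(self, b_min):
        """Like: WHERE b >= ? ALLOW FILTERING  (every partition is visited)."""
        result = []
        # Partitions are sorted here only to make the output deterministic;
        # a real cluster orders partitions by token, not by key.
        for a in sorted(self.partitions):
            result.extend(self.range_in_partition(a, b_min))
        return result


table = ToyTable()
table.insert("x", "2016-04-29 17:00:00")
table.insert("x", "2016-04-29 19:00:00")
table.insert("y", "2016-04-29 18:00:00")

one_partition = table.range_in_partition("x", "2016-04-29 18:00:00")
all_partitions = table.range_all_partitions("2016-04-29 18:00:00")
print(one_partition)
print(all_partitions)
```

If the partition key is known, adding a = ... to the WHERE clause keeps the query inside a single partition and ALLOW FILTERING should no longer be needed.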
Range queries DO work with a secondary index when using ALLOW FILTERING
cqlsh:spark_demo> create table tt (
... a text PRIMARY KEY,
... b timestamp
... );
cqlsh:spark_demo> CREATE INDEX ON tt(b);
cqlsh:spark_demo> SELECT * FROM tt WHERE b >= '2016-03-01 12:00:00+0000';
InvalidRequest: code=2200 [Invalid query] message="No supported secondary index found for the non primary key columns restrictions"
cqlsh:spark_demo> SELECT * FROM tt WHERE b >= '2016-03-01 12:00:00+0000' ALLOW FILTERING;
a | b
---+---
(0 rows)
cqlsh:spark_demo>
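In Cassandra up to 2.2.x, range queries on secondary index columns are not allowed. However, as the post A deep look at the CQL WHERE clause points out, they are allowed on non-indexed columns if filtering is allowed: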
Direct queries on secondary indices support only =, CONTAINS or
CONTAINS KEY restrictions.
[..]
Secondary index queries allow you to restrict the returned results
using the =, >, >=, <= and <, CONTAINS and CONTAINS KEY restrictions
on non-indexed columns using filtering.
So, given the table structure and index
CREATE TABLE test_secondary_index (
a text PRIMARY KEY,
b timestamp,
c timestamp
);
CREATE INDEX idx_inequality_test ON test_secondary_index (b);
the following query fails because the inequality test is done on the indexed column:
SELECT * FROM test_secondary_index WHERE b >= '2016-04-29 18:00:00' ALLOW FILTERING ;
InvalidRequest: code=2200 [Invalid query] message="No secondary indexes on the restricted columns support the provided operators: 'b >= <value>'"
But the following works because the inequality test is done on the non-indexed column:
SELECT * FROM test_secondary_index WHERE b = '2016-04-29 18:00:00' AND c >= '2016-04-29 18:00:00' ALLOW FILTERING ;
a | b | c
---+---+---
(0 rows)
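If you add another index on column c, this still works but still requires the ALLOW FILTERING clause, which to me implies that the index on column c is not used in this scenario.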
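To make that two-step behavior concrete, here is a toy Python model of how I read it. It is a sketch under my own assumptions, not how Cassandra is implemented: `ToyIndexedTable` and its methods are hypothetical names, and timestamps are plain strings that sort chronologically. The index on b answers the equality restriction, and the range restriction on c is then checked row by row on the candidates, which is the filtering part.

```python
from collections import defaultdict


class ToyIndexedTable:
    """Toy model of a table (a PRIMARY KEY, b, c) with a secondary index on b."""

    def __init__(self):
        self.rows = {}                      # a -> (b, c)
        self.index_on_b = defaultdict(set)  # b -> set of primary keys a

    def insert(self, a, b, c):
        old = self.rows.get(a)
        if old is not None:
            self.index_on_b[old[0]].discard(a)  # drop the stale index entry
        self.rows[a] = (b, c)
        self.index_on_b[b].add(a)

    def select(self, b_eq, c_min):
        """Like: WHERE b = ? AND c >= ? ALLOW FILTERING."""
        # Step 1: the index narrows the candidates with an equality lookup.
        candidates = self.index_on_b.get(b_eq, set())
        # Step 2: the restriction on c is checked on each candidate row.
        return sorted(
            (a, *self.rows[a]) for a in candidates if self.rows[a][1] >= c_min
        )


t = ToyIndexedTable()
t.insert("k1", "2016-04-29 18:00:00", "2016-04-29 17:00:00")
t.insert("k2", "2016-04-29 18:00:00", "2016-04-29 19:00:00")
t.insert("k3", "2016-04-30 00:00:00", "2016-04-29 20:00:00")

matches = t.select("2016-04-29 18:00:00", "2016-04-29 18:00:00")
print(matches)
```

In this model a hash-style index like `index_on_b` can only answer "b equals this value", which is one way to picture why the b >= ... query above is rejected while the b = ... AND c >= ... query is accepted.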