What kind of index should I create to make "WHERE col1 LIKE '0000%' AND col2 = 'somevalue'" faster?
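I tried the following query to search within a quadtree using PostgreSQL's LIKE operator. The col3 column holds strings such as '0133002112300300320', each of which describes a path in the quadtree.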
CREATE TABLE table1
(col1 CHARACTER(9) NOT NULL,
col2 INTEGER NOT NULL,
col3 CHARACTER VARYING(64),
col4 INTEGER NOT NULL,
col5 DOUBLE PRECISION NOT NULL,
PRIMARY KEY(col1,col2,col3));
-- Performs sequential search
SELECT col1,col2,col3,col4,col5
FROM table1
WHERE col1='somevalue' AND col2=0 AND col3 LIKE '01330021123003003%';
The problem is that the PRIMARY KEY index I defined is not used for WHERE col1='somevalue' AND col2=0 AND col3 LIKE '01330021123003003%'.
It seems that I can't use the LIKE operator together with the AND operator if I want the created index to be used.
Is there a special index I can create to make the SELECT faster?
The index can be used in this case. Here is how, with your exact table and your exact query, using 100,000 rows of evenly distributed random content:
insert into table1 select
(random()*10000)::int,
(random()*10000)::int,
md5(random()::text),
0,0
from generate_series(1,100000);
ANALYZE table1;
EXPLAIN ANALYZE SELECT col1,col2,col3,col4,col5
FROM table1
WHERE col1='somevalue' AND col2=0 AND col3 LIKE '01330021123003003%';
Result:
Index Scan using table1_pkey on table1 (cost=0.00..8.32 rows=1 width=59) (actual time=0.022..0.022 rows=0 loops=1)
Index Cond: ((col1 = 'somevalue'::bpchar) AND (col2 = 0))
Filter: ((col3)::text ~~ '01330021123003003%'::text)
Total runtime: 0.050 ms
(4 rows)
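"Index Scan using table1_pkey" shows that the index is used for the query.
If that is not what you see for your dataset, the most likely reason is that you are searching for values that are too common.
The first problem is that you are not applying the pattern-matching expression to a text column. It is better to make col3 of type text.
The second point is how the index is created. To use an index with a pattern-matching expression, you have to create the index in a special way. See:
http://www.postgresql.org/docs/9.1/static/indexes-opclass.html
Here is an example: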
-- First, generate example data (10 million records):
drop table if exists tmp_example_record;
create table tmp_example_record as
select
id,
floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||
floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||
floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||
floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text||floor(random()*4)::text as quad_tree_path
from generate_series(1,10000000) id;
-- Create a copy of quad_tree_path; this copy gets the index type suited to pattern matching
alter table tmp_example_record add column quad_tree_path_copy text;
update tmp_example_record set quad_tree_path_copy = quad_tree_path;
-- Create the index with a special operator class
CREATE INDEX tmp_example_record_quad_tree_path_copy_index ON tmp_example_record (quad_tree_path_copy varchar_pattern_ops);
-- Indexed column: about 10 ms
explain analyze
select * from tmp_example_record where quad_tree_path_copy like '212013223122333%';
/*
"Index Scan using tmp_example_record_quad_tree_path_copy_index on tmp_example_record (cost=0.56..8.58 rows=1000 width=86)"
" Index Cond: ((quad_tree_path_copy ~>=~ '212013223122333'::text) AND (quad_tree_path_copy ~<~ '212013223122334'::text))"
" Filter: (quad_tree_path_copy ~~ '212013223122333%'::text)"
*/
-- Unindexed column: more than 2000 ms
explain analyze
select * from tmp_example_record where quad_tree_path like '212013223122333%';
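Applied to the table from the question, the same idea might look like the sketch below. It assumes the database uses a non-C collation, which is the situation where a default B-tree index cannot serve LIKE prefix matches. The index name table1_col1_col2_col3_pattern_idx is made up for illustration, and varchar_pattern_ops is chosen because col3 is declared CHARACTER VARYING.

```sql
-- Rough check of the database collation (assumes the libc provider).
-- Under "C" the default operator class already supports LIKE 'prefix%'.
SELECT datcollate FROM pg_database WHERE datname = current_database();

-- Hypothetical index name. The equality columns come first,
-- followed by the prefix-matched column with a pattern operator class.
CREATE INDEX table1_col1_col2_col3_pattern_idx
    ON table1 (col1, col2, col3 varchar_pattern_ops);

-- Re-run the query from the question and inspect the plan.
EXPLAIN ANALYZE
SELECT col1, col2, col3, col4, col5
FROM table1
WHERE col1 = 'somevalue' AND col2 = 0 AND col3 LIKE '01330021123003003%';
```

If the planner picks this index, the col3 prefix should show up as range conditions (~>=~ and ~<~) inside Index Cond instead of appearing only under Filter. Whether it is actually chosen depends on the data distribution and table statistics.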