PostgreSQL multi-column index including arrays
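The documentation recommends a GIN index for array columns. However, I want to query on a combination of that column and a boolean column, and I can't add the boolean to the index because GIN doesn't support that type. Am I better off (a) creating a separate index for the boolean column, (b) using a different index type (which one?), or (c) not indexing the boolean column at all? In my case the result set from searching the array column index will contain only a few rows, so the query optimizer would only need a handful of comparisons to find the matching boolean among them.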
create table foo (
id integer generated by default as identity primary key,
...
bar bool not null, -- TODO: Separate index? Cannot include bool in GIN index
...
baz smallint[] not null);
create index foo_baz_idx on foo using gin (baz);
Most queries will be of the form select * from foo where X = any(baz) and bar = Y, and the search on X will match at most a handful of rows.
It really depends on the nature of your data. If where X = any(baz) yields only a few rows, there is no need to index bar.
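One way to see which case applies is to look at the plan for a representative query. The sketch below is only an illustration: the value 3 is a hypothetical stand-in for X, and the predicate is written as a containment test because, as far as I know, the default GIN operator class for arrays covers @>, &&, <@ and =, but not the X = any(baz) spelling.

```sql
-- Hypothetical example: 3 stands in for X, true for Y.
-- If the plan shows a bitmap scan on foo_baz_idx with a small row estimate,
-- bar is applied as a cheap filter on those few rows.
explain
select *
from foo
where baz @> array[3]::smallint[]  -- containment form, usable by the GIN index
  and bar = true;
```

Running it with explain analyze instead would also report actual row counts, at the cost of executing the query.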
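If where X = any(baz) yields a large number of rows, a separate index on bar might help; it gives the query planner more options. But since bar is a boolean, you could instead partition the table on bar. Then every query with where bar = ? is effectively indexed on that column. From the PostgreSQL documentation on partitioning: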
Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.
When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index and random access reads scattered across the whole table.
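A minimal sketch of what partitioning on bar could look like, assuming PostgreSQL 11 or later (declarative partitioning with indexes created on the parent). The names foo_part, foo_part_true and foo_part_false are hypothetical, and the columns elided in the question are left out here. Note that a primary key on a partitioned table has to include the partition key, so the id column is omitted from this sketch.

```sql
-- Hypothetical partitioned variant of foo, reduced to the two relevant columns.
create table foo_part (
    bar bool not null,
    baz smallint[] not null
) partition by list (bar);

-- One partition per boolean value.
create table foo_part_true  partition of foo_part for values in (true);
create table foo_part_false partition of foo_part for values in (false);

-- Creating the index on the parent creates a matching GIN index on each partition.
create index foo_part_baz_idx on foo_part using gin (baz);

-- With bar = true the planner can prune to foo_part_true and then use that
-- partition's GIN index; 3 is a stand-in for X.
select *
from foo_part
where baz @> array[3]::smallint[]
  and bar = true;
```

Whether this beats a plain filter on bar depends on how many rows the array search returns, so it is worth comparing plans on real data before committing to the extra structure.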