Querying Postgres Table with JSONB data
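I have a table that stores data in a JSONB column.
Now, what I want to do is query that table and fetch the records that have specific values for a key.
This works fine: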
SELECT "documents".*
FROM "documents"
WHERE (data @> '{"type": "foo"}')
But what I want to do is get all the rows in the table whose type is foo or bar.
I tried this:
SELECT "documents".*
FROM "documents"
WHERE (data @> '{"type": ["foo", "bar"]}')
But this doesn't seem to work.
I also tried this:
SELECT "documents".*
FROM "documents"
WHERE (data->'type' ?| array['foo', 'bar'])
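This works, but if I specify a key like so, data->'type', it takes away the dynamicity of the query.
By the way, I am using Ruby on Rails with Postgres, so all the queries go through ActiveRecord. This is how: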
Document.where("data @> ?", query)
if I specify a key like so data->'type' it takes away the dynamicity of the query.
I understand that you have a GIN index on the column data, defined like this:
CREATE INDEX ON documents USING GIN (data);
The index is applicable to this query:
EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE data @> '{"type": "foo"}';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on documents (cost=30.32..857.00 rows=300 width=25) (actual time=0.639..0.640 rows=1 loops=1)
Recheck Cond: (data @> '{"type": "foo"}'::jsonb)
Heap Blocks: exact=1
-> Bitmap Index Scan on documents_data_idx (cost=0.00..30.25 rows=300 width=0) (actual time=0.581..0.581 rows=1 loops=1)
Index Cond: (data @> '{"type": "foo"}'::jsonb)
Planning time: 7.928 ms
Execution time: 0.841 ms
but not to this one:
EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE (data->'type' ?| array['foo', 'bar']);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Seq Scan on documents (cost=0.00..6702.98 rows=300 width=25) (actual time=31.895..92.813 rows=2 loops=1)
Filter: ((data -> 'type'::text) ?| '{foo,bar}'::text[])
Rows Removed by Filter: 299997
Planning time: 1.836 ms
Execution time: 92.839 ms
Solution 1. Use the operator @> twice; the index will be used for both conditions:
EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE data @> '{"type": "foo"}'
OR data @> '{"type": "bar"}';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on documents (cost=60.80..1408.13 rows=600 width=25) (actual time=0.222..0.233 rows=2 loops=1)
Recheck Cond: ((data @> '{"type": "foo"}'::jsonb) OR (data @> '{"type": "bar"}'::jsonb))
Heap Blocks: exact=2
-> BitmapOr (cost=60.80..60.80 rows=600 width=0) (actual time=0.204..0.204 rows=0 loops=1)
-> Bitmap Index Scan on documents_data_idx (cost=0.00..30.25 rows=300 width=0) (actual time=0.144..0.144 rows=1 loops=1)
Index Cond: (data @> '{"type": "foo"}'::jsonb)
-> Bitmap Index Scan on documents_data_idx (cost=0.00..30.25 rows=300 width=0) (actual time=0.059..0.059 rows=1 loops=1)
Index Cond: (data @> '{"type": "bar"}'::jsonb)
Planning time: 3.170 ms
Execution time: 0.289 ms
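Since the queries go through ActiveRecord, the OR condition from Solution 1 could be assembled dynamically on the Ruby side. The following is only a minimal sketch: the helper name or_containment is hypothetical, and the commented-out Document.where call assumes the Document model from the question and is not executed here.

```ruby
require "json"

# Hypothetical helper (not part of Rails): builds one "data @> ?" clause
# per accepted value and joins the clauses with OR, as in Solution 1.
# Returns the SQL fragment followed by the JSON bind values.
def or_containment(key, values)
  sql = values.map { "data @> ?" }.join(" OR ")
  binds = values.map { |value| JSON.generate(key => value) }
  [sql, *binds]
end

condition = or_containment("type", %w[foo bar])
# condition[0] is the SQL fragment, the remaining elements are the binds.

# Assumed usage inside a Rails app (needs a database, so it is not run here):
#   Document.where(*condition)
```

The key and the list of values stay ordinary Ruby arguments, so the query remains dynamic while every condition is still a plain @> containment test that the GIN index on data can serve. An empty list of values would produce an empty SQL fragment, so a caller would need to handle that case separately.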
Solution 2. Create an additional index on (data->'type'):
CREATE INDEX ON documents USING GIN ((data->'type'));
EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE (data->'type' ?| array['foo', 'bar']);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on documents (cost=30.32..857.75 rows=300 width=25) (actual time=0.056..0.067 rows=2 loops=1)
Recheck Cond: ((data -> 'type'::text) ?| '{foo,bar}'::text[])
Heap Blocks: exact=2
-> Bitmap Index Scan on documents_expr_idx (cost=0.00..30.25 rows=300 width=0) (actual time=0.035..0.035 rows=2 loops=1)
Index Cond: ((data -> 'type'::text) ?| '{foo,bar}'::text[])
Planning time: 2.951 ms
Execution time: 0.108 ms
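Solution 3. This is actually a variant of Solution 1, with a different format of the condition that may be more convenient for a client program: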
EXPLAIN ANALYSE
SELECT "documents".*
FROM "documents"
WHERE data @> any(array['{"type": "foo"}', '{"type": "bar"}']::jsonb[]);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on documents (cost=60.65..1544.20 rows=600 width=26) (actual time=0.803..0.819 rows=2 loops=1)
Recheck Cond: (data @> ANY ('{"{\"type\": \"foo\"}","{\"type\": \"bar\"}"}'::jsonb[]))
Heap Blocks: exact=2
-> Bitmap Index Scan on documents_data_idx (cost=0.00..60.50 rows=600 width=0) (actual time=0.778..0.778 rows=2 loops=1)
Index Cond: (data @> ANY ('{"{\"type\": \"foo\"}","{\"type\": \"bar\"}"}'::jsonb[]))
Planning time: 2.080 ms
Execution time: 0.304 ms
(7 rows)
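On the Rails side, the array of patterns for Solution 3 might be produced like this. This is a sketch under assumptions: containment_patterns is a hypothetical helper, and the commented-out where call relies on ActiveRecord expanding an array bound to ? into a comma-separated list of quoted values, which should be verified against the Rails version in use.

```ruby
require "json"

# Hypothetical helper: turns one key and several accepted values into the
# JSON patterns that Solution 3 passes to "data @> ANY (...)".
def containment_patterns(key, values)
  values.map { |value| JSON.generate(key => value) }
end

patterns = containment_patterns("type", ["foo", "bar"])

# Assumed usage inside a Rails app (needs a database, so it is not run here):
#   Document.where("data @> ANY (ARRAY[?]::jsonb[])", patterns)
```

Compared with building an OR chain, the SQL text here stays the same no matter how many values are passed; only the bound array changes.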
Read more in the documentation.