Is there a more efficient way of writing this query?
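To cut down on the number of queries I run against Postgres, I want to return as much data as possible in a single query. Here is a simple case showing my naive solution: I want to return all assets along with the IDs of every attribute related to each asset. An asset has many attributes, linked as the query below shows: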
SELECT assets.id, ARRAY(
SELECT attributes.id
FROM attributes
WHERE attributes.asset_id = assets.id
) as attributes
FROM assets
LIMIT 100;
Running this in Postgres returns a data set that looks like this:
id attributes
3017 "{8948,9386}"
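An ORM would have to run that inner query separately, which in my view is less efficient because it delegates the same task to a different part of the application. Nested queries may be ugly, but at least this way I avoid making multiple, potentially costly database calls.
But this approach has at least one big problem: returning a second column from the assets table roughly doubles the time. Returning only the id column takes 4893 ms, returning id and name takes 8819 ms, and returning three columns takes 10744 ms... it only gets worse with every column added to the query. Not really a scalable solution. I would also like to retrieve multiple relations per row, so each row could carry several of these subqueries, which will surely get more expensive still.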
Edit: As requested, I ran EXPLAIN ANALYZE on this query.
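For reference, a sketch of the kind of statement that produces plans like the ones that follow. It assumes the query was run without the LIMIT 100 (neither plan shows a Limit node, and both report about 142,000 rows) and that ta_main is on the search_path. Both points are assumptions, not something stated in the post.

```sql
-- Sketch: assumed form of the statement behind the first plan.
-- EXPLAIN ANALYZE really executes the query and reports actual timings.
EXPLAIN ANALYZE
SELECT assets.id,
       ARRAY(
           SELECT attributes.id
           FROM attributes
           WHERE attributes.asset_id = assets.id
       ) AS attributes
FROM assets;
-- The second plan would come from the same statement with assets.name
-- added to the select list.
```

The runtime reported by EXPLAIN ANALYZE covers execution on the server only; it does not include the cost of sending the rows to the client.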
Including assets.id:
"Seq Scan on assets (cost=0.00..1804514.69 rows=142204 width=4) (actual time=0.059..2306.204 rows=142178 loops=1)"
" SubPlan 1"
" -> Bitmap Heap Scan on attributes (cost=4.18..12.64 rows=4 width=4) (actual time=0.009..0.009 rows=0 loops=142178)"
" Recheck Cond: (asset_id = assets.id)"
" -> Bitmap Index Scan on attributes_asset_id_idx (cost=0.00..4.18 rows=4 width=0) (actual time=0.004..0.004 rows=0 loops=142178)"
" Index Cond: (asset_id = assets.id)"
"Total runtime: 2674.115 ms"
Including assets.id and assets.name:
"Seq Scan on assets (cost=0.00..1804514.69 rows=142204 width=20) (actual time=0.058..2330.947 rows=142178 loops=1)"
" SubPlan 1"
" -> Bitmap Heap Scan on attributes (cost=4.18..12.64 rows=4 width=4) (actual time=0.009..0.009 rows=0 loops=142178)"
" Recheck Cond: (asset_id = assets.id)"
" -> Bitmap Index Scan on attributes_asset_id_idx (cost=0.00..4.18 rows=4 width=0) (actual time=0.004..0.004 rows=0 loops=142178)"
" Index Cond: (asset_id = assets.id)"
"Total runtime: 2693.455 ms"
Edit 2: As requested, here is a (lightly obfuscated) version of the schema for the tables in question:
-- Table: ta_main.assets
-- DROP TABLE ta_main.assets;
CREATE TABLE ta_main.assets
(
id serial NOT NULL,
name text NOT NULL,
field1 double precision NOT NULL DEFAULT 0,
field2 double precision NOT NULL DEFAULT 0,
field3 smallint NOT NULL DEFAULT 3,
field4 date,
field5 json,
field6 text,
field7 text,
field8 text,
field9 date,
field10 date,
field11 date,
field12 text,
field13 boolean NOT NULL DEFAULT false,
field14 boolean NOT NULL DEFAULT false,
field15 boolean NOT NULL DEFAULT false,
field16 boolean NOT NULL DEFAULT false,
field17 boolean NOT NULL DEFAULT false,
field18 boolean NOT NULL DEFAULT false,
field19 double precision NOT NULL DEFAULT 0,
field20 date,
field21 date,
field22 date,
field23 text,
field24 double precision NOT NULL DEFAULT 0,
field25 double precision NOT NULL DEFAULT 0,
field26 integer NOT NULL DEFAULT 0,
field27 integer NOT NULL DEFAULT 0,
field28 double precision NOT NULL DEFAULT 0,
field29 boolean NOT NULL DEFAULT false,
field30 integer NOT NULL DEFAULT 0,
field31 double precision NOT NULL DEFAULT 0,
field32 double precision NOT NULL DEFAULT 0,
field33 double precision NOT NULL DEFAULT 0,
field34 double precision NOT NULL DEFAULT 0,
field35 date,
field36 integer NOT NULL DEFAULT 0,
field37 integer NOT NULL DEFAULT 0,
field38 integer NOT NULL DEFAULT 0,
field39 integer NOT NULL DEFAULT 0,
field40 json,
field41 double precision NOT NULL DEFAULT 0,
field42 boolean NOT NULL DEFAULT false,
field43 double precision NOT NULL DEFAULT 0,
field44 double precision NOT NULL DEFAULT 0,
field46 double precision NOT NULL DEFAULT 0,
field47 double precision NOT NULL DEFAULT 0,
field48 text,
field49 date,
field50 text,
field51 text,
field52 date,
field53 double precision NOT NULL DEFAULT 0,
field54 integer,
field55 boolean NOT NULL DEFAULT false,
field56 boolean NOT NULL DEFAULT true,
created timestamp with time zone NOT NULL DEFAULT now(),
updated timestamp with time zone,
_deleted boolean NOT NULL DEFAULT false,
CONSTRAINT asset_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE ta_main.assets
OWNER TO postgres;
-- Index: ta_main.assets_deleted_idx
-- DROP INDEX ta_main.assets_deleted_idx;
CREATE INDEX assets_deleted_idx
ON ta_main.assets
USING btree
(_deleted);
-- Trigger: update_timestamp_assets on ta_main.assets
-- DROP TRIGGER update_timestamp_assets ON ta_main.assets;
CREATE TRIGGER update_timestamp_assets
BEFORE UPDATE
ON ta_main.assets
FOR EACH ROW
EXECUTE PROCEDURE global.update_timestamp();
The above is for assets; the attributes table follows:
-- Table: ta_main.attributes
-- DROP TABLE ta_main.attributes;
CREATE TABLE ta_main.attributes
(
id serial NOT NULL,
asset_id integer NOT NULL,
related_id integer,
type smallint NOT NULL,
description text,
flag smallint NOT NULL DEFAULT 0,
value double precision NOT NULL DEFAULT 0,
nbv double precision NOT NULL DEFAULT 0,
acc double precision NOT NULL DEFAULT 0,
eul integer NOT NULL DEFAULT 0,
quantity double precision NOT NULL DEFAULT 0,
quantity_extra double precision NOT NULL DEFAULT 0,
added_by text,
is_import boolean NOT NULL DEFAULT false,
is_donated boolean NOT NULL DEFAULT false,
created timestamp with time zone NOT NULL DEFAULT now(),
updated timestamp with time zone,
_deleted boolean NOT NULL DEFAULT false,
CONSTRAINT adjustment_pkey PRIMARY KEY (id),
CONSTRAINT asset_fkey FOREIGN KEY (asset_id)
REFERENCES ta_main.assets (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT related_fkey FOREIGN KEY (related_id)
REFERENCES ta_main.attributes (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
ALTER TABLE ta_main.attributes
OWNER TO postgres;
-- Index: ta_main.attributes_deleted_idx
-- DROP INDEX ta_main.attributes_deleted_idx;
CREATE INDEX attributes_deleted_idx
ON ta_main.attributes
USING btree
(_deleted);
-- Index: ta_main.fki_asset_fkey
-- DROP INDEX ta_main.fki_asset_fkey;
CREATE INDEX fki_asset_fkey
ON ta_main.attributes
USING btree
(asset_id);
-- Index: ta_main.fki_related_fkey
-- DROP INDEX ta_main.fki_related_fkey;
CREATE INDEX fki_related_fkey
ON ta_main.attributes
USING btree
(related_id);
-- Trigger: update_timestamp_attributes on ta_main.attributes
-- DROP TRIGGER update_timestamp_attributes ON ta_main.attributes;
CREATE TRIGGER update_timestamp_attributes
BEFORE UPDATE
ON ta_main.attributes
FOR EACH ROW
EXECUTE PROCEDURE global.update_timestamp();
Can you try this? It appears to eliminate the inner query.
SELECT assets.id, ARRAY_AGG(attributes.id)
FROM assets
LEFT OUTER JOIN attributes ON assets.id = attributes.asset_id
GROUP BY assets.id;
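If more columns from assets are needed, something along these lines may be worth trying. This is a sketch, not a measured result. It relies on assets.id being the primary key, which lets PostgreSQL 9.1 and later accept other assets columns in the select list without naming them in GROUP BY. The ORDER BY is an addition of mine so that LIMIT returns a deterministic set, and the second statement is an alternative that aggregates attributes once and then joins the result; comparing the two with EXPLAIN ANALYZE on the real data would show which one suits it better.

```sql
-- Variant 1: join, then group by the primary key of assets.
SELECT assets.id,
       assets.name,                           -- functionally dependent on assets.id
       array_agg(attributes.id) AS attributes
FROM assets
LEFT OUTER JOIN attributes ON attributes.asset_id = assets.id
GROUP BY assets.id
ORDER BY assets.id                            -- assumption: any stable order will do
LIMIT 100;

-- Variant 2: aggregate attributes first, then join the aggregated rows.
SELECT assets.id,
       assets.name,
       attr.ids AS attributes
FROM assets
LEFT OUTER JOIN (
    SELECT asset_id, array_agg(id) AS ids
    FROM attributes
    GROUP BY asset_id
) AS attr ON attr.asset_id = assets.id
ORDER BY assets.id
LIMIT 100;
```

One difference from the original ARRAY(...) subquery: for an asset with no attributes, variant 1 returns {NULL} and variant 2 returns NULL, whereas the subquery returns an empty array {}.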