How to search a table recursively by a given path for a tree-like structure in PostgreSQL 10

I have a table like this:

+-----------------------------------+
| id | client_id | main_id |  name  |
|-----------------------------------|
| 1  | 1         | NULL    | hello  |
| 2  | 1         | 1       | hello2 |
| 3  | 1         | 2       | hello3 |
| 4  | 2         | NULL    | hello  |
| 5  | 2         | 4       | hello2 |
| 6  | 2         | 5       | hello3 |
+-----------------------------------+

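I want to get id: 3 by giving the path /hello/hello2/hello3 and a client_id, because hello3 belongs to hello2 and hello2 belongs to hello. When I give the full path, I want to return the id of the last path element.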

Here is my table schema:

CREATE TABLE "public"."paths" (
  "id" serial8 PRIMARY KEY, -- the self-referencing foreign key below needs a unique "id"
  "client_id" int8,
  "main_id" int8,
  "name" varchar(255) NOT NULL,
  FOREIGN KEY ("main_id") REFERENCES "public"."paths" ("id")
)
;
-- Index on client_id.
CREATE INDEX "cid" ON "public"."paths" USING btree (
  "client_id"
);

So far, I have tried this using recursion:

WITH RECURSIVE full_paths AS
(SELECT id, name, main_id, CAST(name As varchar(1000)) As fname
FROM paths
WHERE client_id = 1
UNION ALL
SELECT x.id, x.name, x.main_id, CAST(y.fname || '/' || x.name As varchar(1000)) As fname
FROM paths As x
    INNER JOIN full_paths AS y ON (x.main_id = y.id)
)
SELECT id, fname FROM full_paths WHERE fname = '/home/home2/home3';

But my table has a million records, and this query slows the request down because it scans the whole table.

See also the EXPLAIN output below:

CTE Scan on full_paths  (cost=4383987797.32..7008489047.29 rows=583222500 width=40) (actual time=1254.573..1675.192 rows=1 loops=1)
  Filter: (fname = '/home/home2/home3'::text)
  Rows Removed by Filter: 482943
  Buffers: shared hit=23754, temp read=8510 written=13548
  CTE full_paths
    ->  Recursive Union  (cost=0.00..4383987797.32 rows=116644499999 width=61) (actual time=0.015..1476.644 rows=482944 loops=1)
          Buffers: shared hit=23754, temp read=8510 written=10261
          ->  Seq Scan on paths  (cost=0.00..13955.49 rows=482999 width=42) (actual time=0.013..127.433 rows=482943 loops=1)
                Filter: (client_id = 24)
                Rows Removed by Filter: 3
                Buffers: shared hit=7918
          ->  Merge Join  (cost=966864.46..205108384.18 rows=11664401700 width=61) (actual time=600.989..600.990 rows=0 loops=2)
                Merge Cond: (x.main_id = y.id)
                Buffers: shared hit=15836, temp read=8510 written=6974
                ->  Sort  (cost=69904.11..71111.60 rows=482999 width=29) (actual time=276.900..360.597 rows=482946 loops=2)
                      Sort Key: x.main_id
                      Sort Method: external sort  Disk: 19848kB
                      Buffers: shared hit=15836, temp read=4962 written=4962
                      ->  Seq Scan on paths x  (cost=0.00..12747.99 rows=482999 width=29) (actual time=0.010..106.355 rows=482946 loops=2)
                            Buffers: shared hit=15836
                ->  Materialize  (cost=896960.36..921110.31 rows=4829990 width=40) (actual time=192.873..192.876 rows=3 loops=2)
                      Buffers: temp read=3548 written=2012
                      ->  Sort  (cost=896960.36..909035.33 rows=4829990 width=40) (actual time=191.121..191.122 rows=3 loops=2)
                            Sort Key: y.id
                            Sort Method: quicksort  Memory: 25kB
                            Buffers: temp read=3548 written=2012
                            ->  WorkTable Scan on full_paths y  (cost=0.00..96599.80 rows=4829990 width=40) (actual time=0.012..44.830 rows=241472 loops=2)
                                  Buffers: temp read=3289 written=1
Planning time: 0.261 ms
Execution time: 1685.199 ms

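How can I write a correct, efficient, and fast query? Do I need to write a function? (I don't know how, so I would be glad if you provided an example function.)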

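You should filter the visited rows by the appropriate part (name) of the desired path. Add an auxiliary query (pattern) that converts the input path to an array, and use the elements of that array to eliminate unnecessary rows at each level of the recursion.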

with recursive pattern(pattern) as (
    select string_to_array('hello/hello2/hello3', '/') -- input
),
full_paths as (
    -- start only from rows whose name matches the first path element
    select id, main_id, name, 1 as idx
    from paths
    cross join pattern
    where client_id = 1 and name = pattern[1]
union all
    -- descend one level, keeping only children that match the next element
    select x.id, x.main_id, x.name, idx + 1
    from paths as x
    cross join pattern
    inner join full_paths as y
        on x.main_id = y.id
        and x.name = pattern[idx + 1]
)
-- keep only rows that matched the whole path
select id, name
from full_paths
cross join pattern
where idx = cardinality(pattern);

Working example in rextester.
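
Since the question also asks for a function, here is a minimal sketch that wraps the same idea in an SQL function. The function name find_path_id, its parameter names, the CTE name input(parts), and the index name paths_main_id_name_idx are all made up for illustration. It also makes two assumptions that are not in the query above: a full path always starts at a root row (main_id IS NULL), and the caller may pass the path with or without a leading slash. The extra index on (main_id, name) is a suggestion so that each recursive step can look up children by index rather than scanning the table; check it with EXPLAIN on your own data.

```sql
-- Hypothetical index name; supports the lookup done in each recursive step.
CREATE INDEX paths_main_id_name_idx ON paths (main_id, name);

-- Hypothetical helper: returns the id of the last path element,
-- or NULL when the path does not exist for the given client.
CREATE FUNCTION find_path_id(p_client_id bigint, p_path text)
RETURNS bigint
LANGUAGE sql
STABLE
AS $$
    WITH RECURSIVE input(parts) AS (
        -- trim() removes leading/trailing slashes so that
        -- '/hello/hello2' and 'hello/hello2' are split the same way
        SELECT string_to_array(trim(both '/' from p_path), '/')
    ),
    full_paths AS (
        SELECT p.id, 1 AS idx
        FROM paths AS p
        CROSS JOIN input
        WHERE p.client_id = p_client_id
          AND p.main_id IS NULL      -- assumption: the path starts at a root row
          AND p.name = parts[1]
        UNION ALL
        SELECT x.id, y.idx + 1
        FROM full_paths AS y
        CROSS JOIN input
        JOIN paths AS x
          ON x.main_id = y.id
         AND x.name = parts[y.idx + 1]
    )
    SELECT f.id
    FROM full_paths AS f
    CROSS JOIN input
    WHERE f.idx = cardinality(parts)
    LIMIT 1;
$$;

-- Usage with the sample rows from the question:
SELECT find_path_id(1, '/hello/hello2/hello3');
```

With the six sample rows shown in the question, this call should return 3, and the same path with client_id 2 should return 6. If sub-paths that do not start at a root should also match, drop the main_id IS NULL condition.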