WITH RECURSIVE query to choose the longest paths

I am new to WITH RECURSIVE in PostgreSQL. I have a reasonably standard recursive query that follows an adjacency list. If I have, for example:

1 -> 2
2 -> 3
3 -> 4
3 -> 5
5 -> 6

it produces:

1
1,2
1,2,3
1,2,3,4
1,2,3,5
1,2,3,5,6

What I want is:

1,2,3,4
1,2,3,5,6

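But I cannot see how to do this in Postgres. It seems to be a case of "choose the longest paths" or "choose the paths that are not contained in another path". I can probably see how to do it by joining the result to itself, but that seems quite inefficient.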

An example query is:

WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
   SELECT g.id, g.link, g.data, 1, ARRAY[g.id], false
   FROM graph g
  UNION ALL
   SELECT g.id, g.link, g.data, sg.depth + 1, path || g.id, g.id = ANY(path)
   FROM graph g, search_graph sg
   WHERE g.id = sg.link AND NOT cycle
)
SELECT * FROM search_graph;

Just add an extra clause to the final query, like this:

WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
   SELECT g.id, g.link, g.data, 1, ARRAY[g.id], false
    FROM graph g
    -- BTW: you should add a START-CONDITION here, like:
    -- WHERE g.id = 1
    -- or even (to find ALL linked lists):
    -- WHERE NOT EXISTS ( SELECT 13
          -- FROM graph nx
          -- WHERE nx.link = g.id
          -- )
  UNION ALL
     SELECT g.id, g.link, g.data, sg.depth + 1, path || g.id, g.id = ANY(path)
    FROM graph g, search_graph sg
    WHERE g.id = sg.link AND NOT cycle
)
SELECT * FROM search_graph sg
WHERE NOT EXISTS ( -- <<-- extra condition
   SELECT 42 FROM graph nx
   WHERE nx.id = sg.link
    );

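Notes:

  • The NOT EXISTS (...) clause tries to join exactly the same record that the second leg of the recursive union would join.
  • So the two are mutually exclusive.
  • If such a record existed, the recursive query would already have appended it to the "list", which means the row is not the end of a path.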
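To try this on the example from the question, here is a minimal self-contained sketch. The question does not show the table definition, so the layout is an assumption: graph(id, link, data) holds one row per edge, plus one row with link IS NULL for each terminal node (4 and 6). The sample rows, the data values and the start condition id = 1 are made up for illustration, and the table is replaced by a VALUES CTE so nothing has to be created.

```sql
WITH RECURSIVE graph(id, link, data) AS (
    -- Assumed sample rows: the edges from the question, plus one row with
    -- link IS NULL for each terminal node (4 and 6).
    VALUES (1, 2,         'a'),
           (2, 3,         'b'),
           (3, 4,         'c'),
           (3, 5,         'd'),
           (4, NULL::int, 'e'),
           (5, 6,         'f'),
           (6, NULL::int, 'g')
),
search_graph(id, link, data, depth, path, cycle) AS (
    SELECT g.id, g.link, g.data, 1, ARRAY[g.id], false
    FROM   graph g
    WHERE  g.id = 1                       -- start condition (assumed)
  UNION ALL
    SELECT g.id, g.link, g.data, sg.depth + 1, path || g.id, g.id = ANY(path)
    FROM   graph g, search_graph sg
    WHERE  g.id = sg.link AND NOT cycle
)
SELECT path
FROM   search_graph sg
WHERE  NOT EXISTS (                       -- keep only rows the recursion could not extend
    SELECT 1 FROM graph nx
    WHERE  nx.id = sg.link
)
ORDER  BY path;
```

With these rows the query should return the two paths {1,2,3,4} and {1,2,3,5,6}. Note that this depends on the terminal rows being present: if the table held only the five edges, the last node of each path would never be appended to path.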

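You already have the solution at your fingertips with cycle; just add a predicate at the end.

But adjust your break condition by one level. Currently you are appending one node too many: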

WITH RECURSIVE search AS (
   SELECT id, link, data, ARRAY[g.id] AS path, (link = id) AS cycle
   FROM   graph g
   WHERE  NOT EXISTS (
      SELECT 1
      FROM   graph
      WHERE  link = g.id
      )

   UNION ALL
   SELECT g.id, g.link, g.data, s.path || g.id, g.link = ANY(s.path)
   FROM   search s
   JOIN   graph g ON g.id = s.link
   WHERE  NOT s.cycle
   )
SELECT *
FROM   search
WHERE  cycle;
-- WHERE cycle IS NOT FALSE;  -- alternative if link can be NULL
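
  • This also includes a start condition (the NOT EXISTS in the non-recursive term), so paths only begin at nodes that nothing links to.
  • The initial condition for cycle is (link = id) to catch shortcut cycles, i.e. a row that links to itself. This is not necessary if you have a CHECK constraint that disallows that in the table.
  • This assumes that all graphs are terminated by a cycle or by link IS NULL, and that there is an FK constraint from link to id in the same table. The exact implementation depends on missing details. If link is not actually a link (no referential integrity), you need to adapt ...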
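To illustrate the difference between WHERE cycle and WHERE cycle IS NOT FALSE, here is a small sketch of the same query on made-up data. The rows are assumptions chosen for illustration: one chain that ends with link IS NULL (1, 2, 3) and one chain that runs into a loop (10, 11, 12, back to 11). A VALUES CTE stands in for the table, so the FK constraint is not modelled here.

```sql
WITH RECURSIVE graph(id, link, data) AS (
   -- Assumed sample rows, not taken from the question.
   VALUES (1,  2,         'chain')
        , (2,  3,         'chain')
        , (3,  NULL::int, 'chain end')
        , (10, 11,        'loop')
        , (11, 12,        'loop')
        , (12, 11,        'links back to 11')
   )
, search AS (
   SELECT id, link, data, ARRAY[g.id] AS path, (link = id) AS cycle
   FROM   graph g
   WHERE  NOT EXISTS (SELECT 1 FROM graph WHERE link = g.id)  -- start nodes only

   UNION ALL
   SELECT g.id, g.link, g.data, s.path || g.id, g.link = ANY(s.path)
   FROM   search s
   JOIN   graph g ON g.id = s.link
   WHERE  NOT s.cycle                     -- also stops when cycle is NULL
   )
SELECT path, cycle
FROM   search
WHERE  cycle IS NOT FALSE                 -- true (loop) or NULL (link IS NULL)
ORDER  BY path;
```

With these rows I would expect two result rows: {1,2,3} with cycle NULL, because the last link is NULL, and {10,11,12} with cycle true. With the plain WHERE cycle only the second row remains, which is why the IS NOT FALSE variant matters as soon as link can be NULL.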

I am not sure whether this should count as an ugly join solution.

WITH RECURSIVE graph (child, parent) AS (
    SELECT 2, 1
    UNION
    SELECT 3, 2
    UNION
    SELECT 4, 2
    UNION
    SELECT 6, 5
    UNION
    SELECT 7, 6
    UNION
    SELECT 6, 7
),
paths (start, node, depth, path, has_cycle, terminated) AS (
    SELECT
        g1.parent,          -- start: a root, i.e. a node that is nobody's child
        g1.parent,          -- node: the node the path currently ends at
        1,                  -- depth
        ARRAY[g1.parent],
        false,
        false
    FROM graph g1
    WHERE true
        AND NOT EXISTS (SELECT 1 FROM graph g2 WHERE g1.parent = g2.child)
    UNION ALL
    SELECT
        p.start,
        g.child,
        p.depth + 1,
        p.path || g.child,
        g.child = ANY(p.path),
        g.parent IS NULL AS terminated  -- the LEFT JOIN found no child: the path ends here
    FROM paths p
    LEFT OUTER JOIN graph g ON g.parent = p.node
    WHERE NOT has_cycle
)
SELECT * FROM paths WHERE terminated
;

So the trick is to derive a terminated column from the LEFT OUTER JOIN, and then select only the terminated paths.
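As a rough sketch of how this idea could be applied to the edges from the question (1 -> 2, 2 -> 3, 3 -> 4, 3 -> 5, 5 -> 6), here is a trimmed-down variant. The (child, parent) rows are my transcription of those edges, the column list is shortened, and two details are my own additions rather than part of the answer above: DISTINCT in the first leg, so a root with several children starts only one path, and array_remove(path, NULL), because the row produced by the unmatched LEFT JOIN appends a NULL element to path.

```sql
WITH RECURSIVE graph (child, parent) AS (
    -- Edges from the question, written as (child, parent) pairs.
    VALUES (2, 1), (3, 2), (4, 3), (5, 3), (6, 5)
),
paths (node, path, has_cycle, terminated) AS (
    SELECT DISTINCT g1.parent, ARRAY[g1.parent], false, false
    FROM   graph g1
    WHERE  NOT EXISTS (SELECT 1 FROM graph g2 WHERE g2.child = g1.parent)
    UNION ALL
    SELECT g.child,
           p.path || g.child,            -- appends NULL when no child was found
           g.child = ANY(p.path),        -- NULL for that row, which also stops the recursion
           g.parent IS NULL              -- true only when the LEFT JOIN found no child
    FROM   paths p
    LEFT   JOIN graph g ON g.parent = p.node
    WHERE  NOT p.has_cycle
)
SELECT array_remove(path, NULL) AS path  -- drop the trailing NULL marker
FROM   paths
WHERE  terminated
ORDER  BY 1;
```

On these rows I would expect exactly the two paths asked for, {1,2,3,4} and {1,2,3,5,6}. Unlike the NOT EXISTS filter on the final query, this variant does not need extra rows with a NULL link for the terminal nodes, because the unmatched LEFT JOIN row plays that role.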