在 T-SQL 中具体化嵌套集层次结构的路径

Materializing the path of Nested Set hierarchy in T-SQL

我有一个 table 包含我公司会计科目表的详细信息 - 此数据基本上存储在嵌套集中(在 SQL Server 2014 上),每个记录都有一个左右锚点- 没有 Parent 个 ID。

示例数据:

ID  LeftAnchor  RightAnchor  Name     
 1           0           25  Root     
 2           1           16  Group 1  
 3           2            9  Group 1.1
 4           3            4  Account 1
 5           5            6  Account 2
 6           7            8  Account 3
 7          10           15  Group 1.2
 8          11           12  Account 4
 9          13           14  Account 5
10          17           24  Group 2  
11          18           23  Group 2.1
12          19           20  Account 1
13          21           22  Account 1

我需要具体化每条记录的路径,以便我的输出如下所示:

ID  LeftAnchor  RightAnchor  Name       MaterializedPath
 1           0           25  Root       Root
 2           1           16  Group 1    Root > Group 1
 3           2            9  Group 1.1  Root > Group 1 > Group 1.1
 4           3            4  Account 1  Root > Group 1 > Group 1.1 > Account 1
 5           5            6  Account 2  Root > Group 1 > Group 1.1 > Account 2
 6           7            8  Account 3  Root > Group 1 > Group 1.1 > Account 3
 7          10           15  Group 1.2  Root > Group 1 > Group 1.2
 8          11           12  Account 4  Root > Group 1 > Group 1.2 > Acount 4
 9          13           14  Account 5  Root > Group 1 > Group 1.2 > Account 5
10          17           24  Group 2    Root > Group 2
11          18           23  Group 2.1  Root > Group 2 > Group 2.1
12          19           20  Account 1  Root > Group 2 > Group 2.1 > Account 10
13          21           22  Account 1  Root > Group 2 > Group 2.1 > Account 11

虽然我已经设法使用 CTE 实现了这一点,但查询速度非常慢。 运行 只需不到两分钟,输出中就有大约 1200 条记录。

这是我的代码的简化版本:

;with accounts as
(
    -- Chart of Accounts
    select AccountId, LeftAnchor, RightAnchor, Name
    from ChartOfAccounts
    -- dirty great where clause snipped
)
, parents as
(
    -- Work out the Parent Nodes
    select c.AccountId, p.AccountId [ParentId]
    from accounts c
    left join accounts p on (p.LeftAnchor = (
        select max(i.LeftAnchor)
        from accounts i
        where i.LeftAnchor<c.LeftAnchor
        and i.RightAnchor>c.RightAnchor
    ))
)
, path as
(
    -- Calculate the Account path for each node

    -- Root Node
    select c.AccountId, c.LeftAnchor, c.RightAnchor, 0 [Level], convert(varchar(max), c.name) [MaterializedPath]
    from accounts c
    where c.LeftAnchor = (select min(LeftAnchor) from chart)

    union all

    -- Children
    select n.AccountId, n.LeftAnchor, n.RightAnchor, p.level+1, p.path + ' > ' + n.name
    from accounts n
    inner join parents x on (n.AccountId=x.AccountId)
    inner join path p on (x.ParentId=p.AccountId)
)
select * from path order by LeftAnchor

理想情况下,此查询只需要几秒钟(最多)到 运行。我无法对数据库本身(read-only 连接)进行任何更改,所以谁能想出更好的方法来编写此查询?

我觉得您没有父 ID 有点奇怪,但是借助初始 OUTER APPLY,我们可以生成一个父 ID,然后 运行 一个标准的递归 CTE。

例子

Declare @Top int = null  --<<  Sets top of Hier Try 12 (Just for Fun)

;with cte0 as (
                Select A.* 
                      ,B.*
                 From  YourTable A
                 Outer Apply ( 
                                Select Top 1 Pt=ID 
                                 From  YourTable 
                                 Where A.LeftAnchor between LeftAnchor and RightAnchor and LeftAnchor<A.LeftAnchor 
                                 Order By LeftAnchor Desc
                              ) B
              ) 
     ,cteP as (
                  Select ID
                        ,Pt
                        ,LeftAnchor
                        ,RightAnchor
                        ,Lvl=1
                        ,Name 
                        ,Path = cast(Name as varchar(max))
                  From   cte0
                  Where  IsNull(@Top,-1) = case when @Top is null then isnull(Pt ,-1) else ID end
                  Union  All
                  Select r.ID
                        ,r.Pt
                        ,r.LeftAnchor
                        ,r.RightAnchor
                        ,p.Lvl+1
                        ,r.Name 
                        ,cast(p.path + ' > '+r.Name as varchar(max))
                  From   cte0 r
                  Join   cteP p on r.Pt  = p.ID
              )
Select *
 From  cteP
 Order By LeftAnchor

Returns

首先,您可以尝试重新安排准备中的 CTE(帐户和 parents),使每个 CTE 都包含以前的所有数据,因此您只使用路径 CTE 中的最后一个 - 不需要多个加入:

;with accounts as
(
    -- Chart of Accounts
    select AccountId, LeftAnchor, RightAnchor, Name
    from ChartOfAccounts
    -- dirty great where clause snipped
)
, parents as
(
    -- Work out the Parent Nodes
    select c.*, p.AccountId [ParentId]
    from accounts c
    left join accounts p on (p.LeftAnchor = (
        select max(i.LeftAnchor)
        from accounts i
        where i.LeftAnchor<c.LeftAnchor
        and i.RightAnchor>c.RightAnchor
    ))
)
, path as
(
    -- Calculate the Account path for each node

    -- Root Node
    select c.AccountId, c.LeftAnchor, c.RightAnchor, 0 [Level], convert(varchar(max), c.name) [MaterializedPath]
    from parents c
    where c.ParentID IS NULL

    union all

    -- Children
    select n.AccountId, n.LeftAnchor, n.RightAnchor, p.level+1, p.[MaterializedPath] + ' > ' + n.name
    from parents n
    inner join path p on (n.ParentId=p.AccountId)
)
select * from path order by LeftAnchor

这应该会带来一些改进(在我的测试中为 50%),但要让它变得更好,您可以将准备数据的前半部分拆分为#temp table,将聚簇索引放在 # 中的 ParentID 列上temp table 并在第二部分中使用它

if (Object_ID('tempdb..#tmp') IS NOT NULL) DROP TABLE #tmp;
with accounts as
(
    -- Chart of Accounts
    select AccountId, LeftAnchor, RightAnchor, Name
    from ChartOfAccounts
    -- dirty great where clause snipped
)
, parents as
(
    -- Work out the Parent Nodes
    select c.*, p.AccountId [ParentId]
    from accounts c
    left join accounts p on (p.LeftAnchor = (
        select max(i.LeftAnchor)
        from accounts i
        where i.LeftAnchor<c.LeftAnchor
        and i.RightAnchor>c.RightAnchor
    ))
)
select * into #tmp
from parents;

CREATE CLUSTERED INDEX IX_tmp1 ON #tmp (ParentID);

With path as
(
    -- Calculate the Account path for each node

    -- Root Node
    select c.AccountId, c.LeftAnchor, c.RightAnchor, 0 [Level], convert(varchar(max), c.name) [MaterializedPath]
    from #tmp c
    where c.ParentID IS NULL

    union all

    -- Children
    select n.AccountId, n.LeftAnchor, n.RightAnchor, p.level+1, p.[MaterializedPath] + ' > ' + n.name
    from #tmp n
    inner join path p on (n.ParentId=p.AccountId)
)
select * from path order by LeftAnchor

小样本数据很难说,但应该是一个改进。试过请告知

在你的评论之后,我意识到不需要 CTE...你已经有了范围键。

例子

Select A.*
      ,Path = Replace(Path,'&gt;','>')
 From YourTable A
 Cross Apply (
                Select Path = Stuff((Select ' > ' +Name 
                                      From (
                                            Select LeftAnchor,Name
                                             From  YourTable
                                             Where A.LeftAnchor between LeftAnchor and RightAnchor 
                                           ) B1
                                      Order By LeftAnchor
                                      For XML Path (''))
                                    ,1,6,'')
             ) B
 Order By LeftAnchor

Returns