Convert a hierarchy defined by position to a SQL-defined hierarchy ID

Question

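I have a lot of data from a legacy system that defines bill-of-materials (BoM) data by its position in the table. The BoM data table from the legacy system looks like this: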

    ID  level   ItemNumber
    1   1   TopItem
    2   .2  FirstChildOfTop
    3   .2  2ndChildofTop
    4   .2  3ChildOfTop
    5   ..3 1stChildof3ChildofTop
    6   ..3 2ndChildof3ChildofTop
    7   .2  4thChildofTop
    8   ..3 1stChildof4ChildTop
    9   ...4    1stChildof4ChildTop
    10  ..3 2ndChildof4ChildofTop
    11  .2  5thChildofTop
    12  ..3 1stChildof5thChildofTop
    13  ...4    1stChildof1stChildof5thChildofTop
    14  ..3 2ndChildof5thChildofTop
    15  1   2ndTopItem
    16  1   3rdTopItem

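In my example the IDs are consecutive. In the real data the IDs may have gaps, but they always run from low to high, because that ordering is what defines the hierarchy.

By using some simple code to replace the level number with tabs, we get a visual hierarchy: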

    1    TopItem
    2        FirstChildOfTop
    3        2ndChildofTop
    4        3ChildOfTop
    5            1stChildof3ChildofTo
    6            2ndChildof3ChildofTo
    7        4thChildofTop
    8            1stChildof4ChildTop
    9                1stChildof4ChildTop
    10           2ndChildof4ChildofTo
    11       5thChildofTop
    12           1stChildof5thChildof
    13               1stChildof1stChildof
    14           2ndChildof5thChildof
    15   2ndTopItem
    16   3rdTopItem
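
The tab replacement described above might look something like the following. This is only a sketch: the table variable `@Bom` and its handful of rows are hypothetical stand-ins for one legacy list, and it assumes the level is always a run of dots followed by a number.

```sql
-- Hypothetical sample: a few rows in the legacy positional format.
DECLARE @Bom TABLE (ID int PRIMARY KEY, [level] varchar(20), ItemNumber varchar(50));

INSERT INTO @Bom (ID, [level], ItemNumber)
VALUES (1,  '1',   'TopItem'),
       (2,  '.2',  'FirstChildOfTop'),
       (4,  '.2',  '3ChildOfTop'),
       (5,  '..3', '1stChildof3ChildofTop'),
       (15, '1',   '2ndTopItem');

-- Strip the leading dots to get the numeric level, then indent with
-- one tab (CHAR(9)) for every level below the top.
SELECT ID,
       REPLICATE(CHAR(9), CAST(REPLACE([level], '.', '') AS int) - 1) + ItemNumber AS Indented
FROM @Bom
ORDER BY ID;
```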

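Because I have about 5,000 of these lists, each between 25 and 55,000 rows long, I need some code to convert this hierarchy to use the SQL hierarchyid, so that we can query the lists at any level. I hope my explanation shows that, at the moment, you have to walk the list from the top down to find out whether an item is at level 2, level 3, or some other level, and whether it has any children. The items in the third column exist in a simple Item Master table, but their role in the BoM is defined only in these tables.

I would offer some code, but all my attempts at the conversion have failed miserably. I would claim that I can write set-based queries.

The target is Microsoft SQL Server 2014. The main aim is to store the data in a data warehouse while still letting people find sub-components and where-used information.

Edit: To answer Anthony Hancock's very pertinent question, I have done some work. Please consider the following: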

    ID  level  ItemNumber                         sampH    lft  rgt
    1   1      TopItem                            1/2      2    28
    2   .2     FirstChildOfTop                    1/2/3    3    4
    3   .2     2ndChildofTop                      1/2/3    5    6
    4   .2     3ChildOfTop                        1/2/3    7    11
    5   ..3    1stChildof3ChildofTop              2/3/4    8    9
    6   ..3    2ndChildof3ChildofTop              2/3/4    10   11
    7   .2     4thChildofTop                      1/2/3    13   20
    8   ..3    1stChildof4ChildTop                2/3/4    14   17
    9   ...4   1stChildof4ChildTop                3/4/5    15   16
    10  ..3    2ndChildof4ChildofTop              2/3/4    18   19
    11  .2     5thChildofTop                      1/2/3/   20   25
    12  ..3    1stChildof5thChildofTop            2/3/4    21   24
    13  ...4   1stChildof1stChildof5thChildofTop  3/4/5    22   23
    14  ..3    2ndChildof5thChildofTop            2/3/4    26   27
    15  1      2ndTopItem                         1/2      2    28
    16  1      3rdTopItem                         1/2      2    28
    17  0      verytop                            1/       1    29

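Apologies for the poor formatting.
1) I added row 17 for the item we are making, i.e. this BoM makes the 'verytop' item, so I renumbered 'level'.
2) In the 'sampH' column I added my hand-edited path-enumerated tree values.
3) In the 'lft' and 'rgt' columns I added some identifiers for nested-sets data.
Please forgive me if my hand-edited columns are incorrect.

My goal is to get a structure that lets someone query these many deep lists to find where an item sits in the tree and what its children are. So I am open to whatever works.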

My testing of nested sets so far shows that I can do something like this:

-- Children of a given parent ItemNumber

Select c.itemnumber, ' is child of 2ndTopItem'
from [dbo].[Sample] as p, [dbo].[Sample] as c
where (c.lft between p.lft and p.rgt)
and (c.lft <> p.lft)
and p.ItemNumber = '2ndTopItem'
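
For what it is worth, 'lft' and 'rgt' values do not have to be edited by hand. The sketch below shows one way they might be derived from the positional list. It assumes the level has already been converted to an int, that top-level items are level 1, and that no row goes more than one level deeper than the row before it. The table variable `@Bom` and the column names `Lvl`, `n` and `descendants` are hypothetical. The idea is that for the n-th row in list order, lft = 2n - 1 - depth, and rgt = lft + 2 * (number of descendants) + 1.

```sql
-- Hypothetical sample with the level already converted to an int.
DECLARE @Bom TABLE (ID int PRIMARY KEY, Lvl int, ItemNumber varchar(50));

INSERT INTO @Bom (ID, Lvl, ItemNumber)
VALUES (1,  1, 'TopItem'),
       (2,  2, 'FirstChildOfTop'),
       (4,  2, '3ChildOfTop'),
       (5,  3, '1stChildof3ChildofTop'),
       (6,  3, '2ndChildof3ChildofTop'),
       (15, 1, '2ndTopItem');

WITH Numbered AS
(
    -- n is the position in the list; IDs may have gaps, so do not use ID itself.
    SELECT ID, Lvl, ItemNumber,
           ROW_NUMBER() OVER (ORDER BY ID) AS n,
           COUNT(*) OVER () AS total
    FROM @Bom
),
Sized AS
(
    -- Descendants are the rows between this one and the next row
    -- at the same or a shallower level (or the end of the list).
    SELECT a.ID, a.Lvl, a.ItemNumber, a.n,
           ISNULL(nx.next_n, a.total + 1) - a.n - 1 AS descendants
    FROM Numbered AS a
    OUTER APPLY
    (
        SELECT MIN(b.n) AS next_n
        FROM Numbered AS b
        WHERE b.n > a.n
          AND b.Lvl <= a.Lvl
    ) AS nx
)
SELECT ID, ItemNumber,
       2 * n - Lvl                       AS lft,  -- 2n - 1 - depth, with depth = Lvl - 1
       2 * n - Lvl + 2 * descendants + 1 AS rgt
FROM Sized
ORDER BY ID;
```

With this sample, TopItem gets (1, 10) and 3ChildOfTop gets (4, 9). The numbering starts at 1 for the first top-level item rather than under an added 'verytop' row, so it will not match the hand-edited table above.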

But I am completely open to any suggestions on how to enumerate the tree structure.

Answer 1

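Using your sample data, this creates a test table and then derives the parent ID for each row. I think this is what you want? The important caveat is that it depends entirely on your table being correctly ordered for the hierarchy, but I do not see any other option from the information provided.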

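-- DROP TABLE IF EXISTS requires SQL Server 2016+; this form also works on 2014.
IF OBJECT_ID('TEST', 'U') IS NOT NULL DROP TABLE TEST;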

CREATE TABLE TEST
(
    ID INT
    ,[Level] VARCHAR(20)
    ,ItemNumber VARCHAR(50)
)
;

INSERT INTO TEST
(ID,[Level],ItemNumber)
VALUES
(1,'1','TopItem')
,(2,'.2','FirstChildOfTop')
,(3,'.2','2ndChildofTop')
,(4,'.2','3ChildOfTop')
,(5,'..3','1stChildof3ChildofTop')
,(6,'..3','2ndChildof3ChildofTop')
,(7,'.2','4thChildofTop')
,(8,'..3','1stChildof4ChildTop')
,(9,'...4','1stChildof4ChildTop')
,(10,'..3','2ndChildof4ChildofTop')
,(11,'.2','5thChildofTop')
,(12,'..3','1stChildof5thChildofTop')
,(13,'...4','1stChildof1stChildof5thChildofTop')
,(14,'..3','2ndChildof5thChildofTop')
,(15,'1','2ndTopItem')
,(16,'1','3rdTopItem')
;

SELECT T.*          -- T.* rather than *, so ParentID is not returned twice
    ,V.ParentID
FROM TEST AS T
OUTER APPLY
(
    SELECT TOP 1 ID AS ParentID
    FROM TEST AS _T
    WHERE _T.ID < T.ID
        -- Compare as integers; as strings, '10' would sort before '2'.
        AND CAST(REPLACE(_T.[Level],'.','') AS INT) < CAST(REPLACE(T.[Level],'.','') AS INT)
    ORDER BY _T.ID DESC
) AS V
ORDER BY T.ID
;

IF OBJECT_ID('TEST', 'U') IS NOT NULL DROP TABLE TEST;
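
Since everything hinges on the ordering caveat above, it may be worth checking each list before converting it. The following is a minimal sketch of one such check, not a complete validation: it flags rows that are more than one level deeper than the row before them, which usually means a row is missing or out of order. The table variable `@Bom`, the 'OrphanedItem' row and the CTE names are hypothetical, and top-level items are assumed to be level 1. `LAG` needs SQL Server 2012 or later.

```sql
-- Hypothetical sample: row 4 jumps from level 1 straight to level 3.
DECLARE @Bom TABLE (ID int PRIMARY KEY, [Level] varchar(20), ItemNumber varchar(50));

INSERT INTO @Bom (ID, [Level], ItemNumber)
VALUES (1, '1',   'TopItem'),
       (2, '.2',  'FirstChildOfTop'),
       (3, '1',   '2ndTopItem'),
       (4, '..3', 'OrphanedItem');

WITH Parsed AS
(
    SELECT ID, ItemNumber,
           CAST(REPLACE([Level], '.', '') AS int) AS Lvl
    FROM @Bom
),
WithPrev AS
(
    SELECT ID, ItemNumber, Lvl,
           LAG(Lvl) OVER (ORDER BY ID) AS PrevLvl
    FROM Parsed
)
-- A row may be at most one level deeper than its predecessor;
-- the first row (PrevLvl is NULL) must be a top-level item.
SELECT ID, ItemNumber, Lvl, PrevLvl
FROM WithPrev
WHERE Lvl > ISNULL(PrevLvl, 0) + 1;
```

An empty result does not prove the list is right, only that the parent lookup will find some parent for every row.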

Answer 2

Try the following code:

declare @Source table (
    Id       int         ,
    [Level]  varchar(20) ,
    [Name]   varchar(50)
);

declare @Target table (
    Id        int          ,
    [Level]   int          ,
    [Name]    varchar(50)  ,
    ParentId  int          ,
    Hid       hierarchyid  ,

    primary key (Id),
    unique ([Level], Id),
    unique (ParentId, Id)
);

-- 1. The Test Data (Thanks Anthony Hancock for it)

insert into @Source
values
    (  1 , '1'    , 'TopItem'                            ),
    (  2 , '.2'   , 'FirstChildOfTop'                    ),
    (  3 , '.2'   , '2ndChildofTop'                      ),
    (  4 , '.2'   , '3ChildOfTop'                        ),
    (  5 , '..3'  , '1stChildof3ChildofTop'              ),
    (  6 , '..3'  , '2ndChildof3ChildofTop'              ),
    (  7 , '.2'   , '4thChildofTop'                      ),
    (  8 , '..3'  , '1stChildof4ChildTop'                ),
    (  9 , '...4' , '1stChildof4ChildTop'                ),
    ( 10 , '..3'  , '2ndChildof4ChildofTop'              ),
    ( 11 , '.2'   , '5thChildofTop'                      ),
    ( 12 , '..3'  , '1stChildof5thChildofTop'            ),
    ( 13 , '...4' , '1stChildof1stChildof5thChildofTop'  ),
    ( 14 , '..3'  , '2ndChildof5thChildofTop'            ),
    ( 15 , '1'    , '2ndTopItem'                         ),
    ( 16 , '1'    , '3rdTopItem'                         );


-- 2. Insert the test data into the @Target table,
--    converting the Level column to the int data type
--    so it can be used as an indexed column in query # 3
--    (once there are millions of records, that index will be highly useful)

insert into @Target (Id, [Level], [Name]) 
select
    Id, 
    [Level] = cast(replace([Level],'.','') as int),
    [Name]
from
    @Source


-- 3. Calculate the ParentId column and update the @Target table 
--    to use the ParentId as an indexed column in the query # 4

update t set
    ParentId = (
        select top 1 Id 
        from @Target as p
        where p.Id < t.Id and p.[Level] < t.[Level]
        order by p.Id desc )
from 
    @Target t;


-- 4. Calculate the Hid column 
--    based on the ParentId link and in accordance with the Id order

with Recursion as
(
    select
        Id       ,
        ParentId ,
        Hid      =  cast(
                        concat(
                            '/',
                            row_number() over (order by  Id),
                            '/'
                        ) 
                        as varchar(1000)
                    )
    from
        @Target
    where
        ParentId is null

    union all 

    select
        Id        =  t.Id       ,
        ParentId  =  t.ParentId ,
        Hid       =  cast(
                         concat(
                             r.Hid, 
                             row_number() over (partition by t.ParentId order by t.Id), 
                             '/'
                         )
                         as varchar(1000)
                     )
    from 
        Recursion r        
        inner join @Target t on t.ParentId = r.Id 
)
update t set
    Hid = r.Hid
from 
    @Target t
    inner join Recursion r  on r.Id = t.Id;


-- 5. See the result ordered by Hid

select 
    Id       ,
    [Level]  ,
    [Name]   ,
    ParentId ,
    Hid      ,
    HidPath  =  Hid.ToString()
from 
    @Target 
order by 
    Hid;
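
Once the Hid column is populated, the sub-component and where-used lookups from the question could be written along the following lines. This is a sketch only: the table variable `@Bom` is a hypothetical stand-in for the stored result, loaded with a few of the paths the code above should produce for the sample data. `Parse`, `ToString` and `IsDescendantOf` are built-in hierarchyid methods, and their names are case-sensitive.

```sql
-- Hypothetical table holding part of the converted hierarchy.
DECLARE @Bom TABLE (Id int PRIMARY KEY, [Name] varchar(50), Hid hierarchyid);

INSERT INTO @Bom (Id, [Name], Hid)
VALUES (1,  'TopItem',               hierarchyid::Parse('/1/')),
       (7,  '4thChildofTop',         hierarchyid::Parse('/1/4/')),
       (8,  '1stChildof4ChildTop',   hierarchyid::Parse('/1/4/1/')),
       (9,  '1stChildof4ChildTop',   hierarchyid::Parse('/1/4/1/1/')),
       (10, '2ndChildof4ChildofTop', hierarchyid::Parse('/1/4/2/')),
       (15, '2ndTopItem',            hierarchyid::Parse('/2/'));

-- Sub-components: everything below item 7.
-- IsDescendantOf also returns 1 for the node itself, so exclude it.
DECLARE @Parent hierarchyid = (SELECT Hid FROM @Bom WHERE Id = 7);

SELECT Id, [Name], Hid.ToString() AS HidPath
FROM @Bom
WHERE Hid.IsDescendantOf(@Parent) = 1
  AND Hid <> @Parent
ORDER BY Hid;

-- Where used: every ancestor of item 9, from the top down.
DECLARE @Item hierarchyid = (SELECT Hid FROM @Bom WHERE Id = 9);

SELECT Id, [Name], Hid.ToString() AS HidPath
FROM @Bom
WHERE @Item.IsDescendantOf(Hid) = 1
  AND Hid <> @Item
ORDER BY Hid;
```

With 5,000 separate lists, each list would presumably need its own key alongside Hid (or its own root node), since the paths restart at /1/ in every list.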

Read more in "Combination of Id-ParentId and HierarchyId Approaches to Hierarchical Data".
