How to map multiple nested parent and child ids in BigQuery

I have data generated by the following query:

with example_data as (
  select 'A' as category, 1 as child_id, 0 as parent_id, 'a' as label
  union all select 'A' as category, 2 as child_id, 1 as parent_id, 'b' as label
  union all select 'A' as category, 3 as child_id, 1 as parent_id, 'c' as label
  union all select 'A' as category, 4 as child_id, 1 as parent_id, 'd' as label
  union all select 'A' as category, 5 as child_id, 2 as parent_id, 'e' as label
  union all select 'A' as category, 6 as child_id, 2 as parent_id, 'f' as label
  union all select 'A' as category, 7 as child_id, 2 as parent_id, 'g' as label
  union all select 'A' as category, 8 as child_id, 3 as parent_id, 'h' as label
  union all select 'A' as category, 9 as child_id, 3 as parent_id, 'i' as label
  union all select 'A' as category, 10 as child_id, 3 as parent_id, 'j' as label
  union all select 'B' as category, 1 as child_id, 0 as parent_id, 'k' as label
  union all select 'B' as category, 2 as child_id, 1 as parent_id, 'l' as label
  union all select 'B' as category, 3 as child_id, 1 as parent_id, 'm' as label
  union all select 'B' as category, 4 as child_id, 1 as parent_id, 'n' as label
  union all select 'B' as category, 5 as child_id, 2 as parent_id, 'o' as label
  union all select 'B' as category, 6 as child_id, 2 as parent_id, 'p' as label
  union all select 'B' as category, 7 as child_id, 2 as parent_id, 'q' as label
  union all select 'B' as category, 8 as child_id, 3 as parent_id, 'r' as label
  union all select 'B' as category, 9 as child_id, 3 as parent_id, 's' as label
  union all select 'B' as category, 10 as child_id, 3 as parent_id, 't' as label
)

select *
from example_data

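In this table we have a number of labels, and each label has a parent_id. What I want is the result below. The first row can be read as follows: label e has label b as its parent, and label b has label a as its parent, based on the parent ids.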

category    label_1    label_2    label_3
       A          a          b          e
       A          a          b          f
       A          a          b          g
       A          a          c          h
       A          a          c          i
       A          a          c          j
       A          a          d       null
       B          k          l          o
       B          k          l          p
       B          k          l          q
       B          k          m          r
       B          k          m          s
       B          k          m          t
       B          k          n       null
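To make the parent/child reading concrete, a single level of the relationship can be looked up with a self-join. This is only a minimal sketch on a reduced sample (three rows of category A, picked for illustration); the aliases `c` and `p` are made up. Since child_id values repeat across categories in the sample data, the join matches on category as well as on the id.

```sql
-- Reduced sample: an assumption for illustration, not the full data set.
with example_data as (
  select 'A' as category, 1 as child_id, 0 as parent_id, 'a' as label
  union all select 'A', 2, 1, 'b'
  union all select 'A', 5, 2, 'e'
)
select
  c.category,
  c.label as label,
  p.label as parent_label  -- null for the root row, whose parent_id is 0
from example_data c
left join example_data p
  on p.category = c.category
 and p.child_id = c.parent_id
order by c.category, c.child_id;
-- a -> null, b -> a, e -> b
```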

I'm sure there must be a better way than my initial attempt. Hopefully this helps until someone finds it.

WITH RECURSIVE tree AS (
  SELECT category, 1 AS level, parent_id, child_id, label, [label] labels
    FROM example_data WHERE parent_id = 0
   UNION ALL
  SELECT e.category, level + 1 AS level, e.parent_id, e.child_id, e.label, ARRAY_CONCAT(labels,[e.label])
    FROM tree t JOIN example_data e ON e.category = t.category AND e.parent_id = t.child_id
),
filtered AS (
  SELECT * EXCEPT(labels, max_level, i), i + 1 AS level FROM (
    SELECT * REPLACE(IF(ARRAY_LENGTH(labels) <> max_level, ARRAY_CONCAT(labels, ['null']), labels) AS labels) FROM (
      SELECT category, labels, MAX(ARRAY_LENGTH(labels)) OVER () max_level,
        FROM tree
       WHERE child_id NOT IN (SELECT parent_id FROM example_data)
    )
  ), UNNEST(labels) label WITH OFFSET i
),
pivotted AS (
  SELECT * 
    FROM filtered
   PIVOT (ARRAY_AGG(label IGNORE NULLS) AS level FOR level IN (1, 2, 3))
)
-- Unnest the pivoted arrays (CTE pivotted) to build the final output.
SELECT category, lv1 AS level_1, lv2 AS level_2, lv3 AS level_3 
  FROM pivotted, UNNEST(level_1) lv1 WITH OFFSET
  JOIN UNNEST(level_2) lv2 WITH OFFSET USING(offset)
  JOIN UNNEST(level_3) lv3 WITH OFFSET USING(offset)
 ORDER BY 1, 2, 3, 4
;
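To see what the recursive step contributes before any filtering or pivoting, the `tree` CTE can be run on its own. The sketch below does that on a reduced sample (four rows of category A, chosen for illustration) and prints each node's path; the `path` column and the ` > ` separator are made up for readability and are not part of the query above.

```sql
-- Reduced sample: an assumption for illustration, not the full data set.
WITH RECURSIVE example_data AS (
  SELECT 'A' AS category, 1 AS child_id, 0 AS parent_id, 'a' AS label
  UNION ALL SELECT 'A', 2, 1, 'b'
  UNION ALL SELECT 'A', 4, 1, 'd'
  UNION ALL SELECT 'A', 5, 2, 'e'
),
tree AS (
  -- Anchor: root rows, recognised by parent_id = 0.
  SELECT category, child_id, 1 AS level, [label] AS labels
    FROM example_data
   WHERE parent_id = 0
   UNION ALL
  -- Recursive step: attach each child to the path of its parent.
  SELECT e.category, e.child_id, t.level + 1, ARRAY_CONCAT(t.labels, [e.label])
    FROM tree t
    JOIN example_data e
      ON e.category = t.category AND e.parent_id = t.child_id
)
SELECT category, child_id, level, ARRAY_TO_STRING(labels, ' > ') AS path
  FROM tree
 ORDER BY category, path;
-- A, 1, 1, a
-- A, 2, 2, a > b
-- A, 5, 3, a > b > e
-- A, 4, 2, a > d
```

Every node gets a row, including intermediate ones such as `a > b`, which is why the full query then keeps only rows whose child_id never appears as a parent_id.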

Consider the approach below

with recursive iterations as (
  select category, child_id, label as labels
    from example_data where parent_id = 0
   union all
  select e.category, e.child_id, concat(labels, '||', label)
    from iterations i join example_data e 
    on e.category = i.category and e.parent_id = i.child_id 
)
select * except(labels) from (
  select * from (
    select category, labels from iterations
    qualify not ifnull(starts_with(lead(labels) over(partition by category order by labels), labels || '||'), false)
  ), unnest(split(labels, '||')) label with offset
)
pivot (any_value(label) as label for offset + 1 in (1, 2, 3))
order by category, labels           
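The `qualify` line is probably the least obvious part of the query above, so here is a minimal sketch of it in isolation. The `paths` CTE and its four literal values are made up for illustration; they mimic what `iterations` would hold for a few rows of category A. A path is kept only when the next path in sort order does not extend it, which leaves the leaf paths.

```sql
-- Hypothetical stand-in for the `iterations` result (illustration only).
with paths as (
  select 'A' as category, labels
  from unnest(['a', 'a||b', 'a||b||e', 'a||d']) as labels
)
select category, labels
from paths
-- Drop a path when the following path (sorted within the category)
-- starts with it plus the '||' delimiter, i.e. when it has a descendant.
-- The last path has no follower: lead() is null, ifnull turns that into
-- false, and the path is kept.
qualify not ifnull(
  starts_with(
    lead(labels) over (partition by category order by labels),
    labels || '||'
  ),
  false
)
order by category, labels;
-- A, a||b||e
-- A, a||d
```

The filter relies on every descendant path sorting directly after its ancestor. That holds for the single-letter labels used here; with labels containing characters that sort before `|`, a sibling path could land between a path and its descendants, so this is an assumption worth checking on real data.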

If the `iterations` query above is applied to the sample data in your question, the output is