Multiple Unioned Self-joins in BigQuery

I have a table with id, name, and parent_id, where parent_id points to the id of the row's parent in the hierarchy, as shown below.

id  name  parent_id
0   A     null
1   B     0
2   C     1
3   D     1
4   E     2

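I am trying to create a nicer-looking table that pairs each id with itself and with its ancestors across multiple levels of the hierarchy. I do this with UNION ALL and self-joins, but I feel there should be a better way to query it using BigQuery's standard SQL.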

In the query below I go up two levels, but you can imagine that I would like to go 5-6 levels.

WITH T1 as (
   select 0 as id, 'A' as name, null as parent_id union all
   select 1 as id, 'B' as name, 0 as parent_id union all
   select 2 as id, 'C' as name, 1 as parent_id union all
   select 3 as id, 'D' as name, 1 as parent_id union all
   select 4 as id, 'E' as name, 2 as parent_id
)

SELECT 
    a.id as id, 
    a.name as req_name,
FROM T1 as a
UNION ALL
SELECT  
    a.id as id,
    b.name as req_name,
FROM T1 as a
JOIN T1 as b ON a.parent_id = b.id
UNION ALL
SELECT 
    a.id as id,
    c.name as req_name,
FROM T1 as a
JOIN T1 as b on a.parent_id = b.id
JOIN T1 as c on b.parent_id = c.id

This results in the table:

id  req_name
0   A
1   B
2   C
3   D
4   E
2   A
3   A
4   B
1   A
2   B
3   B
4   C

Any insights would be greatly appreciated!

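As of this writing, BigQuery does not (yet) support recursive or hierarchical queries, so your approach is actually fine. If you like, you can condense it using left joins: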
with t as (
   select 0 as id, 'A' as name, null as parent_id union all
   select 1 as id, 'B' as name, 0 as parent_id union all
   select 2 as id, 'C' as name, 1 as parent_id union all
   select 3 as id, 'D' as name, 1 as parent_id union all
   select 4 as id, 'E' as name, 2 as parent_id
)
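-- t1 is the ancestor; t2 and t3 are its children and grandchildren
select distinct id, t1.name
from t t1
left join t t2 on t2.parent_id = t1.id
left join t t3 on t3.parent_id = t2.id
-- emit one row per non-null id among the row itself and its descendants
cross join unnest(array[t1.id, t2.id, t3.id]) id
where id is not null;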

You still need to write the joins explicitly, out to the maximum depth of the data.
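
To illustrate what "out to the maximum depth" means in practice, here is a minimal sketch of the same pattern extended by one more level. The alias t4 and the column aliases descendant_id and req_name are names introduced here for illustration only; each further level would need one more left join and one more array element.

```sql
with t as (
   select 0 as id, 'A' as name, null as parent_id union all
   select 1 as id, 'B' as name, 0 as parent_id union all
   select 2 as id, 'C' as name, 1 as parent_id union all
   select 3 as id, 'D' as name, 1 as parent_id union all
   select 4 as id, 'E' as name, 2 as parent_id
)
-- t1 is the ancestor; t2, t3, t4 are descendants one, two and three levels down
select distinct descendant_id as id, t1.name as req_name
from t t1
left join t t2 on t2.parent_id = t1.id
left join t t3 on t3.parent_id = t2.id
left join t t4 on t4.parent_id = t3.id  -- extra level (illustrative)
cross join unnest(array[t1.id, t2.id, t3.id, t4.id]) as descendant_id
where descendant_id is not null;
```

With the sample data, the extra level should additionally pair id 4 (E) with its great-grandparent A, a row the two-level version does not reach.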

Another approach is to use the looping constructs available in BigQuery's scripting language.
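
As a rough sketch of the looping idea, assuming BigQuery scripting with temporary tables is available in your project, the script below climbs one level per iteration until no new rows are produced. The names pairs, next_parent_id, depth, cur_level and rows_added, as well as the cap of 20 iterations, are assumptions made for this example and are not part of the original question or answer.

```sql
-- Variables must be declared at the top of the script.
DECLARE cur_level INT64 DEFAULT 0;   -- depth of the rows processed in this pass
DECLARE rows_added INT64 DEFAULT 1;  -- rows inserted by the previous pass

CREATE TEMP TABLE t AS
select 0 as id, 'A' as name, null as parent_id union all
select 1 as id, 'B' as name, 0 as parent_id union all
select 2 as id, 'C' as name, 1 as parent_id union all
select 3 as id, 'D' as name, 1 as parent_id union all
select 4 as id, 'E' as name, 2 as parent_id;

-- Depth 0: each row paired with itself; next_parent_id is the next ancestor to visit.
CREATE TEMP TABLE pairs AS
SELECT id, name AS req_name, parent_id AS next_parent_id, 0 AS depth
FROM t;

-- Climb one level per pass; the cap (an arbitrary choice) guards against cycles.
WHILE rows_added > 0 AND cur_level < 20 DO
  INSERT INTO pairs (id, req_name, next_parent_id, depth)
  SELECT p.id, a.name, a.parent_id, cur_level + 1
  FROM pairs AS p
  JOIN t AS a ON a.id = p.next_parent_id
  WHERE p.depth = cur_level;

  SET rows_added = @@row_count;  -- rows inserted by the INSERT above
  SET cur_level = cur_level + 1;
END WHILE;

SELECT id, req_name FROM pairs ORDER BY depth, id;
```

Unlike the fixed two-level queries above, this follows the hierarchy to whatever depth the data has, so on the sample data it should also pair id 4 with A. The trade-off is that it runs as a multi-statement script rather than a single query.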