Multiple Unioned Self-joins in BigQuery
I have a table with id, name, and parent_id, where parent_id points to the id of the parent row in the hierarchy, as shown below.
id | name | parent_id |
---|---|---|
0 | A | null |
1 | B | 0 |
2 | C | 1 |
3 | D | 1 |
4 | E | 2 |
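I'm trying to create a nicer-looking table that pairs each id with its parents across multiple levels of the hierarchy. I do this with UNION and self-joins, but I feel there should be a better way to query it using BigQuery Standard SQL.
In the query below I go two levels up, but you can imagine I'd like to go 5-6 levels.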
WITH T1 as (
  select 0 as id, 'A' as name, null as parent_id union all
  select 1 as id, 'B' as name, 0 as parent_id union all
  select 2 as id, 'C' as name, 1 as parent_id union all
  select 3 as id, 'D' as name, 1 as parent_id union all
  select 4 as id, 'E' as name, 2 as parent_id
)
-- Level 0: each row paired with its own name
SELECT
  a.id as id,
  a.name as req_name
FROM T1 as a
UNION ALL
-- Level 1: each row paired with its parent's name
SELECT
  a.id as id,
  b.name as req_name
FROM T1 as a
JOIN T1 as b ON a.parent_id = b.id
UNION ALL
-- Level 2: each row paired with its grandparent's name
SELECT
  a.id as id,
  c.name as req_name
FROM T1 as a
JOIN T1 as b on a.parent_id = b.id
JOIN T1 as c on b.parent_id = c.id
This results in the following table:
id | req_name |
---|---|
0 | A |
1 | B |
2 | C |
3 | D |
4 | E |
2 | A |
3 | A |
4 | B |
1 | A |
2 | B |
3 | B |
4 | C |
Any insights would be greatly appreciated!
BigQuery does not (yet) support recursive or hierarchical queries, so your approach is actually fine. If you like, you can condense it using left joins:
with t as (
  select 0 as id, 'A' as name, null as parent_id union all
  select 1 as id, 'B' as name, 0 as parent_id union all
  select 2 as id, 'C' as name, 1 as parent_id union all
  select 3 as id, 'D' as name, 1 as parent_id union all
  select 4 as id, 'E' as name, 2 as parent_id
)
-- t1 is the ancestor row; t2 holds its children and t3 its grandchildren
select distinct id, t1.name
from t t1
left join t t2
  on t2.parent_id = t1.id
left join t t3
  on t3.parent_id = t2.id
-- one output row per non-null id in the chain, each labeled with t1's name
cross join unnest(array[t1.id, t2.id, t3.id]) id
where id is not null;
You still need explicit joins up to the maximum depth of the data.
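Since the joins have to be written out for every level, one way to avoid typing them by hand for 5-6 levels is to generate the query text. The following is only a sketch in Python, not part of the original answer: the helper name `build_ancestor_query` is hypothetical, and it assumes a table (or CTE) named `t` with columns `id`, `name`, and `parent_id`, as in the example above.

```python
def build_ancestor_query(depth: int, table: str = "t") -> str:
    """Build the left-join query text for a hierarchy of at most `depth` levels.

    `depth` is the number of aliases t1..t{depth}; depth=3 mirrors the
    hand-written query above. `table` is assumed to expose the columns
    id, name and parent_id.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")

    aliases = [f"t{i}" for i in range(1, depth + 1)]
    lines = ["select distinct id, t1.name", f"from {table} t1"]

    # Each further alias joins its parent_id to the id of the alias before it.
    for parent, child in zip(aliases, aliases[1:]):
        lines.append(
            f"left join {table} {child} on {child}.parent_id = {parent}.id"
        )

    # Collect every id in the chain so a single unnest emits one row per level.
    ids = ", ".join(f"{alias}.id" for alias in aliases)
    lines.append(f"cross join unnest(array[{ids}]) id")
    lines.append("where id is not null")
    return "\n".join(lines)


# Query text for a hierarchy that is up to six levels deep.
print(build_ancestor_query(6))
```

The generated text still has to be prefixed with the `with t as (...)` definition, or pointed at a real table through the `table` argument, before it is run.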
Another option is to use the looping constructs available in the scripting language.
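To illustrate the idea behind a loop-based approach, here is a minimal sketch that walks up the hierarchy one parent at a time until it reaches the root. It is written in Python purely as a stand-in, so it is not BigQuery scripting code; the function name `ancestor_names` is hypothetical and the rows are the sample data from the question.

```python
# Sample rows from the question: (id, name, parent_id)
rows = [
    (0, "A", None),
    (1, "B", 0),
    (2, "C", 1),
    (3, "D", 1),
    (4, "E", 2),
]


def ancestor_names(rows):
    """Return sorted (id, req_name) pairs for each row and all of its ancestors."""
    name_of = {rid: name for rid, name, _ in rows}
    parent_of = {rid: parent for rid, _, parent in rows}

    pairs = []
    for rid in name_of:
        current = rid
        seen = set()  # guards against cycles in malformed data
        # Climb until the root (parent is None), a dangling parent, or a cycle.
        while current is not None and current in name_of and current not in seen:
            seen.add(current)
            pairs.append((rid, name_of[current]))
            current = parent_of[current]
    return sorted(pairs)


for rid, req_name in ancestor_names(rows):
    print(rid, req_name)
```

Because the loop keeps going until it runs out of parents, no maximum depth has to be fixed up front. On the sample data it therefore also yields the pair (4, A), which the two-level query in the question does not reach.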