How to display a multi-level hierarchical tree structure in BigQuery
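I am working with a tree-like hierarchy of supervisors and the employees who report to them. The hard part is that some supervisors are themselves employees under other supervisors, and there are many such cases.
The SQL queries I learned in class only used a simple self join, which covers just two levels: A is supervised by B, and that's it.
Real-world problems are much more complicated. There are multiple levels, and I am not sure exactly how many. For example, A is supervised by B, B is supervised by C, C is supervised by D, and so on.
I assume there are 5 or more levels of supervision. The raw data might look like this: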
Employee Supervisor
A B
C B
D B
B V
E V
F E
G V
V (blank, which indicates no boss)
H A
The code provided by a BigQuery expert is as follows:
#standardSQL
SELECT t.Supervisor,
IF(t.Supervisor = t5.Supervisor,
STRUCT(Employee2 AS Employee1, NULL AS Employee2),
STRUCT(t5.Supervisor AS Employee1, Employee2 AS Employee2)
).*
FROM (
SELECT t1.Employee Supervisor,
COALESCE(t4.Employee, t3.Employee, t2.Employee) Employee2
FROM `project.dataset.table` t1
LEFT JOIN `project.dataset.table` t2 ON t2.Supervisor = t1.Employee
LEFT JOIN `project.dataset.table` t3 ON t3.Supervisor = t2.Employee
LEFT JOIN `project.dataset.table` t4 ON t4.Supervisor = t3.Employee
WHERE t1.Supervisor IS NULL
) t
LEFT JOIN `project.dataset.table` t5 ON t5.Employee = t.Employee2
The result came out like this:
Row Supervisor Employee1 Employee2
1 V B A
2 V B C
3 V B D
4 V E F
5 V G null
But what we want is:
Row Supervisor Employee1 Employee2 Employee3
1 V B A H
2 V B C Null
3 V B D Null
4 V E F Null
5 V G null Null
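So if I want more levels in the hierarchy, how should I change the code? That is, if I want to add Employee3 or Employee4, how do I edit it? Thanks!
The example below is for BigQuery Standard SQL: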
#standardSQL
WITH e0 AS (
SELECT Employee AS Supervisor FROM `project.dataset.table` WHERE Supervisor IS NULL
), e1 AS (
SELECT e.Supervisor, Employee AS Employee1
FROM e0 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Supervisor
), e2 AS (
SELECT e.Supervisor, Employee1, Employee AS Employee2
FROM e1 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee1
), e3 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee AS Employee3
FROM e2 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee2
)
SELECT * FROM e3
If applied to the sample data from your question, the result is:
Row Supervisor Employee1 Employee2 Employee3
1 V B A H
2 V B C null
3 V B D null
4 V E F null
5 V G null null
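To see why each extra CTE adds exactly one column, the chain of LEFT JOINs can be emulated outside BigQuery. The following is only a minimal Python sketch over the sample rows from the question; `ROWS` and `expand_levels` are hypothetical names introduced for illustration and are not part of the query above.

```python
# Sample rows from the question: (Employee, Supervisor); None means no boss.
ROWS = [
    ("A", "B"), ("C", "B"), ("D", "B"), ("B", "V"), ("E", "V"),
    ("F", "E"), ("G", "V"), ("V", None), ("H", "A"),
]


def expand_levels(rows, levels):
    """Emulate the e0..eN chain: one LEFT JOIN per level (hypothetical helper)."""
    # Map each supervisor to the employees who report directly to them.
    reports = {}
    for employee, supervisor in rows:
        if supervisor is not None:
            reports.setdefault(supervisor, []).append(employee)

    # e0: the roots, i.e. employees without a supervisor.
    paths = [(employee,) for employee, supervisor in rows if supervisor is None]

    for _ in range(levels):
        next_paths = []
        for path in paths:
            last = path[-1]
            # A NULL never matches in the join condition, so it has no children.
            children = reports.get(last, []) if last is not None else []
            # LEFT JOIN semantics: keep the row, padded with None, when nothing matches.
            for child in children or [None]:
                next_paths.append(path + (child,))
        paths = next_paths
    return paths


for row in expand_levels(ROWS, 3):
    print(row)
```

With three levels this yields the same five rows as the table above, with `None` in place of `null`. Each pass of the loop corresponds to one `e<N>` step: it widens every row by one column and never drops a row.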
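You can easily add more levels as shown below, replacing <N> and <N-1> with the respective numbers (4, 5, 6, 7, and so on). Obviously, this can be extended to any reasonable depth.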
e<N> AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, ... , Employee AS Employee<N>
FROM e<N-1> e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee<N-1>
)
SELECT * FROM e<N>
For example:
#standardSQL
WITH e0 AS (
SELECT Employee AS Supervisor FROM `project.dataset.table` WHERE Supervisor IS NULL
), e1 AS (
SELECT e.Supervisor, Employee AS Employee1
FROM e0 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Supervisor
), e2 AS (
SELECT e.Supervisor, Employee1, Employee AS Employee2
FROM e1 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee1
), e3 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee AS Employee3
FROM e2 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee2
), e4 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, Employee AS Employee4
FROM e3 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee3
), e5 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, Employee4, Employee AS Employee5
FROM e4 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee4
)
SELECT * FROM e5
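Because every level follows the same pattern, the query text can also be generated instead of typed by hand. The following is a rough Python sketch that builds the same CTE chain for a chosen depth; the function name `build_hierarchy_query` and its parameters are assumptions made for illustration, and the resulting string still has to be run in BigQuery by whatever means you normally use.

```python
def build_hierarchy_query(levels, table="project.dataset.table"):
    """Return the e0..eN CTE chain as a SQL string (hypothetical helper)."""
    if levels < 1:
        raise ValueError("levels must be at least 1")

    # e0 selects the root rows, i.e. employees without a supervisor.
    ctes = [
        "e0 AS (\n"
        f"  SELECT Employee AS Supervisor FROM `{table}` WHERE Supervisor IS NULL\n"
        ")"
    ]
    for n in range(1, levels + 1):
        # Columns carried over unchanged from the previous level.
        carried = ["e.Supervisor"] + [f"Employee{i}" for i in range(1, n)]
        # Level 1 joins on the root column; deeper levels join on the
        # employee column added by the previous level.
        parent = "e.Supervisor" if n == 1 else f"e.Employee{n - 1}"
        ctes.append(
            f"e{n} AS (\n"
            f"  SELECT {', '.join(carried)}, Employee AS Employee{n}\n"
            f"  FROM e{n - 1} e LEFT JOIN `{table}` t ON t.Supervisor = {parent}\n"
            ")"
        )
    return "#standardSQL\nWITH " + ", ".join(ctes) + f"\nSELECT * FROM e{levels}"


print(build_hierarchy_query(5))
```

Calling `build_hierarchy_query(5)` produces the same five-level chain as the query above, differing only in whitespace. The depth is still fixed when the query is generated, so choose a value at least as large as the deepest chain you expect in the data.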