How to group by all fields of nested tables in a LEFT JOIN query in BigQuery?
I have about 10 tables, and I nest them round by round using the following query:
WITH R1 AS(
SELECT ANY_VALUE(Table1).*, ARRAY_AGG(( SELECT AS STRUCT Table2.* EXCEPT(ID))) AS Table2
FROM Table1 LEFT JOIN Table2 USING(ID)
GROUP BY Table1.ID),
R2 AS(
SELECT ANY_VALUE(R1).*, ARRAY_AGG(( SELECT AS STRUCT Table3.* EXCEPT(ID))) AS Table3
FROM R1 LEFT JOIN Table3 USING(ID)
GROUP BY R1.ID),
...
SELECT ANY_VALUE(R9).*, ARRAY_AGG(( SELECT AS STRUCT Table10.* EXCEPT(ID))) AS Table10
FROM R9 LEFT JOIN Table10 USING(ID)
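For example, in my first table I can have two records with the same ID but different values in some other fields. I want to treat them as two distinct records, so I need to group by all fields of the table when I join.
I then want to do the same for every "sub-table" (the R tables in the query), so that I also group by all fields of the nested tables.
How can I do this easily?
I tried GROUP BY Table1.*
but it doesn't work...
Thanks in advance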
You seem to want something like this:
select *
from table1 t1 left join
(select t2.*
from table2 t2
where true
qualify row_number() over (partition by t2.id order by t2.id) = 1
) t2
using (id)
This uses qualify instead of group by to keep a single row per id from table2.
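To make the qualify filter concrete, here is a minimal self-contained sketch. The inline sample data and the val column are made up for illustration; they are not from the question. row_number() starts counting at 1, so the filter keeps the row numbered 1 in each partition, and ordering by val (an assumption on my part) makes the choice of surviving row deterministic.

```sql
-- Hypothetical inline data standing in for the real tables.
WITH table1 AS (
  SELECT 1 AS id, 'a' AS col1 UNION ALL
  SELECT 2, 'b'
),
table2 AS (
  SELECT 1 AS id, 'x' AS val UNION ALL
  SELECT 1, 'y'
)
SELECT *
FROM table1 t1
LEFT JOIN (
  SELECT t2.*
  FROM table2 t2
  WHERE TRUE
  -- row_number() is 1-based: "= 1" keeps the first row of each id
  QUALIFY ROW_NUMBER() OVER (PARTITION BY t2.id ORDER BY t2.val) = 1
) t2
USING (id)
```

With this data, id 1 should come back paired with val 'x' only, and id 2 with a NULL val because it has no match in table2.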
If you don't want all the rows from table1, you can pare those down as well:
select *
from (select t1.*
from table1 t1
where true
qualify row_number() over (partition by id, col1, col2 order by id) = 1
) t1 left join
(select t2.*
from table2 t2
where true
qualify row_number() over (partition by t2.id order by t2.id) = 1
) t2
using (id)
How to Group By all fields ...?
I tried GROUP BY Table1.* but it doesn't work...
Consider the example below:
SELECT ANY_VALUE(t1).*,
ARRAY_AGG(( SELECT AS STRUCT t2.* EXCEPT(ID))) AS Table2
FROM Table1 t1 LEFT JOIN Table2 t2 USING(ID)
GROUP BY FORMAT('%t', t1)
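Here is the same query as a runnable sketch with inline sample data, so the effect of grouping on the formatted row can be checked directly. The data and the col1 and val columns are hypothetical. The idea is that FORMAT('%t', t1) renders the whole t1 row as one string, which gives GROUP BY a single key that covers every field.

```sql
-- Hypothetical inline data; two Table1 rows share ID 1 but differ in col1.
WITH Table1 AS (
  SELECT 1 AS ID, 'a' AS col1 UNION ALL
  SELECT 1, 'b' UNION ALL
  SELECT 2, 'c'
),
Table2 AS (
  SELECT 1 AS ID, 'x' AS val UNION ALL
  SELECT 1, 'y'
)
SELECT
  ANY_VALUE(t1).*,
  ARRAY_AGG((SELECT AS STRUCT t2.* EXCEPT (ID))) AS Table2
FROM Table1 t1
LEFT JOIN Table2 t2 USING (ID)
-- One group per distinct Table1 row, not per ID
GROUP BY FORMAT('%t', t1)
```

With this data the two ID = 1 rows should stay separate, each carrying both matching Table2 rows in its array. Note that Table1 rows identical in every field still collapse into a single group.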
Try to_json_string:
...
FROM Table1 t1
...
GROUP BY to_json_string(t1)
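The question also asks about the later rounds, where the left side already contains nested arrays. A sketch of how the same idea might extend, assuming hypothetical sample data and a Table3 with a single extra column: arrays cannot be used as GROUP BY keys directly, but the JSON string of the whole row (arrays included) can.

```sql
-- Hypothetical inline data for three of the tables.
WITH Table1 AS (
  SELECT 1 AS ID, 'a' AS col1 UNION ALL
  SELECT 1, 'b'
),
Table2 AS (
  SELECT 1 AS ID, 'x' AS val
),
Table3 AS (
  SELECT 1 AS ID, 10 AS amount UNION ALL
  SELECT 1, 20
),
R1 AS (
  SELECT
    ANY_VALUE(t1).*,
    ARRAY_AGG((SELECT AS STRUCT t2.* EXCEPT (ID))) AS Table2
  FROM Table1 t1
  LEFT JOIN Table2 t2 USING (ID)
  GROUP BY TO_JSON_STRING(t1)
)
SELECT
  ANY_VALUE(r).*,
  ARRAY_AGG((SELECT AS STRUCT t3.* EXCEPT (ID))) AS Table3
FROM R1 r
LEFT JOIN Table3 t3 USING (ID)
-- r already holds the nested Table2 array; serializing the row
-- lets the nested fields take part in the grouping key
GROUP BY TO_JSON_STRING(r)
```

Each further round would repeat the last SELECT with the next table. Serializing wide rows on every round may get expensive on large tables, so it is worth checking the cost on real data.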