Subquery in BigQuery (JOIN on same Table)
I have a BigQuery table containing this data:
client spent balance date
A 20 500 2022-01-01
A 10 490 2022-01-02
A 50 440 2022-01-03
B 200 1000 1995-07-09
B 300 700 1998-08-11
B 100 600 2002-04-17
C 2 100 2021-01-04
C 10 90 2021-06-06
C 70 20 2021-10-07
I need the latest balance for each client, based on date:
client spent balance date
A 50 440 2022-01-03
B 100 600 2002-04-17
C 70 20 2021-10-07
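DISTINCT doesn't work the way it does in SQL, and GROUP BY client doesn't work either, because with GROUP BY I have to aggregate the other columns (COUNT, SUM, etc.).
I got it working for a single client:
SELECT balance FROM `table` WHERE client = "A" ORDER BY date DESC LIMIT 1;
But how can I get this data for every client in a single statement?
I tried a subselect: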
SELECT client,
  (SELECT balance FROM `table` WHERE client = tb.client ORDER BY date DESC LIMIT 1) AS bal
FROM `table` AS tb;
and got this error:
Correlated subqueries that reference other tables are not supported
unless they can be de-correlated, such as by transforming them into an
efficient JOIN.
I don't know how to turn this subquery into a JOIN so that it works.
I hope you have an idea.
Have you tried using the ROW_NUMBER window function?
select client, spent, balance, date
from (
  select client, spent, balance, date
    -- number the rows within each client, starting from the latest date
    , ROW_NUMBER() OVER (PARTITION BY client ORDER BY date DESC) AS row_num
  from `table`  -- backticks are needed because TABLE is a reserved keyword
)
where row_num = 1 -- keep only the row with the latest date per client
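Since the question asks specifically how to express this as a JOIN, here is a minimal sketch of that variant. It assumes the sample data from the question, inlined as a CTE named sample (a made-up name for illustration), and joins each row to the per-client maximum date. This is one possible rewrite, not necessarily what BigQuery's de-correlation would produce.

```sql
-- Sample data from the question, inlined so the query runs on its own.
-- "sample" and "latest" are illustrative names, not part of the original table.
WITH sample AS (
  SELECT 'A' AS client, 20 AS spent, 500 AS balance, DATE '2022-01-01' AS date UNION ALL
  SELECT 'A', 10, 490, DATE '2022-01-02' UNION ALL
  SELECT 'A', 50, 440, DATE '2022-01-03' UNION ALL
  SELECT 'B', 200, 1000, DATE '1995-07-09' UNION ALL
  SELECT 'B', 300, 700, DATE '1998-08-11' UNION ALL
  SELECT 'B', 100, 600, DATE '2002-04-17' UNION ALL
  SELECT 'C', 2, 100, DATE '2021-01-04' UNION ALL
  SELECT 'C', 10, 90, DATE '2021-06-06' UNION ALL
  SELECT 'C', 70, 20, DATE '2021-10-07'
),
-- One row per client holding that client's most recent date.
latest AS (
  SELECT client, MAX(date) AS max_date
  FROM sample
  GROUP BY client
)
-- Join back to the full rows to pick up spent and balance for that date.
SELECT s.client, s.spent, s.balance, s.date
FROM sample AS s
JOIN latest AS l
  ON s.client = l.client
 AND s.date = l.max_date
ORDER BY s.client;
```

One difference from the ROW_NUMBER version: if a client has two rows with the same latest date, the JOIN returns both, while row_num = 1 keeps only one of them.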
Consider the approach below:
select * from your_table
qualify 1 = row_number() over(partition by client order by date desc)
If applied to the sample data in your question, the output matches the desired result shown there.
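Another pattern often used in BigQuery for "latest row per group" is ARRAY_AGG with ORDER BY and LIMIT 1. The sketch below is not taken from either answer above; it assumes the same sample data, inlined as a CTE with the illustrative name sample, and relies on SELECT AS VALUE to turn the selected struct back into regular columns.

```sql
-- Sample data from the question, inlined so the query runs on its own.
-- "sample" is an illustrative name, not part of the original table.
WITH sample AS (
  SELECT 'A' AS client, 20 AS spent, 500 AS balance, DATE '2022-01-01' AS date UNION ALL
  SELECT 'A', 10, 490, DATE '2022-01-02' UNION ALL
  SELECT 'A', 50, 440, DATE '2022-01-03' UNION ALL
  SELECT 'B', 200, 1000, DATE '1995-07-09' UNION ALL
  SELECT 'B', 300, 700, DATE '1998-08-11' UNION ALL
  SELECT 'B', 100, 600, DATE '2002-04-17' UNION ALL
  SELECT 'C', 2, 100, DATE '2021-01-04' UNION ALL
  SELECT 'C', 10, 90, DATE '2021-06-06' UNION ALL
  SELECT 'C', 70, 20, DATE '2021-10-07'
)
-- For each client, collect whole rows ordered by date (newest first),
-- keep only the first one, and unpack it back into columns.
SELECT AS VALUE
  ARRAY_AGG(t ORDER BY t.date DESC LIMIT 1)[OFFSET(0)]
FROM sample AS t
GROUP BY t.client;
```

Like the QUALIFY version, this keeps exactly one row per client even when dates tie. Which of the variants performs best depends on the data, so it is worth comparing them on the real table.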