BigQuery SQL: Sum of first N related items

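I want to know the sum of a value over the first n items in a related table. For example, I want to get the sum of a company's first 6 invoices (the invoices can be sorted by ID ascending).

Current SQL: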

SELECT invoices.company_id, SUM(invoices.amount)
FROM invoices
JOIN companies on invoices.company_id = companies.id
GROUP BY invoices.company_id

This seems simple, but I can't wrap my head around it.

You can assign a row number to the rows within each partition, ordered by invoice ID, and then filter on that row number, as follows:

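with array_table as (
  -- toy data: two groups ('a' and 'b'), each with a handful of ids
  select 'a' field, * from unnest([3, 2, 1, 4, 6, 3]) id
  union all
  select 'b' field, * from unnest([1, 2, 1, 7]) id
)

select field, sum(id) from (
  -- number the rows within each field, largest id first
  select field, id, row_number() over (partition by a.field order by id desc) rownum
  from array_table a
)
-- keep the two highest-ranked rows per field: a -> 6 + 4 = 10, b -> 7 + 2 = 9
where rownum < 3
group by field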
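
Applied to the tables in the question, the same idea might look like the sketch below. It assumes the invoice ID column is invoices.id (the question does not name it) and takes the first six invoices per company in ascending ID order. The alias first_six_invoices_amount is only illustrative.

```sql
-- Sketch only: assumes invoices has the columns company_id, id and amount.
SELECT company_id, SUM(amount) AS first_six_invoices_amount
FROM (
  SELECT
    company_id,
    amount,
    -- 1 for the lowest invoice id of each company, 2 for the next, and so on
    ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY id) AS rownum
  FROM invoices
)
WHERE rownum <= 6  -- keep only the first six invoices per company
GROUP BY company_id
```

The join to companies from the original query is left out because no column from that table is selected. Dropping it does change the result if some invoices reference a company_id that has no row in companies.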

Here are more examples of analytic functions:

https://medium.com/@aliz_ai/analytic-functions-in-google-bigquery-part-1-basics-745d97958fe2

https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts

Also consider the approach below:

select company_id, (
    select sum(amount)
    from t.amounts amount
  ) as top_six_invoices_amount
from (
  select invoices.company_id, 
    array_agg(invoices.amount order by invoices.invoice_id limit 6) amounts
  from your_table invoices
  group by invoices.company_id
) t
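
To try this approach without a real table, a self-contained version with made-up data could look like the sketch below. The rows in your_table are invented for illustration: company 1 has seven invoices with amounts 10 through 70, and company 2 has two invoices with amounts 5 and 15. The array is unnested explicitly with UNNEST, which should be equivalent to the `from t.amounts amount` form above.

```sql
-- Hypothetical sample data; company_id, invoice_id and amount values are made up.
WITH your_table AS (
  SELECT 1 AS company_id, invoice_id, invoice_id * 10 AS amount
  FROM UNNEST(GENERATE_ARRAY(1, 7)) AS invoice_id
  UNION ALL
  SELECT 2, 8, 5
  UNION ALL
  SELECT 2, 9, 15
)
SELECT
  company_id,
  -- sum the (at most six) amounts collected for this company
  (SELECT SUM(amount) FROM UNNEST(amounts) AS amount) AS top_six_invoices_amount
FROM (
  SELECT
    company_id,
    -- amounts of the six lowest invoice ids per company
    ARRAY_AGG(amount ORDER BY invoice_id LIMIT 6) AS amounts
  FROM your_table
  GROUP BY company_id
)
ORDER BY company_id
-- company 1: 10 + 20 + 30 + 40 + 50 + 60 = 210 (the seventh invoice is excluded)
-- company 2: 5 + 15 = 20
```

Compared with the ROW_NUMBER approach, this one aggregates once per company and never materializes a row number for every invoice.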