Restrict LAG to specific row condition
I am using the following query to return the percentage difference between the current month and the previous month for a given site ID.
SELECT
reporting_month,
total_revenue,
invoice_count,
--total_revenue_prev,
--invoice_count_prev,
ROUND(SAFE_DIVIDE(total_revenue,total_revenue_prev)-1,4) AS actual_growth,
site_name
FROM (
SELECT DATE_TRUNC(table.date, MONTH) AS reporting_month,
ROUND(SUM(table.revenue),2) AS total_revenue,
COUNT(*) AS invoice_count,
ROUND(IFNULL(
LAG(SUM(table.revenue)) OVER (ORDER BY MIN(DATE_TRUNC(table.date, MONTH))) ,
0),2) AS total_revenue_prev,
IFNULL(
LAG(COUNT(*)) OVER (ORDER BY MIN(DATE_TRUNC(table.date, MONTH))) ,
0) AS invoice_count_prev,
tbl_sites.name AS site_name
FROM table
LEFT JOIN tbl_sites ON tbl_sites.id = table.site
WHERE table.site = '123'
GROUP BY site_name, reporting_month
ORDER BY reporting_month
)
This works fine and prints:
reporting_month | total_revenue | invoice_count | actual_growth | site_name |
---|---|---|---|---|
2020-11-01 00:00:00 UTC | 100.00 | 10 | 0.571 | SiteNameString |
2020-12-01 00:00:00 UTC | 125.00 | 7 | 0.2500 | SiteNameString |
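However, I would like to be able to run the same query for all sites. When I remove WHERE table.site = '123' from the subquery, the numbers are reported incorrectly, which I assume is caused by the use of LAG. Is there a way to restrict LAG to the 'current' row's site?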
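You just need to add a PARTITION BY clause to the window definition of the LAG call, so that each site's rows are lagged independently. Because the query aggregates, the partitioning column must also appear in the GROUP BY: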
LAG(SUM(table.revenue)) OVER (PARTITION BY table.site ORDER BY MIN(DATE_TRUNC(table.date, MONTH)))
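From the relevant BigQuery documentation page:
"PARTITION BY: Breaks up the input rows into separate partitions, over which the analytic function is independently evaluated."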
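Applied to the whole query, the idea might look like the sketch below. It is a restructured variant rather than the exact query from the question: the monthly aggregation is moved into a CTE (the name monthly and the aliases t, s and site_id are assumptions made for illustration), and LAG then runs over the already aggregated rows, partitioned by site. The table and column names are taken from the question.

```sql
-- Step 1: one row per site and month (no WHERE filter, so all sites).
WITH monthly AS (
  SELECT
    t.site AS site_id,
    s.name AS site_name,
    DATE_TRUNC(t.date, MONTH) AS reporting_month,
    SUM(t.revenue) AS total_revenue,
    COUNT(*) AS invoice_count
  FROM `table` AS t
  LEFT JOIN tbl_sites AS s ON s.id = t.site
  GROUP BY site_id, site_name, reporting_month
)
-- Step 2: compare each month with the previous row of the SAME site.
SELECT
  reporting_month,
  ROUND(total_revenue, 2) AS total_revenue,
  invoice_count,
  ROUND(
    SAFE_DIVIDE(
      total_revenue,
      -- PARTITION BY restarts the window for every site, so LAG never
      -- reads a row that belongs to a different site.
      LAG(total_revenue) OVER (PARTITION BY site_id ORDER BY reporting_month)
    ) - 1,
    4
  ) AS actual_growth,
  site_name
FROM monthly
ORDER BY site_name, reporting_month
```

The first month of each site has no previous row, so LAG returns NULL and SAFE_DIVIDE yields NULL growth for it. Note also that LAG looks at the previous row in the partition, not the previous calendar month: if a site has no invoices in some month, the comparison is made against the last month that does have data.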