Grab the latest date record that is not later than the date in another column (SQL, BigQuery)

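I have two date columns in a BQ table, pageview_date and edited_date, plus an id column. I need to output the data row by row, and for each record I want to take a value from the edited_date column that is the latest date in that column but not later than the pageview_date value itself. If the two dates are equal, the value should be kept as is. It also has to correspond to the id. The data looks like this: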

id         pageview_date         edited_date
A            03/01/22               02/28/22
A            03/01/22               02/02/22
A            03/01/22               02/02/22
B            03/01/22               01/01/22
B            03/01/22               01/01/22
B            03/01/22               01/31/22
C            03/01/22               04/01/22
C            03/01/22               03/25/22
C            03/01/22               03/01/22

The desired output is:

id         pageview_date         edited_date
A            03/01/22               02/28/22
A            03/01/22               02/28/22
A            03/01/22               02/28/22
B            03/01/22               01/31/22
B            03/01/22               01/31/22
B            03/01/22               01/31/22
C            03/01/22               03/01/22
C            03/01/22               03/01/22
C            03/01/22               03/01/22

One approach is to use the MAX window function over the edited_date column, partitioned by id:

with sample as (
  select 'a' as id, DATE('2022-03-01') as pageview_date, DATE('2022-02-28') as edited_date
  UNION ALL
  select 'a' as id, DATE('2022-03-01') as pageview_date, DATE('2022-03-28') as edited_date
  UNION ALL
  select 'a' as id, DATE('2022-03-01') as pageview_date, DATE('2022-01-28') as edited_date
)
SELECT
  id,
  pageview_date,
  -- Ignore edited_date values later than pageview_date, then take the latest remaining one per id
  MAX(IF(edited_date <= pageview_date, edited_date, null)) OVER (PARTITION BY id) as new_edited_date
FROM sample

Note that if an id has no edited_date on or before pageview_date, new_edited_date will be null.
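
To check the approach against the data from the question, here is a minimal sketch that runs the same window expression over an inline copy of the nine sample rows. The CTE name `pageviews` and the output column `latest_edited_date` are placeholders chosen for illustration, and the dates are assumed to be in MM/DD/YY format. It also assumes, as in the sample, that pageview_date is the same for every row of a given id; if it can vary within an id, partitioning by id alone would compare each edited_date against its own row's pageview_date rather than the current row's.

```sql
-- Hypothetical inline copy of the question's table; replace with your own table.
WITH pageviews AS (
  SELECT * FROM UNNEST([
    STRUCT('A' AS id, DATE '2022-03-01' AS pageview_date, DATE '2022-02-28' AS edited_date),
    ('A', DATE '2022-03-01', DATE '2022-02-02'),
    ('A', DATE '2022-03-01', DATE '2022-02-02'),
    ('B', DATE '2022-03-01', DATE '2022-01-01'),
    ('B', DATE '2022-03-01', DATE '2022-01-01'),
    ('B', DATE '2022-03-01', DATE '2022-01-31'),
    ('C', DATE '2022-03-01', DATE '2022-04-01'),
    ('C', DATE '2022-03-01', DATE '2022-03-25'),
    ('C', DATE '2022-03-01', DATE '2022-03-01')
  ])
)
SELECT
  id,
  pageview_date,
  -- Rows whose edited_date is after pageview_date contribute NULL, which MAX skips
  MAX(IF(edited_date <= pageview_date, edited_date, NULL))
    OVER (PARTITION BY id) AS latest_edited_date
FROM pageviews
ORDER BY id;
```

With this data, every A row should get 2022-02-28, every B row 2022-01-31, and every C row 2022-03-01, which matches the desired output in the question.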