DATE_DIFF() 在 BigQuery 中计算行之间的时间
DATE_DIFF() in BigQuery to calculate time between rows
我想计算几个客户购买之间的时间延迟。但是,每次购买都保存在单独的行中。数据集类似于以下内容:
customer | order_id | purchase_date | product | sequencen| ... |
customer1 | 1247857 | 2020-01-30 | ProdA, ProdB | 1 | ... |
customer2 | 4454874 | 2020-02-07 | ProdA | 1 | ... |
customer3 | 3424556 | 2020-02-28 | ProdA | 1 | ... |
customer4 | 5678889 | 2020-03-14 | ProdB | 1 | ... |
customer3 | 5853778 | 2020-03-22 | ProdA, ProdB | 2 | ... |
customer4 | 7578345 | 2020-03-30 | ProdA, ProdB | 2 | ... |
customer2 | 4892978 | 2020-05-10 | ProdA | 2 | ... |
customer5 | 4834789 | 2020-07-05 | ProdA, ProdB | 1 | ... |
customer5 | 9846726 | 2020-07-27 | ProdB | 2 | ... |
customer1 | 1774783 | 2020-12-12 | ProdB | 2 | ... |
对于每个客户,我希望得到一个 table 来计算某次购买与实际购买之间的 时间差(以天为单位)前。基本上,我想知道客户第一次和第二次购买、第二次和第三次购买等之间的时间延迟(延迟)。结果应如下所示:
customer | order_id | purchase_date | product | sequencen| ... | purchase_latency
customer1 | 1247857 | 2020-01-30 | ProdA, ProdB | 1 | ... |
customer1 | 1774783 | 2020-12-12 | ProdB | 2 | ... | 317
customer2 | 4454874 | 2020-02-07 | ProdA | 1 | ... |
customer2 | 4892978 | 2020-05-10 | ProdA | 2 | ... | 93
customer3 | 3424556 | 2020-02-28 | ProdA | 1 | ... |
customer3 | 5853778 | 2020-03-22 | ProdA, ProdB | 2 | ... | 23
customer4 | 5678889 | 2020-03-14 | ProdB | 1 | ... |
customer4 | 7578345 | 2020-03-30 | ProdA, ProdB | 2 | ... | 16
customer5 | 4834789 | 2020-07-05 | ProdA, ProdB | 1 | ... |
customer5 | 9846726 | 2020-07-27 | ProdB | 2 | ... | 22
我正在努力将 purchase_latency 计算添加到我当前的查询中,因为该计算需要我对行进行计算。有什么想法可以将其添加到我当前的查询中吗?:
SELECT
order_id
max(customer) as customer,
max(purchase_date) as purchase_date,
STRING_AGG(product, ",") as product,
...,
FROM SELECT(
od.order_number as order_id,
od.customer_email as customer,
od.order_date as purchase_date
dd.sku as product,
ROW_NUMBER() OVER (PARTITION BY od.customer_email ORDER BY od.order_date ) as sequencen
FROM orders_data od
JOIN detail_data dd
ON od.order_number = dd.order_number
where od.price> 0 AND
od.sku in ("ProdA","ProdB"))
GROUP BY order_id
您是否尝试过 LAG 等行导航功能?
WITH finishers AS
(SELECT 'Sophia Liu' as name,
TIMESTAMP '2016-10-18 2:51:45' as finish_time,
'F30-34' as division
UNION ALL SELECT 'Lisa Stelzner', TIMESTAMP '2016-10-18 2:54:11', 'F35-39'
UNION ALL SELECT 'Nikki Leith', TIMESTAMP '2016-10-18 2:59:01', 'F30-34'
UNION ALL SELECT 'Lauren Matthews', TIMESTAMP '2016-10-18 3:01:17', 'F35-39'
UNION ALL SELECT 'Desiree Berry', TIMESTAMP '2016-10-18 3:05:42', 'F35-39'
UNION ALL SELECT 'Suzy Slane', TIMESTAMP '2016-10-18 3:06:24', 'F35-39'
UNION ALL SELECT 'Jen Edwards', TIMESTAMP '2016-10-18 3:06:36', 'F30-34'
UNION ALL SELECT 'Meghan Lederer', TIMESTAMP '2016-10-18 3:07:41', 'F30-34'
UNION ALL SELECT 'Carly Forte', TIMESTAMP '2016-10-18 3:08:58', 'F25-29'
UNION ALL SELECT 'Lauren Reasoner', TIMESTAMP '2016-10-18 3:10:14', 'F30-34')
SELECT name,
finish_time,
division,
LAG(name)
OVER (PARTITION BY division ORDER BY finish_time ASC) AS preceding_runner
FROM finishers;
+-----------------+-------------+----------+------------------+
| name | finish_time | division | preceding_runner |
+-----------------+-------------+----------+------------------+
| Carly Forte | 03:08:58 | F25-29 | NULL |
| Sophia Liu | 02:51:45 | F30-34 | NULL |
| Nikki Leith | 02:59:01 | F30-34 | Sophia Liu |
| Jen Edwards | 03:06:36 | F30-34 | Nikki Leith |
| Meghan Lederer | 03:07:41 | F30-34 | Jen Edwards |
| Lauren Reasoner | 03:10:14 | F30-34 | Meghan Lederer |
| Lisa Stelzner | 02:54:11 | F35-39 | NULL |
| Lauren Matthews | 03:01:17 | F35-39 | Lisa Stelzner |
| Desiree Berry | 03:05:42 | F35-39 | Lauren Matthews |
| Suzy Slane | 03:06:24 | F35-39 | Desiree Berry |
+-----------------+-------------+----------+------------------+
我想计算几个客户购买之间的时间延迟。但是,每次购买都保存在单独的行中。数据集类似于以下内容:
customer | order_id | purchase_date | product | sequencen| ... |
customer1 | 1247857 | 2020-01-30 | ProdA, ProdB | 1 | ... |
customer2 | 4454874 | 2020-02-07 | ProdA | 1 | ... |
customer3 | 3424556 | 2020-02-28 | ProdA | 1 | ... |
customer4 | 5678889 | 2020-03-14 | ProdB | 1 | ... |
customer3 | 5853778 | 2020-03-22 | ProdA, ProdB | 2 | ... |
customer4 | 7578345 | 2020-03-30 | ProdA, ProdB | 2 | ... |
customer2 | 4892978 | 2020-05-10 | ProdA | 2 | ... |
customer5 | 4834789 | 2020-07-05 | ProdA, ProdB | 1 | ... |
customer5 | 9846726 | 2020-07-27 | ProdB | 2 | ... |
customer1 | 1774783 | 2020-12-12 | ProdB | 2 | ... |
对于每个客户,我希望得到一个 table 来计算某次购买与实际购买之间的 时间差(以天为单位)前。基本上,我想知道客户第一次和第二次购买、第二次和第三次购买等之间的时间延迟(延迟)。结果应如下所示:
customer | order_id | purchase_date | product | sequencen| ... | purchase_latency
customer1 | 1247857 | 2020-01-30 | ProdA, ProdB | 1 | ... |
customer1 | 1774783 | 2020-12-12 | ProdB | 2 | ... | 317
customer2 | 4454874 | 2020-02-07 | ProdA | 1 | ... |
customer2 | 4892978 | 2020-05-10 | ProdA | 2 | ... | 93
customer3 | 3424556 | 2020-02-28 | ProdA | 1 | ... |
customer3 | 5853778 | 2020-03-22 | ProdA, ProdB | 2 | ... | 23
customer4 | 5678889 | 2020-03-14 | ProdB | 1 | ... |
customer4 | 7578345 | 2020-03-30 | ProdA, ProdB | 2 | ... | 16
customer5 | 4834789 | 2020-07-05 | ProdA, ProdB | 1 | ... |
customer5 | 9846726 | 2020-07-27 | ProdB | 2 | ... | 22
我正在努力将 purchase_latency 计算添加到我当前的查询中,因为该计算需要我对行进行计算。有什么想法可以将其添加到我当前的查询中吗?:
SELECT
order_id
max(customer) as customer,
max(purchase_date) as purchase_date,
STRING_AGG(product, ",") as product,
...,
FROM SELECT(
od.order_number as order_id,
od.customer_email as customer,
od.order_date as purchase_date
dd.sku as product,
ROW_NUMBER() OVER (PARTITION BY od.customer_email ORDER BY od.order_date ) as sequencen
FROM orders_data od
JOIN detail_data dd
ON od.order_number = dd.order_number
where od.price> 0 AND
od.sku in ("ProdA","ProdB"))
GROUP BY order_id
您是否尝试过 LAG 等行导航功能?
WITH finishers AS
(SELECT 'Sophia Liu' as name,
TIMESTAMP '2016-10-18 2:51:45' as finish_time,
'F30-34' as division
UNION ALL SELECT 'Lisa Stelzner', TIMESTAMP '2016-10-18 2:54:11', 'F35-39'
UNION ALL SELECT 'Nikki Leith', TIMESTAMP '2016-10-18 2:59:01', 'F30-34'
UNION ALL SELECT 'Lauren Matthews', TIMESTAMP '2016-10-18 3:01:17', 'F35-39'
UNION ALL SELECT 'Desiree Berry', TIMESTAMP '2016-10-18 3:05:42', 'F35-39'
UNION ALL SELECT 'Suzy Slane', TIMESTAMP '2016-10-18 3:06:24', 'F35-39'
UNION ALL SELECT 'Jen Edwards', TIMESTAMP '2016-10-18 3:06:36', 'F30-34'
UNION ALL SELECT 'Meghan Lederer', TIMESTAMP '2016-10-18 3:07:41', 'F30-34'
UNION ALL SELECT 'Carly Forte', TIMESTAMP '2016-10-18 3:08:58', 'F25-29'
UNION ALL SELECT 'Lauren Reasoner', TIMESTAMP '2016-10-18 3:10:14', 'F30-34')
SELECT name,
finish_time,
division,
LAG(name)
OVER (PARTITION BY division ORDER BY finish_time ASC) AS preceding_runner
FROM finishers;
+-----------------+-------------+----------+------------------+
| name | finish_time | division | preceding_runner |
+-----------------+-------------+----------+------------------+
| Carly Forte | 03:08:58 | F25-29 | NULL |
| Sophia Liu | 02:51:45 | F30-34 | NULL |
| Nikki Leith | 02:59:01 | F30-34 | Sophia Liu |
| Jen Edwards | 03:06:36 | F30-34 | Nikki Leith |
| Meghan Lederer | 03:07:41 | F30-34 | Jen Edwards |
| Lauren Reasoner | 03:10:14 | F30-34 | Meghan Lederer |
| Lisa Stelzner | 02:54:11 | F35-39 | NULL |
| Lauren Matthews | 03:01:17 | F35-39 | Lisa Stelzner |
| Desiree Berry | 03:05:42 | F35-39 | Lauren Matthews |
| Suzy Slane | 03:06:24 | F35-39 | Desiree Berry |
+-----------------+-------------+----------+------------------+