Row number custom coded in SQL
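I am working with a table in BigQuery (#standardSQL). The table should record a conversion (1) for users who purchased an item in both month 9 and month 10. Users who did not purchase in month 10 should have only 0 in their rows.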
This is my query for custom_coded so far:
(case when row_number()
      over (partition by customer_id order by purchase_date asc) =
      count(*) over (partition by customer_id)
  then 1 else 0 END) AS custom_coded
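In the current result, this query flags the last row of every customer (ordered by purchase_date), which is not what I want. I expect customer_id = 288 to have only 0 in custom_coded, because he did not purchase in the following month (month 10). customer_id = 879 is expected to have 1 on his most recent purchase_date, because he has a purchase record in month 10. The expected result therefore has a single 1, on the latest row of customer 879.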
I asked about this in an earlier thread, but the dataset there did not fit the analysis I plan to run.
The following is for BigQuery Standard SQL:
#standardSQL
SELECT customer_id, item_purchased, purchase_date,
  (CASE WHEN
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date ASC) =
      COUNT(*) OVER (PARTITION BY customer_id)
    AND SUM(DISTINCT (CASE FORMAT_DATE('%Y%m', purchase_date)
          WHEN '201709' THEN 1 WHEN '201710' THEN 2 ELSE 0 END))
        OVER(PARTITION BY customer_id) = 3
    THEN 1 ELSE 0
  END) AS custom_coded
FROM `project.dataset.table`
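To see why the second condition compares against 3, it may help to look at the per-customer value on its own. The sketch below is not part of the original answer: it reuses the dummy data from the question, and month_mask is a hypothetical column name chosen for illustration. September contributes 1 and October contributes 2, and because DISTINCT collapses repeated values, the sum reaches 3 only when a customer has purchases in both months.

```sql
#standardSQL
-- Illustrative sketch only; month_mask is a hypothetical name, not from the original answer.
WITH `project.dataset.table` AS (
  SELECT 288 customer_id, 'Rice' item_purchased, DATE '2017-09-02' purchase_date UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-02' UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-06' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-01' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-27'
)
SELECT
  customer_id,
  -- 0 = neither month, 1 = 2017-09 only, 2 = 2017-10 only, 3 = both months
  SUM(DISTINCT CASE FORMAT_DATE('%Y%m', purchase_date)
        WHEN '201709' THEN 1
        WHEN '201710' THEN 2
        ELSE 0
      END) AS month_mask
FROM `project.dataset.table`
GROUP BY customer_id
ORDER BY customer_id
```

On the dummy data, customer 288 gets month_mask = 1 and customer 879 gets month_mask = 3, so only 879 can pass the = 3 check.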
You can test and play with the custom_coded query using the dummy data from the question:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 288 customer_id, 'Rice' item_purchased, DATE '2017-09-02' purchase_date UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-02' UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-06' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-01' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-27'
)
SELECT customer_id, item_purchased, purchase_date,
  (CASE WHEN
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date ASC) =
      COUNT(*) OVER (PARTITION BY customer_id)
    AND SUM(DISTINCT (CASE FORMAT_DATE('%Y%m', purchase_date)
          WHEN '201709' THEN 1 WHEN '201710' THEN 2 ELSE 0 END))
        OVER(PARTITION BY customer_id) = 3
    THEN 1 ELSE 0
  END) AS custom_coded
FROM `project.dataset.table`
ORDER BY customer_id, purchase_date
The result is:
customer_id  item_purchased  purchase_date  custom_coded
288          Rice            2017-09-02     0
288          Rice            2017-09-02     0
288          Rice            2017-09-06     0
879          Plate           2017-09-01     0
879          Plate           2017-09-25     0
879          Plate           2017-10-25     0
879          Plate           2017-10-27     1
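If the weighted sum feels opaque, the same check can probably be written with one explicit test per month. The variant below is a sketch, not the answer's method: it swaps SUM(DISTINCT ...) for two COUNTIF window aggregates, and it assumes the same dummy data and the same two hard-coded months.

```sql
#standardSQL
-- Alternative sketch: one explicit per-month check instead of the weighted SUM(DISTINCT ...).
WITH `project.dataset.table` AS (
  SELECT 288 customer_id, 'Rice' item_purchased, DATE '2017-09-02' purchase_date UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-02' UNION ALL
  SELECT 288, 'Rice', DATE '2017-09-06' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-01' UNION ALL
  SELECT 879, 'Plate', DATE '2017-09-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-25' UNION ALL
  SELECT 879, 'Plate', DATE '2017-10-27'
)
SELECT customer_id, item_purchased, purchase_date,
  IF(
    -- this row is the customer's last one when ordered by purchase_date
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date ASC) =
      COUNT(*) OVER (PARTITION BY customer_id)
    -- the customer has at least one purchase in September 2017
    AND COUNTIF(FORMAT_DATE('%Y%m', purchase_date) = '201709')
      OVER (PARTITION BY customer_id) > 0
    -- the customer has at least one purchase in October 2017
    AND COUNTIF(FORMAT_DATE('%Y%m', purchase_date) = '201710')
      OVER (PARTITION BY customer_id) > 0,
    1, 0) AS custom_coded
FROM `project.dataset.table`
ORDER BY customer_id, purchase_date
```

On the dummy data this should flag the same single row, customer 879 on 2017-10-27. The trade-off is a longer expression in exchange for conditions that read directly as "bought in September" and "bought in October".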