Converting rows into columns based on rank
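I am trying to convert rows into columns based on each customer's purchase date (rank). The goal is to find the first, second, third, fourth, and fifth product category a customer purchased. I also want to know which product the customer bought in each of those purchases.
Sales table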
purchaseDate | productCategory | product | customer_id | customer_phonenumber | customer_email
2020-05-05 | Electronics | iPhone | A001 | 1234567890 | aoo1@abc.com
2020-05-06 | Clothing | T-shirt | A001 | 1234567890 | aoo1@abc.com
2020-05-07 | Electronics | Keyboard | A001 | 1234567890 | aoo1@abc.com
2020-05-08 | Accessories | iPhone Case | A001 | 1234567890 | aoo1@abc.com
Result
customer_id | customer_phonenumber | customer_email | first_product_category | second_product_category | third_product_category | fourth_product_category | fifth_product_category | first_product | second_product | third_product | fourth_product | fifth_product
A001 | 1234567890 | a001@abc.com | Electronics | Clothing | Electronics | Accessories | NULL | iPhone | T-shirt | Keyboard | iPhone Case | NULL
I would like to know whether there is another way to do this, because my current query takes too long.
Here is my query:
with ranked_order as (
    select
        purchaseDate
        , productCategory
        , product
        , customer_id
        , customer_phonenumber
        , customer_email
        -- number each customer's purchases from earliest to latest
        , row_number() over(partition by customer_id order by purchaseDate) rank
    from sales_table
)
select
    customer_id
    , customer_phonenumber
    , customer_email
    -- each rank matches at most one row per customer, so max() just picks that value
    , max(case when rank = 1 then productCategory end) first_product_category
    , max(case when rank = 2 then productCategory end) second_product_category
    , max(case when rank = 3 then productCategory end) third_product_category
    , max(case when rank = 4 then productCategory end) fourth_product_category
    , max(case when rank = 5 then productCategory end) fifth_product_category
    , max(case when rank = 1 then product end) first_product
    , max(case when rank = 2 then product end) second_product
    , max(case when rank = 3 then product end) third_product
    , max(case when rank = 4 then product end) fourth_product
    , max(case when rank = 5 then product end) fifth_product
from
    ranked_order
group by 1,2,3
I would put the values into arrays:
select customer_id, customer_phonenumber, customer_email,
       array_agg(productCategory order by purchaseDate limit 5) as productCategorys_5,
       array_agg(product order by purchaseDate limit 5) as products_5
from sales_table
group by 1,2,3
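This puts the values in arrays rather than in columns. If the performance is good enough, you can then index into the arrays to get the individual elements (although I would probably find the arrays themselves more convenient).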
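To make the "aggregate into an array, then index" idea concrete outside the database, here is a minimal Python sketch of the same logic on the sample rows from the question. The helper names `top_n_per_customer` and `safe_offset` are hypothetical and only illustrate what `array_agg(... order by purchaseDate limit 5)` followed by array indexing computes; this is not how BigQuery implements it.

```python
from collections import defaultdict

# Sample rows from the question, deliberately shuffled:
# (purchaseDate, productCategory, product, customer_id)
rows = [
    ("2020-05-07", "Electronics", "Keyboard", "A001"),
    ("2020-05-05", "Electronics", "iPhone", "A001"),
    ("2020-05-08", "Accessories", "iPhone Case", "A001"),
    ("2020-05-06", "Clothing", "T-shirt", "A001"),
]


def top_n_per_customer(rows, n=5):
    """Group rows by customer and keep the n earliest purchases of each."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[3]].append(row)
    # ISO-8601 date strings sort chronologically when compared as text.
    return {
        customer: sorted(purchases, key=lambda r: r[0])[:n]
        for customer, purchases in grouped.items()
    }


def safe_offset(items, i):
    """Return items[i], or None when i is out of range."""
    return items[i] if 0 <= i < len(items) else None


top5 = top_n_per_customer(rows)
a001 = top5["A001"]

first = safe_offset(a001, 0)
fifth = safe_offset(a001, 4)
print(first[1], first[2])  # category and product of the earliest purchase
print(fifth)               # None: this customer has only four purchases
```

The out-of-range lookup returning `None` is what yields the `NULL` in the fifth-purchase columns of the expected result.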
The following is for BigQuery Standard SQL:
#standardSQL
SELECT
  customer_id,
  customer_phonenumber,
  customer_email,
  top5[SAFE_OFFSET(0)].productCategory AS first_product_category,
  top5[SAFE_OFFSET(1)].productCategory AS second_product_category,
  top5[SAFE_OFFSET(2)].productCategory AS third_product_category,
  top5[SAFE_OFFSET(3)].productCategory AS fourth_product_category,
  top5[SAFE_OFFSET(4)].productCategory AS fifth_product_category,
  top5[SAFE_OFFSET(0)].product AS first_product,
  top5[SAFE_OFFSET(1)].product AS second_product,
  top5[SAFE_OFFSET(2)].product AS third_product,
  top5[SAFE_OFFSET(3)].product AS fourth_product,
  top5[SAFE_OFFSET(4)].product AS fifth_product
FROM (
  SELECT customer_id, customer_phonenumber, customer_email,
    ARRAY_AGG(STRUCT(productCategory, product) ORDER BY purchaseDate LIMIT 5) AS top5
  FROM `project.dataset.sales_table`
  GROUP BY customer_id, customer_phonenumber, customer_email
)
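To sanity-check the expected result on the sample rows without a BigQuery project, the pivot can be reproduced locally. The sketch below is an assumption-laden stand-in, not the BigQuery query itself: it uses Python's built-in `sqlite3` module, and because SQLite has no `ARRAY_AGG` with `LIMIT` or `SAFE_OFFSET`, it swaps in `ROW_NUMBER()` partitioned by customer plus conditional aggregation. It assumes the bundled SQLite is version 3.25 or later (needed for window functions) and, for brevity, selects only a few of the output columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales_table (
        purchaseDate TEXT, productCategory TEXT, product TEXT, customer_id TEXT
    )
    """
)
# Sample rows from the question (contact columns omitted for brevity).
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?, ?, ?)",
    [
        ("2020-05-05", "Electronics", "iPhone", "A001"),
        ("2020-05-06", "Clothing", "T-shirt", "A001"),
        ("2020-05-07", "Electronics", "Keyboard", "A001"),
        ("2020-05-08", "Accessories", "iPhone Case", "A001"),
    ],
)

query = """
WITH ranked_order AS (
    SELECT productCategory, product, customer_id,
           -- number each customer's purchases from earliest to latest
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchaseDate) AS rn
    FROM sales_table
)
SELECT customer_id,
       MAX(CASE WHEN rn = 1 THEN productCategory END) AS first_product_category,
       MAX(CASE WHEN rn = 2 THEN productCategory END) AS second_product_category,
       MAX(CASE WHEN rn = 5 THEN productCategory END) AS fifth_product_category,
       MAX(CASE WHEN rn = 1 THEN product END) AS first_product
FROM ranked_order
GROUP BY customer_id
"""
result = conn.execute(query).fetchall()
print(result)
```

A customer with fewer than five purchases gets `None` (SQL `NULL`) in the missing positions, matching the `NULL` entries in the expected result.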