Constructing a new table by aggregating values in one table
I want to generate another table from the data in one table.
Customer product_1 product_2 product_3
cust_1 0 1 0
cust_2 1 1 0
cust_3 1 1 1
I am interested in computing the purchase rate for each product.
For example, the purchase rate for product_1 is
(count of (where product_1 = 1)/(count of (where product_1 = 0) + count of (where product_1 = 1))) * 100
assuming 1 = purchase, 0 = no purchase
i.e. 2/3 * 100 = 66.67
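As a quick check of the arithmetic above, here is a minimal Python sketch that applies the same formula to the three sample rows. The helper name `purchase_rate` and the in-memory `rows` list are illustrative assumptions, not part of the question.

```python
# Sample rows from the question: 1 = purchase, 0 = no purchase.
rows = [
    {"Customer": "cust_1", "product_1": 0, "product_2": 1, "product_3": 0},
    {"Customer": "cust_2", "product_1": 1, "product_2": 1, "product_3": 0},
    {"Customer": "cust_3", "product_1": 1, "product_2": 1, "product_3": 1},
]


def purchase_rate(rows, column):
    """Percentage of rows where `column` is 1, rounded to 2 decimals.

    Hypothetical helper: count(1) / (count(0) + count(1)) * 100.
    """
    purchased = sum(1 for r in rows if r[column] == 1)
    not_purchased = sum(1 for r in rows if r[column] == 0)
    return round(100 * purchased / (purchased + not_purchased), 2)


rates = {c: purchase_rate(rows, c) for c in ("product_1", "product_2", "product_3")}
print(rates)
```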
The output table should look like this:
Products Purchased_quantity Not_purchased_quantity Purchase_rate
product_1 2 1 66.67
product_2 3 0 100
product_3 1 2 33.33
You want to unpivot and aggregate. Here is one method:
select product,
       sum(quantity) as Purchased_quantity,
       sum(1 - quantity) as Not_purchased_quantity,
       round(100 * avg(quantity), 2) as Purchase_rate  -- rate as a percentage
from ((select 'product_1' as product, product_1 as quantity from t) union all
      (select 'product_2' as product, product_2 as quantity from t) union all
      (select 'product_3' as product, product_3 as quantity from t)
     ) p
group by product;
This assumes the columns contain only 0 and 1; "flag" might be a more fitting name than "quantity".
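To try the unpivot-and-aggregate approach without a warehouse, the sketch below runs an equivalent query against an in-memory SQLite database from Python. SQLite stands in for the real engine here, which is an assumption; the table name `t` follows the answer, the inner column is called `flag`, and the parentheses around each branch of the union are dropped because SQLite does not accept them.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (Customer TEXT, product_1 INTEGER, "
    "product_2 INTEGER, product_3 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [("cust_1", 0, 1, 0), ("cust_2", 1, 1, 0), ("cust_3", 1, 1, 1)],
)

# Unpivot each product column into (product, flag) rows, then aggregate.
query = """
SELECT product,
       SUM(flag)                   AS Purchased_quantity,
       SUM(1 - flag)               AS Not_purchased_quantity,
       ROUND(100.0 * AVG(flag), 2) AS Purchase_rate
FROM (
    SELECT 'product_1' AS product, product_1 AS flag FROM t
    UNION ALL
    SELECT 'product_2', product_2 FROM t
    UNION ALL
    SELECT 'product_3', product_3 FROM t
) p
GROUP BY product
ORDER BY product
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each product column still has to be listed by hand, so the query grows by one branch per product.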
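The following is for BigQuery Standard SQL. It does not require hard-coding the column names and instead works dynamically, for any [reasonable] number of product columns ...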
#standardSQL
SELECT product,
SUM(purchase) AS Purchased_quantity,
SUM(1 - purchase) AS Not_purchased_quantity,
ROUND(100 * AVG(purchase), 2) AS Purchase_rate
FROM (
SELECT
SPLIT(kv, ':')[OFFSET(0)] product,
CAST(SPLIT(kv, ':')[OFFSET(1)] AS INT64) purchase
FROM `project.dataset.table` t,
UNNEST(SPLIT(REPLACE(TRIM(TO_JSON_STRING(t), '{}'), '"', ''))) kv
WHERE SPLIT(kv, ':')[OFFSET(0)] != 'Customer'
)
GROUP BY product
Applied to the sample data from your question, the output is:
Row product Purchased_quantity Not_purchased_quantity Purchase_rate
1 product_1 2 1 66.67
2 product_2 3 0 100.0
3 product_3 1 2 33.33
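The string handling in that query can be hard to follow, so here is a rough Python sketch of the same idea: serialize each row to JSON, strip the braces and quotes, split into key:value pairs, skip the Customer key, and aggregate per product. It only imitates the logic with Python's `json` module, it is not BigQuery code, and the names `rows`, `totals`, and `summary` are illustrative assumptions.

```python
import json
from collections import defaultdict

rows = [
    {"Customer": "cust_1", "product_1": 0, "product_2": 1, "product_3": 0},
    {"Customer": "cust_2", "product_1": 1, "product_2": 1, "product_3": 0},
    {"Customer": "cust_3", "product_1": 1, "product_2": 1, "product_3": 1},
]

# product -> [purchased count, not purchased count]
totals = defaultdict(lambda: [0, 0])

for row in rows:
    # Compact JSON such as {"Customer":"cust_1","product_1":0,...}
    text = json.dumps(row, separators=(",", ":"))
    # Strip the braces and quotes, then split into key:value pairs.
    for kv in text.strip("{}").replace('"', "").split(","):
        key, value = kv.split(":")
        if key == "Customer":
            continue  # not a product column
        flag = int(value)
        totals[key][0] += flag
        totals[key][1] += 1 - flag

summary = {
    product: (bought, missed, round(100 * bought / (bought + missed), 2))
    for product, (bought, missed) in totals.items()
}
print(summary)
```

Because the pairs are recovered by splitting on commas and colons, this style of unpivot breaks if any column value itself contains one of those characters.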