Constructing a new table by aggregating values in one table

I want to generate a new table from the data in an existing table.

Customer  product_1      product_2     product_3
  cust_1    0               1            0
  cust_2    1               1            0
  cust_3    1               1            1

I am interested in calculating the purchase rate of each product.

For example, the purchase rate of product_1 is

(count of (where product_1 = 1)/(count of (where product_1 = 0) + count of (where product_1 = 1))) * 100 

assuming 1 = purchase, 0 = no purchase

i.e. 2/3 * 100 = 66.67

The output table should look like this:

 Products       Purchased_quantity      Not_purchased_quantity      Purchase_rate
product_1               2                        1                       66.67
product_2               3                        0                       100
product_3               1                        2                       33.33

You want to unpivot and aggregate. Here is one method:

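-- Unpivot each product column into (product, quantity) rows, then aggregate.
-- Column names follow the sample table (product_1, product_2, product_3).
select product,
       sum(quantity) as Purchased_quantity,
       sum(1 - quantity) as Not_purchased_quantity,
       -- 100.0 forces non-integer arithmetic and scales the rate to a percentage
       round(avg(100.0 * quantity), 2) as Purchase_rate
from ((select 'product_1' as product, product_1 as quantity from t) union all
      (select 'product_2' as product, product_2 as quantity from t) union all
      (select 'product_3' as product, product_3 as quantity from t)
     ) p
group by product;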

This assumes the columns contain only 0 and 1; "flag" would probably be a better name than "quantity".
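To check the unpivot-and-aggregate idea against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite and the alias `flag` are assumptions made for illustration, and the table name `t` is taken from the query above. The subqueries are written without surrounding parentheses because SQLite does not accept parenthesized selects in a compound query, and the rate is scaled to a percentage to match the requested output.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (Customer TEXT, product_1 INTEGER, "
    "product_2 INTEGER, product_3 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [("cust_1", 0, 1, 0), ("cust_2", 1, 1, 0), ("cust_3", 1, 1, 1)],
)

# Unpivot: one (product, flag) row per customer and product column,
# then aggregate per product.
UNPIVOT_QUERY = """
select product,
       sum(flag) as Purchased_quantity,
       sum(1 - flag) as Not_purchased_quantity,
       round(avg(100.0 * flag), 2) as Purchase_rate
from (select 'product_1' as product, product_1 as flag from t
      union all
      select 'product_2' as product, product_2 as flag from t
      union all
      select 'product_3' as product, product_3 as flag from t
     ) p
group by product
order by product
"""

rows = conn.execute(UNPIVOT_QUERY).fetchall()
for row in rows:
    print(row)
```

Each printed tuple is (product, purchased, not purchased, purchase rate in percent), which can be compared directly with the output table in the question.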

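The following is for BigQuery Standard SQL. It does not require hardcoding the column names and instead works dynamically, for any [reasonable] number of product columns ...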

#standardSQL
SELECT product,
  SUM(purchase) AS Purchased_quantity,
  SUM(1 - purchase) AS Not_purchased_quantity, 
  ROUND(100 * AVG(purchase), 2) AS Purchase_rate
FROM (
  SELECT 
    SPLIT(kv, ':')[OFFSET(0)] product,
    CAST(SPLIT(kv, ':')[OFFSET(1)] AS INT64) purchase
  FROM `project.dataset.table` t,
  UNNEST(SPLIT(REPLACE(TRIM(TO_JSON_STRING(t), '{}'), '"', ''))) kv
  WHERE SPLIT(kv, ':')[OFFSET(0)] != 'Customer'
)
GROUP BY product   

If applied to the sample data from your question, the output is

Row product     Purchased_quantity  Not_purchased_quantity  Purchase_rate    
1   product_1   2                   1                       66.67    
2   product_2   3                   0                       100.0    
3   product_3   1                   2                       33.33
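The string manipulation in the BigQuery query can be hard to follow, so here is a rough Python emulation of the same steps: serialize each row to JSON, trim the braces, drop the quotes, split on commas into key:value pairs, then split each pair on the colon. This is only an illustrative sketch and not BigQuery code; the helper names `unpivot_via_json` and `purchase_rates` are hypothetical. Like the query, it presumably breaks if a column name or value contains a comma or a colon.

```python
import json
from collections import defaultdict

# Sample data from the question, one dict per table row.
sample_rows = [
    {"Customer": "cust_1", "product_1": 0, "product_2": 1, "product_3": 0},
    {"Customer": "cust_2", "product_1": 1, "product_2": 1, "product_3": 0},
    {"Customer": "cust_3", "product_1": 1, "product_2": 1, "product_3": 1},
]


def unpivot_via_json(row, skip=("Customer",)):
    """Yield (column, value) pairs by splitting the row's JSON text.

    Mirrors TO_JSON_STRING -> TRIM '{}' -> REPLACE '"' -> SPLIT ',' -> SPLIT ':'.
    """
    text = json.dumps(row, separators=(",", ":"))  # compact JSON, no spaces
    text = text.strip("{}").replace('"', "")
    for kv in text.split(","):
        name, value = kv.split(":")
        if name not in skip:  # same role as the WHERE clause
            yield name, int(value)


def purchase_rates(rows):
    """Return {product: (purchased, not_purchased, rate_percent)}."""
    counts = defaultdict(lambda: [0, 0])  # product -> [purchased, total]
    for row in rows:
        for product, flag in unpivot_via_json(row):
            counts[product][0] += flag
            counts[product][1] += 1
    return {
        product: (bought, total - bought, round(100 * bought / total, 2))
        for product, (bought, total) in sorted(counts.items())
    }


result = purchase_rates(sample_rows)
for product, stats in result.items():
    print(product, *stats)
```

Because the column names are read from the serialized row rather than listed by hand, adding a product_4 column to the input would need no change to the code, which is the point of the dynamic approach.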