BigQuery average column by condition

Question

Suppose I have the following simplified data in BigQuery:

WITH sales_log AS
 (
  SELECT 'John' as employee, 'ABC' client, 1234.56 sales, "phone" sale_type UNION ALL
  SELECT 'John' as employee, 'ABC' client, 9857.56 sales, "online" sale_type UNION ALL
  SELECT 'John' as employee, 'XYZ' client, 5678.56 sales, "phone" sale_type UNION ALL
  SELECT 'John' as employee, 'XYZ' client, 64875.25 sales, "online" sale_type UNION ALL
  SELECT 'Mary' as employee, 'ABC' client, 456.58 sales, "phone" sale_type UNION ALL
  SELECT 'Mary' as employee, 'ABC' client, 11585.58 sales, "online" sale_type UNION ALL
  SELECT 'Mary' as employee, 'XYZ' client, 4578.52 sales, "phone" sale_type UNION ALL
  SELECT 'Mary' as employee, 'XYZ' client, 56853.45 sales, "online" sale_type
  )
SELECT employee, AVG(sales) AS avg_sales
FROM sales_log
GROUP BY employee

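I can easily get the average sales per employee.

Is there a simple way to also get the average for each sale type on the same row, so that the output looks like this: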

employee	avg_sales	avg_phone_sales	avg_online_sales
John	20411.4825	3456.56	37366.405
Mary	18368.5325	2517.55	34219.515

Thank you in advance.

Answer 1

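A couple of approaches:

DEMO at the bottom.
If the set of sale types is limited and known, you can use a CASE expression for each column.
If the types are "dynamic", you need a dynamic pivot. (Several examples exist online, but this means dynamic SQL, which is prone to SQL injection.)
Note: this kind of data formatting is normally done in the UI rather than in SQL.
Note: returning NULL from the CASE expression ensures that "blank" values do not affect your averages, because AVG ignores NULLs.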
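
To illustrate the last note, here is a minimal sketch on made-up values (the array literal is purely illustrative and not part of the question's data). It contrasts leaving non-matching rows as NULL with replacing them by 0:

```sql
-- Illustrative values only: two real numbers and one "blank" (NULL).
SELECT
  AVG(x)            AS avg_ignoring_null,  -- averages 10 and 20 only
  AVG(IFNULL(x, 0)) AS avg_with_zero       -- averages 10, 0 and 20
FROM UNNEST([10, NULL, 20]) AS x
```

The first column divides by the two non-NULL values and the second divides by all three, which is why an ELSE 0 in the CASE expression would pull each per-type average down.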

WITH sales_log AS
 (
  SELECT 'John' as employee, 'ABC' client, 1234.56 sales, 'phone' sale_type UNION ALL
  SELECT 'John' as employee, 'ABC' client, 9857.56 sales, 'online' sale_type UNION ALL
  SELECT 'John' as employee, 'XYZ' client, 5678.56 sales, 'phone' sale_type UNION ALL
  SELECT 'John' as employee, 'XYZ' client, 64875.25 sales, 'online' sale_type UNION ALL
  SELECT 'Mary' as employee, 'ABC' client, 456.58 sales, 'phone' sale_type UNION ALL
  SELECT 'Mary' as employee, 'ABC' client, 11585.58 sales, 'online' sale_type UNION ALL
  SELECT 'Mary' as employee, 'XYZ' client, 4578.52 sales, 'phone' sale_type UNION ALL
  SELECT 'Mary' as employee, 'XYZ' client, 56853.45 sales, 'online' sale_type
  )
SELECT employee, AVG(sales) AS avg_sales, 
       AVG(case when sale_type = 'phone' then sales else NULL end) as AVG_PhoneSales, 
       AVG(case when sale_type = 'online' then sales else NULL end) as AVG_OnLineSales
FROM sales_log
GROUP BY employee

Giving us:

+----------+--------------------+-----------------------+--------------------+
| employee |     avg_sales      |  AVG_phonesales       | AVG_onlinesales    |
+----------+--------------------+-----------------------+--------------------+
| Mary     | 18368.532500000000 | 2517.5500000000000000 | 34219.515000000000 |
| John     | 20411.482500000000 | 3456.5600000000000000 | 37366.405000000000 |
+----------+--------------------+-----------------------+--------------------+

DEMO: although the demo is not written in Google BigQuery syntax, the syntax for this use case is the same. Dynamic SQL or a dynamic pivot would differ.
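
For the "dynamic" case, one possible shape in BigQuery scripting is sketched below. This is not the answerer's original example. It assumes the data lives in a table, here given the hypothetical name my_dataset.sales_log, and that every sale_type value is non-NULL and would make a valid column-name suffix.

```sql
-- Hypothetical table name: my_dataset.sales_log. The CTE from the question is
-- not visible inside EXECUTE IMMEDIATE, so a real table or view is assumed.
DECLARE pivot_values STRING;

-- Build the IN list from the data. FORMAT('%T', ...) renders each value as a
-- quoted string literal instead of splicing in the raw text.
SET pivot_values = (
  SELECT STRING_AGG(DISTINCT FORMAT('%T', sale_type), ', ')
  FROM my_dataset.sales_log
);

-- Assemble and run the pivot; one avg_<sale_type> column per distinct value.
EXECUTE IMMEDIATE FORMAT("""
  SELECT *
  FROM (SELECT employee, sale_type, sales FROM my_dataset.sales_log)
  PIVOT (AVG(sales) AS avg FOR sale_type IN (%s))
""", pivot_values);
```

This sketch returns only the per-type averages. The overall avg_sales column would still have to be added, for example by joining to the plain GROUP BY query.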

Answer 2

Consider the following approach:

select * from (
  select employee, sale_type, avg(sales) as avg_sales
  from sales_log
  group by rollup (employee, sale_type)
  having not employee is null
)
pivot (min(avg_sales) avg for ifnull(sale_type || '_sales', 'sales') in ('sales', 'phone_sales', 'online_sales'))

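Applied to the sample data in the question, this yields one row per employee with the columns avg_sales, avg_phone_sales and avg_online_sales.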
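
To see what the PIVOT operates on, it may help to run the inner ROLLUP query on its own. The sketch below simply reuses the question's sample data and adds explanatory comments:

```sql
WITH sales_log AS
 (
  SELECT 'John' AS employee, 'ABC' client, 1234.56 sales, 'phone' sale_type UNION ALL
  SELECT 'John', 'ABC', 9857.56, 'online' UNION ALL
  SELECT 'John', 'XYZ', 5678.56, 'phone' UNION ALL
  SELECT 'John', 'XYZ', 64875.25, 'online' UNION ALL
  SELECT 'Mary', 'ABC', 456.58, 'phone' UNION ALL
  SELECT 'Mary', 'ABC', 11585.58, 'online' UNION ALL
  SELECT 'Mary', 'XYZ', 4578.52, 'phone' UNION ALL
  SELECT 'Mary', 'XYZ', 56853.45, 'online'
 )
-- ROLLUP (employee, sale_type) produces three grouping levels:
--   (employee, sale_type)  per-type average for each employee
--   (employee, NULL)       overall average for each employee
--   (NULL, NULL)           grand total, dropped by the HAVING clause
SELECT employee, sale_type, AVG(sales) AS avg_sales
FROM sales_log
GROUP BY ROLLUP (employee, sale_type)
HAVING employee IS NOT NULL
ORDER BY employee, sale_type
```

The rows where sale_type is NULL carry each employee's overall average. The ifnull(sale_type || '_sales', 'sales') expression in the PIVOT maps those rows to the label 'sales', so they become the avg_sales column.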

sql

google-bigquery