SQL query using GROUP BY with standard deviation?

I'm having trouble with the standard deviation function (specifically stddev_samp in MonetDB). I tried the following queries without success:

    select industry, avg(marketcap) as industryavg, stddev_samp(marketcap) as industrysd from cumulativeview group by industry
    select  stddev_samp(marketcap) as industrysd from cumulativeview group by industry

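Each of them gives me a very strange exception. The stddev function does not seem to work with GROUP BY subsets, whereas the avg function on its own works fine with grouping, as in this query: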

    select industry, avg(marketcap) as industryavg  from cumulativeview group by industry

The standard deviation function also works fine when I use a WHERE clause instead of GROUP BY:

    select  stddev_samp(marketcap) as industrysd from cumulativeview where industry='Diversified Investments'

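Is there another way to write a query that gives me both the average and the standard deviation for each industry, without having to write a separate query per industry? I'm confused as to why avg works with GROUP BY and stddev doesn't...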

I just tested this with the October 2014 release of MonetDB. From your queries, I inferred the following table structure:

    CREATE TABLE cumulativeview (industry string, company string, marketcap double);

Some sample data:

    INSERT INTO cumulativeview VALUES
    ('Automotive', 'Daimler', 84784.62),
    ('Automotive', 'BMW', 66852.15),
    ('Automotive', 'VW', 95378.54),
    ('Chemical', 'BASF', 70438.13),
    ('Chemical', 'Bayer', 105766.62);

And your query:

    SELECT industry, avg(marketcap) AS industryavg, stddev_samp(marketcap) AS industrysd FROM cumulativeview GROUP BY industry;

Result:

    +------------+--------------------------+--------------------------+
    | industry   | industryavg              | industrysd               |
    +============+==========================+==========================+
    | Automotive |       82338.436666666661 |       14419.659887918069 |
    | Chemical   |                88102.375 |       24981.014848081126 |
    +------------+--------------------------+--------------------------+

So, as Anthony suggested, the bug appears to have been fixed.
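If you are stuck on an older release where the grouped stddev_samp still fails, one possible workaround is to compute the sample standard deviation yourself from aggregates that do work with GROUP BY (sum, count, avg). The following is only a sketch under that assumption, using the table and column names from the question. I have not run it against the affected MonetDB version.

```sql
-- Sample standard deviation per industry without stddev_samp:
--   sqrt( (sum(x^2) - sum(x)^2 / n) / (n - 1) )
-- count(marketcap) skips NULLs, matching what sum() and avg() do.
-- NULLIF turns n - 1 = 0 into NULL, so a single-row industry
-- yields NULL instead of a division-by-zero error.
SELECT industry,
       avg(marketcap) AS industryavg,
       sqrt(
           (sum(marketcap * marketcap)
            - sum(marketcap) * sum(marketcap) / count(marketcap))
           / NULLIF(count(marketcap) - 1, 0)
       ) AS industrysd
FROM cumulativeview
GROUP BY industry;
```

Be aware that this sum-of-squares form is numerically less stable than a built-in implementation: when the values are large compared to their spread, the subtraction can lose precision, so small differences from stddev_samp are to be expected.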