MySQL: What happens to non-aggregated fields upon a GROUP BY?

I have a very basic question about the following behavior in MySQL.

Suppose we run the following GROUP BY:

SELECT a, b, SUM(c)
FROM table
GROUP BY b;

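What happens to field a, which is neither aggregated nor included in the GROUP BY fields?

Does MySQL just implicitly apply FIRST(a) to a? If so, is that behavior consistent, or does it pick a random value from among all the values of a?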

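It is the first value the query processor gets back from the storage medium, which depends on the query strategy it chose. Technically the result is undefined. If your table has no indexes other than its key, it is fairly deterministic in practice, but you are still at the mercy of the optimizer.

Rely on this at your own risk. This is why window functions were invented.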
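As a rough illustration of the window-function route, here is a minimal sketch. It assumes MySQL 8.0 or later (where window functions are available) and a hypothetical table t with a unique id column alongside a, b, and c; none of these names come from the question. It makes "first a per group" explicit by ordering on id instead of leaving the choice to the optimizer.

```sql
-- Assumption: hypothetical table t(id, a, b, c), with id unique.
-- For each b, return the a from the row with the smallest id,
-- together with the per-group sum of c.
SELECT a, b, total
FROM (
    SELECT a,
           b,
           SUM(c) OVER (PARTITION BY b) AS total,
           ROW_NUMBER() OVER (PARTITION BY b ORDER BY id) AS rn
    FROM t
) AS ranked
WHERE rn = 1;
```

Because id is assumed unique, the ordering inside each partition has no ties, so the same row is picked on every run.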

MySQL >= 5.7.5

Under the default SQL mode, this is an illegal query.

You will get an error like this:

ERROR 1055 (42000): Expression #1 of SELECT list is not in 
GROUP BY clause and contains nonaggregated column 'a' 
which is not functionally dependent on columns in 
GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them.

For more details, see https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
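A few ways to write the query so that it is accepted under ONLY_FULL_GROUP_BY, sketched against a hypothetical table t(id, a, b, c) where id is the primary key (the table and column layout are assumptions for illustration only). ANY_VALUE() is available from MySQL 5.7.5.

```sql
-- Check whether ONLY_FULL_GROUP_BY is part of the current mode.
SELECT @@sql_mode;

-- Option 1: say explicitly that any value of a per group is acceptable.
-- The chosen value is still nondeterministic; this only silences the error.
SELECT ANY_VALUE(a) AS a, b, SUM(c) AS total
FROM t
GROUP BY b;

-- Option 2: pick a well-defined value with an aggregate.
SELECT MIN(a) AS a, b, SUM(c) AS total
FROM t
GROUP BY b;

-- Option 3: group by the primary key. Column a is then functionally
-- dependent on the GROUP BY column, so no aggregate is needed for it.
SELECT id, a, SUM(c) AS total
FROM t
GROUP BY id;
```

Only option 2 changes which value you get. Option 1 keeps the old "any value" behavior, and option 3 works only because each group contains exactly one row of t.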


MySQL < 5.7.5

Short answer: this is a valid query, but the server may return an arbitrary value.

The documentation at https://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html says:

12.16.3 MySQL Handling of GROUP BY

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:

SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;

For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values within each group the server chooses.

So there is no way to know which value you will get (if the group contains more than one distinct value).
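Since ORDER BY cannot influence the choice, one way to get a predictable value on these older versions is to select the row you want explicitly. The following is only a sketch, again assuming a hypothetical table t(id, a, b, c) with a unique id column: it aggregates per group in a derived table and then joins back to fetch a from the row with the smallest id.

```sql
-- Assumption: hypothetical table t(id, a, b, c), with id unique.
-- The derived table g computes the sum and the smallest id per group;
-- the join then reads a from exactly that row.
SELECT t.a, g.b, g.total
FROM (
    SELECT b,
           SUM(c)  AS total,
           MIN(id) AS first_id
    FROM t
    GROUP BY b
) AS g
JOIN t ON t.id = g.first_id;
```

Every selected column in the inner query is either aggregated or named in the GROUP BY, so the statement is also legal standard SQL and does not depend on the MySQL extension.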