Why does this GROUP BY statement need 2 columns?

Question

I have 3 tables:

Customers:

cust_id	cust_name
1000000001	Village Toys
1000000002	Kids Place
1000000003	Fun4All
1000000004	Fun4All
1000000005	The Toy Store

Orders:

cust_id	order_num
1000000001	20005
1000000003	20006
1000000005	20008
1000000001	20009

OrderItems:

quantity	item_price	order_num
100	5.49	20005
100	10.99	20005
20	5.99	20006
10	8.99	20006
10	11.99	20006
5	4.99	20008
5	11.99	20008
10	3.49	20008
10	3.49	20008
10	3.49	20008
250	2.49	20009
250	2.49	20009
250	2.49	20009

And the following code:

SELECT cust_name, Orders.order_num, SUM(item_price * quantity) AS OrderTotal
FROM Customers
INNER JOIN Orders ON Customers.cust_id = Orders.cust_id
INNER JOIN OrderItems ON Orders.order_num = OrderItems.order_num
GROUP BY cust_name, Orders.order_num
ORDER BY cust_name, order_num;

The result is:

cust_name	order_num	OrderTotal
Fun4All	20006	329.60
The Toy Store	20008	189.60
Village Toys	20005	1648.00
Village Toys	20009	1867.50

If I remove cust_name from the GROUP BY statement, I get an error:

Column 'Customers.cust_name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

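As I said, I can't understand why GROUP BY is used with 2 columns. I could understand using only GROUP BY Orders.order_num, since that groups OrderTotal by order_num. I also don't know in which order GROUP BY processes the columns. Is cust_name processed first, or Orders.order_num?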

Answer 1

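When grouping, the server needs to know which function to apply to the remaining fields of the aggregate query. If you group only by order_num, it needs to know which function to use to summarize the cust_name field. A simple way is to apply the min() function to cust_name in the select list (like this: SELECT min(cust_name) ...). This just means: among the records in each order_num group, take the smallest cust_name (in alphabetical order).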
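As a rough sketch of that approach, the query from the question could be rewritten to group by order_num alone. It assumes the Customers, Orders, and OrderItems tables shown in the question, and it relies on every order belonging to exactly one customer, so MIN(cust_name) simply returns that one name:

```sql
-- Group by order number only. cust_name is wrapped in an aggregate
-- so that SQL Server accepts it in the select list.
SELECT MIN(cust_name) AS cust_name,
       Orders.order_num,
       SUM(item_price * quantity) AS OrderTotal
FROM Customers
INNER JOIN Orders ON Customers.cust_id = Orders.cust_id
INNER JOIN OrderItems ON Orders.order_num = OrderItems.order_num
GROUP BY Orders.order_num
ORDER BY Orders.order_num;
```

With the sample data this should return the same four order totals as the original query, sorted by order number instead of by customer name.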

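From the table structure (the data given above) you can see that the same order_num is never issued to more than one customer (which should be normal practice). So you are right: if you group by order_num, you do not need to group by customer (cust_name) as well.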

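Conceptually, the group-by keys are processed from left to right, and the results are grouped hierarchically in that order. The resulting set of groups does not depend on the key order, though, and the order of the output rows is only guaranteed by ORDER BY.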
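The effect of the key order can be tried out directly on the sample data. The sketch below assumes the tables from the question and only swaps the two GROUP BY keys. It forms the same four groups as the original query, and the explicit ORDER BY, not the GROUP BY key order, fixes the order of the output rows:

```sql
-- Same query as in the question, with the GROUP BY keys swapped.
SELECT cust_name,
       Orders.order_num,
       SUM(item_price * quantity) AS OrderTotal
FROM Customers
INNER JOIN Orders ON Customers.cust_id = Orders.cust_id
INNER JOIN OrderItems ON Orders.order_num = OrderItems.order_num
GROUP BY Orders.order_num, cust_name
ORDER BY cust_name, Orders.order_num;
```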

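By the way: if an indexed field (an id field) is available, group by the indexed field rather than by a string field (cust_id vs. cust_name). Processing indexed columns is generally faster, especially when aggregating results.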
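A sketch of grouping on the id column, assuming the tables from the question and that cust_id is the key of Customers (the question does not show the constraints or indexes). In the sample data cust_name is not unique (Fun4All appears under both 1000000003 and 1000000004), so cust_id is also the safer way to tell customers apart:

```sql
-- Group by the numeric key cust_id instead of the string cust_name.
-- cust_name is still returned, wrapped in MIN() because it is not a group key.
SELECT Customers.cust_id,
       MIN(cust_name) AS cust_name,
       Orders.order_num,
       SUM(item_price * quantity) AS OrderTotal
FROM Customers
INNER JOIN Orders ON Customers.cust_id = Orders.cust_id
INNER JOIN OrderItems ON Orders.order_num = OrderItems.order_num
GROUP BY Customers.cust_id, Orders.order_num
ORDER BY Customers.cust_id, Orders.order_num;
```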

Answer 2

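Your question makes a lot of sense. If you group by order alone, and each order has exactly one customer, why can't you select the customer name that belongs to the order?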

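The answer is: you should be able to. (Provided that the unique constraints and foreign key references are set up properly in your database.) The customer name in this example is functionally dependent on the order number, so according to the SQL standard this should not be a problem.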

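You get this error message only because SQL Server does not implement this part of the SQL standard. It only allows columns from the GROUP BY clause and aggregates (SUM(...), COUNT(...), etc.) in the select list.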

So either keep the customer in the GROUP BY clause, or select MIN(cust_name) or MAX(cust_name) instead of plain cust_name to keep the DBMS happy.


Tags: sql, sql-server, group-by, inner-join
