How to deal with SQL aggregate functions when grouping deeply hierarchical data

Given the following scenario:

I have a database with 5 tables:

  1. currency (iso_number, iso_code),
  2. product (id, name, current_price),
  3. sale (id, time_of_sale, currency_items_sold_in),
  4. sale_lines (id, sale_id, product_id, price_paid, quantity),
  5. cash_transactions (id, sale_id, received_currency_id, converted_currency_id, received_amount, converted_amount)

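This setup makes it possible to store which currency the customer originally paid in, which currency it was converted to internally, and both the original and the converted amount.

I want to be able to find all sales that match certain criteria (time period, seller, store, and so on; omitted here for simplicity).

For all of these sales I join the related data, namely sale_lines and cash_transactions. The currency on sale_lines always matches the currency on the related sale. For cash_transactions, however, received_amount/received_currency can differ from the sale currency. Although converted_currency/converted_amount is stored on the cash_transaction row, it should follow the sale.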

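The problem arises when I try to SUM some of the fields. Once you join several one-to-many relations and then apply an aggregate function such as SUM, the server still sums the duplicated rows that the joins produce (the same rows you would see if you ran the query without GROUP BY), even if you specify the correct GROUP BY.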
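To make the duplication concrete, here is a minimal sketch of it. It assumes Python's built-in sqlite3 module and a cut-down copy of the first sale from the fiddle (only the columns needed), so it illustrates the effect rather than reproducing the PostgreSQL fiddle exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (id INTEGER PRIMARY KEY, currency_items_sold_in TEXT);
CREATE TABLE sale_lines (id INTEGER PRIMARY KEY, sale_id INTEGER, price_paid INTEGER);
CREATE TABLE cash_transactions (id INTEGER PRIMARY KEY, sale_id INTEGER, converted_amount INTEGER);
INSERT INTO sale VALUES (1, '208');
INSERT INTO sale_lines VALUES (1, 1, 200), (2, 1, 300);
INSERT INTO cash_transactions VALUES (1, 1, 200), (2, 1, 300);
""")

# Joining both one-to-many tables yields 2 x 2 = 4 rows for the single sale.
joined_rows = conn.execute("""
    SELECT COUNT(*) FROM sale s
    LEFT JOIN sale_lines sl ON sl.sale_id = s.id
    LEFT JOIN cash_transactions ct ON ct.sale_id = s.id
""").fetchone()[0]

# SUM runs over those 4 rows, so every sale line is counted twice.
inflated_sum = conn.execute("""
    SELECT SUM(sl.price_paid) FROM sale s
    LEFT JOIN sale_lines sl ON sl.sale_id = s.id
    LEFT JOIN cash_transactions ct ON ct.sale_id = s.id
    GROUP BY s.currency_items_sold_in
""").fetchone()[0]

# The total taken from sale_lines alone, without the second join.
true_sum = conn.execute(
    "SELECT SUM(price_paid) FROM sale_lines WHERE sale_id = 1"
).fetchone()[0]

print(joined_rows, inflated_sum, true_sum)  # prints: 4 1000 500
```

The GROUP BY is correct here, yet the sum is doubled because each sale line is paired with both cash transactions before the aggregate is applied.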

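The problem is also described here: https://wikido.isoftdata.com/index.php/The_GROUPing_pitfall

Following the solution in the article above, in my case I should LEFT JOIN the aggregated result for each sale into the outer query.

But what do I do when the sale_lines currency matches the sale, while the cash_transactions currency may differ from the sale?

I have created the following SQL Fiddle, which inserts some test data and highlights the problem: http://sqlfiddle.com/#!17/54a7b/15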

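In the fiddle I have created 2 sales, with items sold in DKK (208) and SEK (752) respectively. The first sale has 2 sale lines and 2 cash transactions: the first transaction is direct DKK => DKK, and the second is SEK => DKK.

The second sale also has 2 sale lines and 2 cash transactions: the first transaction is NOK => DKK, and the second is direct DKK => DKK.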

In the last query of the fiddle, you can see that total_received_amount is bogus: it is a mix of DKK, SEK, and NOK, so it provides little value.
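As a small illustration of why that total is meaningless, the sketch below groups the received amounts from the fiddle's test data in two ways. It assumes Python's built-in sqlite3 module and only the columns needed for this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (id INTEGER PRIMARY KEY, currency_items_sold_in TEXT);
CREATE TABLE cash_transactions (
  id INTEGER PRIMARY KEY, sale_id INTEGER,
  received_currency_id TEXT, received_amount INTEGER);
INSERT INTO sale VALUES (1, '208'), (2, '752');
INSERT INTO cash_transactions VALUES
  (1, 1, '208', 200), (2, 1, '752', 400),
  (3, 2, '572', 150), (4, 2, '208', 100);
""")

# Grouped by the sale's currency: amounts in different currencies are added together.
by_sale_currency = dict(conn.execute("""
    SELECT s.currency_items_sold_in, SUM(ct.received_amount)
    FROM sale s
    JOIN cash_transactions ct ON ct.sale_id = s.id
    GROUP BY s.currency_items_sold_in
""").fetchall())

# Grouped by the currency actually received: each total is in a single currency.
by_received_currency = dict(conn.execute("""
    SELECT received_currency_id, SUM(received_amount)
    FROM cash_transactions
    GROUP BY received_currency_id
""").fetchall())

print(by_sale_currency)      # {'208': 600, '752': 250}
print(by_received_currency)  # {'208': 300, '572': 150, '752': 400}
```

The 600 reported under 208 is really 200 DKK plus 400 SEK, which is the mixing I am trying to avoid.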

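I would like suggestions on how to fetch the data correctly. I don't mind having to do extra "logic" on the server side (PHP) to remove some duplicated data, as long as the sums are correct.

Any suggestions are much appreciated.

DDL from the fiddle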

CREATE TABLE currency (
  iso_number CHARACTER VARYING(3) PRIMARY KEY,
  iso_code CHARACTER VARYING(3)
);

INSERT INTO currency(iso_number, iso_code) VALUES ('208','DKK'), ('752','SEK'), ('572','NOK');

CREATE TABLE product (
  id SERIAL PRIMARY KEY,
  name CHARACTER VARYING(12),
  current_price INTEGER
);

INSERT INTO product(id,name,current_price) VALUES (1,'icecream',200), (2,'sunglasses',300);

CREATE TABLE sale (
  id SERIAL PRIMARY KEY,
  time_of_sale TIMESTAMP,
  currency_items_sold_in CHARACTER VARYING(3)
);

INSERT INTO sale(id, time_of_sale, currency_items_sold_in) 
VALUES 
(1, CURRENT_TIMESTAMP, '208'),
(2, CURRENT_TIMESTAMP, '752')
;

CREATE TABLE sale_lines (
  id SERIAL PRIMARY KEY,
  sale_id INTEGER,
  product_id INTEGER,
  price_paid INTEGER,
  quantity FLOAT
);

INSERT INTO sale_lines(id, sale_id, product_id, price_paid, quantity)
VALUES 
(1, 1, 1, 200, 1.0),
(2, 1, 2, 300, 1.0),

(3, 2, 1, 100, 1.0),
(4, 2, 1, 100, 1.0)
;
        


CREATE TABLE cash_transactions (
  id SERIAL PRIMARY KEY,
  sale_id INTEGER,
  received_currency_id CHARACTER VARYING(3),
  converted_currency_id CHARACTER VARYING(3),
  received_amount INTEGER,
  converted_amount INTEGER
);


INSERT INTO cash_transactions(id, sale_id, received_currency_id, converted_currency_id, received_amount, converted_amount)
VALUES
(1, 1, '208', '208', 200, 200),
(2, 1, '752', '208', 400, 300),

(3, 2, '572', '208', 150, 100),
(4, 2, '208', '208', 100, 100)
;

Queries from the fiddle

--SELECT * FROM currency;
--SELECT * FROM product;
--SELECT * FROM sale;
--SELECT * FROM sale_lines;
--SELECT * FROM cash_transactions;


--- Showing the sales with duplicated lines to 
--- fit joined data for OneToMany SaleLines, and OneToMany cash transactions.
SELECT *
FROM sale s
LEFT JOIN sale_lines sl ON sl.sale_id = s.id
LEFT JOIN cash_transactions ct ON ct.sale_id = s.id;



--- Grouping the data by important identifier "currency_items_sold_in".
--- The SUM of sl.price_paid is wrong as it SUMS the duplicated lines as well.
SELECT 
  s.currency_items_sold_in, 
  SUM(sl.price_paid) as "price_paid"
FROM sale s
LEFT JOIN sale_lines sl ON sl.sale_id = s.id
LEFT JOIN cash_transactions ct ON ct.sale_id = s.id
GROUP BY s.currency_items_sold_in;

--- To solve this the SUM can be joined via the "Monkey-Poop" method.
--- Here the problem arises, the SUMS for cash_transaction.received_amount and cash_transaction.converted_amount cannot be relied upon
--- As those fields themselves rely on cash_transaction.received_currency_id and cash_transaction.converted_currency_id
SELECT 
  s.currency_items_sold_in, 
  SUM(sale_line_aggregates.price_paid) as "total_price_paid",
  SUM(cash_transaction_aggregates.converted_amount) as "total_converted_amount",
  SUM(cash_transaction_aggregates.received_amount) as "total_received_amount"
FROM sale s
LEFT JOIN (
  SELECT 
    sale_id,
    SUM(price_paid) AS price_paid
  FROM sale_lines
  GROUP BY sale_id
) AS sale_line_aggregates ON sale_line_aggregates.sale_id = s.id

LEFT JOIN (
  SELECT
    sale_id,
    SUM(converted_amount) as converted_amount,
    SUM(received_amount) as received_amount
  FROM cash_transactions
  GROUP BY sale_id
) AS cash_transaction_aggregates ON cash_transaction_aggregates.sale_id = s.id
GROUP BY s.currency_items_sold_in;

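You can calculate each of the amounts in sub-queries grouped by currency, and then join them on the currency.

By using a CTE, you make sure that each sub-query uses the same set of sales.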

WITH CTE_SALE AS (
  SELECT
   id as sale_id, 
   currency_items_sold_in AS iso_number
  FROM sale
)
SELECT curr.iso_code AS currency
, COALESCE(line.price_paid, 0)  as total_price_paid
, COALESCE(received.amount, 0)  as total_received_amount
, COALESCE(converted.amount, 0) as total_converted_amount
FROM currency AS curr
LEFT JOIN (
  SELECT s.iso_number
  , SUM(sl.price_paid) AS price_paid
  FROM sale_lines sl
  JOIN CTE_SALE s ON s.sale_id = sl.sale_id
  GROUP BY s.iso_number
) AS line 
  ON line.iso_number = curr.iso_number
LEFT JOIN (
  SELECT tr.received_currency_id as iso_number
  , SUM(tr.received_amount) AS amount
  FROM cash_transactions tr
  JOIN CTE_SALE s ON s.sale_id = tr.sale_id
  GROUP BY tr.received_currency_id
) AS received
  ON received.iso_number = curr.iso_number
LEFT JOIN (
  SELECT tr.converted_currency_id as iso_number
  , SUM(tr.converted_amount) AS amount
  FROM cash_transactions AS tr
  JOIN CTE_SALE s ON s.sale_id = tr.sale_id
  GROUP BY tr.converted_currency_id
) AS converted
  ON converted.iso_number = curr.iso_number;

currency | total_price_paid | total_received_amount | total_converted_amount
:------- | ---------------: | --------------------: | ---------------------:
DKK      |              500 |                   300 |                    700
SEK      |              200 |                   400 |                      0
NOK      |                0 |                   150 |                      0

db<>fiddle here
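Because every sub-query reads from the CTE, the sale criteria from the question (time period, seller, store) only need to be written once, inside the CTE. The following is a rough sketch of that idea, not a drop-in solution: it assumes Python's built-in sqlite3 module instead of PostgreSQL, a time-period filter only, and made-up timestamps for the two sales (the fiddle uses CURRENT_TIMESTAMP). The function name `totals_per_currency` and the `:start`/`:end` parameters are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE currency (iso_number TEXT PRIMARY KEY, iso_code TEXT);
CREATE TABLE sale (id INTEGER PRIMARY KEY, time_of_sale TEXT, currency_items_sold_in TEXT);
CREATE TABLE sale_lines (id INTEGER PRIMARY KEY, sale_id INTEGER, price_paid INTEGER);
CREATE TABLE cash_transactions (
  id INTEGER PRIMARY KEY, sale_id INTEGER,
  received_currency_id TEXT, converted_currency_id TEXT,
  received_amount INTEGER, converted_amount INTEGER);
INSERT INTO currency VALUES ('208','DKK'), ('752','SEK'), ('572','NOK');
-- Timestamps are hypothetical (the fiddle uses CURRENT_TIMESTAMP).
INSERT INTO sale VALUES (1, '2024-01-15 10:00:00', '208'), (2, '2024-02-15 10:00:00', '752');
INSERT INTO sale_lines VALUES (1,1,200), (2,1,300), (3,2,100), (4,2,100);
INSERT INTO cash_transactions VALUES
  (1,1,'208','208',200,200), (2,1,'752','208',400,300),
  (3,2,'572','208',150,100), (4,2,'208','208',100,100);
""")

# The only place the sale criteria appear is the WHERE clause of the CTE.
QUERY = """
WITH cte_sale AS (
  SELECT id AS sale_id, currency_items_sold_in AS iso_number
  FROM sale
  WHERE time_of_sale >= :start AND time_of_sale < :end
)
SELECT curr.iso_code,
       COALESCE(line.price_paid, 0),
       COALESCE(received.amount, 0),
       COALESCE(converted.amount, 0)
FROM currency AS curr
LEFT JOIN (
  SELECT s.iso_number, SUM(sl.price_paid) AS price_paid
  FROM sale_lines sl JOIN cte_sale s ON s.sale_id = sl.sale_id
  GROUP BY s.iso_number
) AS line ON line.iso_number = curr.iso_number
LEFT JOIN (
  SELECT tr.received_currency_id AS iso_number, SUM(tr.received_amount) AS amount
  FROM cash_transactions tr JOIN cte_sale s ON s.sale_id = tr.sale_id
  GROUP BY tr.received_currency_id
) AS received ON received.iso_number = curr.iso_number
LEFT JOIN (
  SELECT tr.converted_currency_id AS iso_number, SUM(tr.converted_amount) AS amount
  FROM cash_transactions tr JOIN cte_sale s ON s.sale_id = tr.sale_id
  GROUP BY tr.converted_currency_id
) AS converted ON converted.iso_number = curr.iso_number
ORDER BY curr.iso_code
"""

def totals_per_currency(start, end):
    """Return (currency, price_paid, received, converted) rows for sales in [start, end)."""
    return conn.execute(QUERY, {"start": start, "end": end}).fetchall()

january = totals_per_currency("2024-01-01", "2024-02-01")
everything = totals_per_currency("2024-01-01", "2025-01-01")
print(january)
print(everything)
```

With the wide date range both sales are included and the rows carry the same totals as the result table above; with the January range only the first sale contributes, and all three sub-queries agree on that without the filter being repeated.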