Under the impression that these two SQL queries would give the same output, yet they have wildly different results

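I'm using pandasql. The first query transforms the values as expected, but the second returns values that shouldn't even exist. I expected both to return the same values. As far as I can tell, the only difference is that in the first query the grouping/sum happens inside the subqueries, while in the second it happens outside them. What am I missing? Thanks for your help! (Output is at the bottom.)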

First query (correct)

SELECT a.'Name', a.Q1, b.Q2, (a.Q1 + b.Q2) AS Total
FROM
    (SELECT c.'Name', SUM(c.'Paid Amount') AS Q1
     FROM some_data AS c
     WHERE c.'Quarter' = 'Q1'
     GROUP BY c.'Name') AS a
JOIN
    (SELECT d.'Name', SUM(d.'Paid Amount') AS Q2
     FROM some_data AS d
     WHERE d.'Quarter' = 'Q2'
     GROUP BY d.'Name') AS b
ON a.'Name' = b.'Name'
ORDER BY Total DESC
LIMIT 5;

Second query (incorrect)

SELECT a.'Name' as Label, SUM(a.'Paid Amount') AS Q1, SUM(b.'Paid Amount') AS Q2, (SUM(a.'Paid Amount') + SUM(b.'Paid Amount')) as Total
FROM 
    (SELECT c.'Name', c.'Paid Amount'
     FROM some_data AS c
     WHERE c.'Quarter' = 'Q1') AS a
JOIN
    (SELECT c.'Name', c.'Paid Amount'
     FROM some_data AS c
     WHERE c.'Quarter' = 'Q2') AS b
ON a.'Name' = b.'Name'
GROUP BY Label
ORDER BY Total DESC
LIMIT 5;

I put together some random data to demonstrate the problem.

Output of the first query (expected)

Output of the second query (problematic)

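This is what I call wishful coding.

I hope you recognize that aggregating before joining produces the correct answer.

The problem is that a JOIN can both multiply rows and remove rows. In your case, one or both subqueries have multiple rows per name, so the JOIN pairs every Q1 row for a name with every Q2 row for that name and multiplies the row count. SUM() then simply adds up all the values the JOIN produced, duplicates included.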
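The row multiplication can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than pandasql, and the table contents (a single hypothetical customer, Alice, with two Q1 payments and three Q2 payments) are made up for illustration; they are not the asker's data.

```python
import sqlite3

# Hypothetical sample data: one name with 2 rows in Q1 and 3 rows in Q2.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE some_data ("Name" TEXT, "Quarter" TEXT, "Paid Amount" INTEGER)'
)
conn.executemany(
    "INSERT INTO some_data VALUES (?, ?, ?)",
    [
        ("Alice", "Q1", 10),
        ("Alice", "Q1", 20),
        ("Alice", "Q2", 5),
        ("Alice", "Q2", 7),
        ("Alice", "Q2", 9),
    ],
)

# Unaggregated subqueries, joined on Name (the shape of the second query).
raw_join = """
    FROM (SELECT "Name", "Paid Amount" AS paid
          FROM some_data WHERE "Quarter" = 'Q1') AS a
    JOIN (SELECT "Name", "Paid Amount" AS paid
          FROM some_data WHERE "Quarter" = 'Q2') AS b
      ON a."Name" = b."Name"
"""

# 2 Q1 rows x 3 Q2 rows = 6 joined rows for Alice.
joined_rows = conn.execute("SELECT COUNT(*) " + raw_join).fetchone()[0]

# Summing after the join counts each Q1 value 3 times and each Q2 value twice.
join_then_sum = conn.execute(
    'SELECT a."Name", SUM(a.paid), SUM(b.paid) ' + raw_join + ' GROUP BY a."Name"'
).fetchone()

# Aggregating first leaves one row per name on each side, so nothing is duplicated.
sum_then_join = conn.execute("""
    SELECT a."Name", a.Q1, b.Q2
    FROM (SELECT "Name", SUM("Paid Amount") AS Q1
          FROM some_data WHERE "Quarter" = 'Q1' GROUP BY "Name") AS a
    JOIN (SELECT "Name", SUM("Paid Amount") AS Q2
          FROM some_data WHERE "Quarter" = 'Q2' GROUP BY "Name") AS b
      ON a."Name" = b."Name"
""").fetchone()

print(joined_rows)    # 6
print(join_then_sum)  # ('Alice', 90, 42)
print(sum_then_join)  # ('Alice', 30, 21)
```

With this data the Q1 sum is inflated by a factor of 3 (the number of Q2 rows) and the Q2 sum by a factor of 2 (the number of Q1 rows), which is the kind of inflated output the second query shows.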

Note: conditional aggregation is a simpler way to write the query:

SELECT c.Name,
       SUM(CASE WHEN c.Quarter = 'Q1' THEN c."Paid Amount" END) AS Q1,
       SUM(CASE WHEN c.Quarter = 'Q2' THEN c."Paid Amount" END) AS Q2
FROM some_data AS c
WHERE c.Quarter IN ('Q1', 'Q2')
GROUP BY c.Name
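
As a quick check, the conditional-aggregation query can be run against a small table. This is a sketch using Python's built-in sqlite3 module instead of pandasql, with made-up rows (Alice, Bob) that are not the asker's data; the column is assumed to be named "Paid Amount" as in the question.

```python
import sqlite3

# Hypothetical sample data, including a Q3 row that the WHERE clause filters out.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE some_data ("Name" TEXT, "Quarter" TEXT, "Paid Amount" INTEGER)'
)
conn.executemany(
    "INSERT INTO some_data VALUES (?, ?, ?)",
    [
        ("Alice", "Q1", 10),
        ("Alice", "Q1", 20),
        ("Alice", "Q2", 5),
        ("Alice", "Q2", 7),
        ("Alice", "Q2", 9),
        ("Bob", "Q1", 4),
        ("Bob", "Q2", 6),
        ("Bob", "Q3", 100),
    ],
)

# One pass over the table: each row contributes to Q1 or Q2 exactly once,
# so there is no join and nothing can be duplicated.
result = conn.execute("""
    SELECT c."Name",
           SUM(CASE WHEN c."Quarter" = 'Q1' THEN c."Paid Amount" END) AS Q1,
           SUM(CASE WHEN c."Quarter" = 'Q2' THEN c."Paid Amount" END) AS Q2
    FROM some_data AS c
    WHERE c."Quarter" IN ('Q1', 'Q2')
    GROUP BY c."Name"
    ORDER BY c."Name"
""").fetchall()

print(result)  # [('Alice', 30, 21), ('Bob', 4, 6)]
```

One difference from the joined version is worth keeping in mind: a name with payments in only one of the two quarters still appears here, with NULL in the other column, whereas the inner JOIN in the first query drops it.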