GROUP BY function cancels DISTINCT

I have a SQL query that joins several tables and produces duplicates for two of the rows. I use the DISTINCT keyword to eliminate the duplicates:

SELECT DISTINCT
          o.day as day,
          g.id AS id,
          g.name AS name,
          o.num AS num,
          o.version as version
        FROM
          table_one o
          INNER JOIN table_two t ON
            o.ID = t.ID
          INNER JOIN table_three g ON
            t.ID = g.ID
          INNER JOIN table_four gs ON
            g.ID = gs.ID
          INNER JOIN table_five s ON
            gs.ID = s.ID
          INNER JOIN table_six z ON
            s.ID = z.ID
          INNER JOIN table_seven bg ON
            bg.ID = g.ID;

This returns the two rows I want:

1/2/19, 5, first, 25, 1
1/5/19, 7, second, 20, 1

If I remove DISTINCT, both rows are duplicated and I get four rows:

1/2/19, 5, first, 25, 1
1/2/19, 5, first, 25, 1
1/5/19, 7, second, 20, 1
1/5/19, 7, second, 20, 1

My end goal is to use GROUP BY so that I can sum the o.num field and group by the remaining fields. If I add a GROUP BY to the query above like this:

SELECT DISTINCT
          o.day as day,
          g.id AS id,
          g.name AS name,
          SUM(o.num) AS num,
          o.version as version
        FROM
          table_one o
          INNER JOIN table_two t ON
            o.ID = t.ID
          INNER JOIN table_three g ON
            t.ID = g.ID
          INNER JOIN table_four gs ON
            g.ID = gs.ID
          INNER JOIN table_five s ON
            gs.ID = s.ID
          INNER JOIN table_six z ON
            s.ID = z.ID
          INNER JOIN table_seven bg ON
            bg.ID = g.ID
        GROUP BY
          o.day,
          g.id,
          g.name,
          o.version;

I get two rows, but the o.num totals are doubled (essentially the same as running the GROUP BY without DISTINCT):

1/2/19, 5, first, 50, 1
1/5/19, 7, second, 40, 1

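Note: You may wonder why I am trying to use GROUP BY when the first query already gives me the result I want. I have only included the rows that are duplicated. For some reason, none of the other rows show this behavior. Is there a way to make GROUP BY and DISTINCT work together?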

If you want to remove the duplicates and then sum the values, wrap your query in a subquery:

select day, id, name, sum(num) num, version
from (
  -- your query here with DISTINCT clause
) q
group by day, id, name, version

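If you are getting duplicates, there may be a problem with the join conditions. That is hard for me to judge without knowing the dataset.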
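
To see why the subquery helps: DISTINCT is applied to the result rows after GROUP BY has already aggregated, so by then the duplicated join rows have been summed. Here is a minimal sketch that reproduces this with an in-memory SQLite database via Python's sqlite3 module. The schema is a cut-down assumption (only three of the seven tables, with table_seven assumed to be the one that fans out), not the asker's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (id INTEGER, day TEXT, num INTEGER, version INTEGER);
    CREATE TABLE table_three (id INTEGER, name TEXT);
    -- Assumed fan-out table: two matching rows per id duplicate every joined row.
    CREATE TABLE table_seven (id INTEGER);
    INSERT INTO table_one VALUES (5, '1/2/19', 25, 1), (7, '1/5/19', 20, 1);
    INSERT INTO table_three VALUES (5, 'first'), (7, 'second');
    INSERT INTO table_seven VALUES (5), (5), (7), (7);
""")

# Shared FROM clause, simplified from the joins in the question.
JOINED = """
    FROM table_one o
    INNER JOIN table_three g ON o.id = g.id
    INNER JOIN table_seven bg ON bg.id = g.id
"""

# DISTINCT plus GROUP BY in one SELECT: the duplicates are summed first.
naive = conn.execute(f"""
    SELECT DISTINCT o.day, g.id, g.name, SUM(o.num) AS num, o.version
    {JOINED}
    GROUP BY o.day, g.id, g.name, o.version
    ORDER BY g.id
""").fetchall()

# Deduplicate in a subquery, then aggregate the clean rows.
deduped = conn.execute(f"""
    SELECT day, id, name, SUM(num) AS num, version
    FROM (
        SELECT DISTINCT o.day AS day, g.id AS id, g.name AS name,
                        o.num AS num, o.version AS version
        {JOINED}
    ) q
    GROUP BY day, id, name, version
    ORDER BY id
""").fetchall()

print(naive)    # num is doubled: 50 and 40
print(deduped)  # num is correct: 25 and 20
```

With this toy data the first query gives the doubled totals from the question and the second gives the expected ones.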

You can use SUM(DISTINCT o.num):

SELECT o.day as day,
       g.id AS id,
       g.name AS name,
       SUM(DISTINCT o.num) AS num,
       o.version as version
FROM table_one o
INNER JOIN table_two t ON o.ID = t.ID
INNER JOIN table_three g ON t.ID = g.ID
INNER JOIN table_four gs ON g.ID = gs.ID
INNER JOIN table_five s ON gs.ID = s.ID
INNER JOIN table_six z ON s.ID = z.ID
INNER JOIN table_seven bg ON bg.ID = g.ID
GROUP BY o.day,
         g.id,
         g.name,
         o.version;
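
One caveat worth checking against your data: SUM(DISTINCT ...) deduplicates by value, not by row, so two genuinely separate rows in the same group that happen to share the same num are counted only once. The sketch below illustrates this with SQLite through Python's sqlite3 module. The row_id column is a hypothetical unique key that I am assuming for illustration, and the schema is reduced to two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- row_id is a hypothetical unique key, not part of the original schema.
    CREATE TABLE table_one (
        row_id INTEGER PRIMARY KEY, id INTEGER, day TEXT, num INTEGER, version INTEGER
    );
    CREATE TABLE table_seven (id INTEGER);
    -- Two genuinely separate rows that happen to share num = 25.
    INSERT INTO table_one VALUES (1, 5, '1/2/19', 25, 1), (2, 5, '1/2/19', 25, 1);
    -- Fan-out: each table_one row matches twice.
    INSERT INTO table_seven VALUES (5), (5);
""")

# SUM(DISTINCT ...) collapses every equal value in the group into one.
sum_distinct = conn.execute("""
    SELECT SUM(DISTINCT o.num)
    FROM table_one o
    INNER JOIN table_seven bg ON bg.id = o.id
    GROUP BY o.day, o.id, o.version
""").fetchone()[0]

# Deduplicating on the unique key keeps both real rows before summing.
keyed = conn.execute("""
    SELECT SUM(num)
    FROM (
        SELECT DISTINCT o.row_id AS row_id, o.day AS day, o.id AS id,
                        o.num AS num, o.version AS version
        FROM table_one o
        INNER JOIN table_seven bg ON bg.id = o.id
    ) q
    GROUP BY day, id, version
""").fetchone()[0]

print(sum_distinct)  # 25: the second real row is lost
print(keyed)         # 50: both real rows are counted once each
```

If num values can repeat within a group, deduplicating on a unique key in a subquery is probably the safer route. A plain SELECT DISTINCT over only the displayed columns would merge such rows as well.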