Find the sum of values from distinct groups

I have a table like the one shown below.

id  account_id  sha256  size
1   1           abc     120
2   1           abc     120
3   1           bcd     150
4   2           abc     120
5   2           def     80
6   3           fed     100
7   3           fed     100

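I need to find the sum of the size column, but the same sha256 within one account should be counted only once. The rows to be summed should look like this.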

id  account_id  sha256  size
1   1           abc     120
3   1           bcd     150
4   2           abc     120
5   2           def     80
6   3           fed     100

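Rows 2 and 7 have been removed because they repeat a sha256 value within the same account. Row 4 is not removed, even though it has the same sha256, because it belongs to a different account. The sum should be 570.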

I tried the following query, but it gives a syntax error at or near "distinct".

SELECT SUM(f.size) FROM 
(SELECT account_id, DISTINCT sha256, size FROM files GROUP BY account_id, sha256, size) f

Use DISTINCT ON:

SELECT SUM(size) AS total
FROM
(
    SELECT DISTINCT ON (account_id, sha256, size) size
    FROM FILES
    ORDER BY account_id, sha256, size, id
) t;

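The logic above keeps, for each group of (account_id, sha256, size) values, the single record with the lowest id. The sizes of those records are then summed to get the total.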

You can sum over the distinct combinations of account_id, sha256 and size:

SELECT SUM(size) total_size
FROM (SELECT DISTINCT account_id, sha256, size FROM files) f; 

See the demo.
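To check the SELECT DISTINCT approach without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It loads the sample rows from the question into an in-memory table and runs the same query. Using SQLite is an assumption made for convenience, and the names ROWS and total_distinct_size are made up for this example.

```python
import sqlite3

# Sample rows from the question: (id, account_id, sha256, size).
ROWS = [
    (1, 1, "abc", 120),
    (2, 1, "abc", 120),
    (3, 1, "bcd", 150),
    (4, 2, "abc", 120),
    (5, 2, "def", 80),
    (6, 3, "fed", 100),
    (7, 3, "fed", 100),
]


def total_distinct_size(rows):
    """Sum size over the distinct (account_id, sha256, size) combinations."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE files (id INTEGER, account_id INTEGER, sha256 TEXT, size INTEGER)"
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    (total,) = conn.execute(
        "SELECT SUM(size) FROM "
        "(SELECT DISTINCT account_id, sha256, size FROM files) f"
    ).fetchone()
    conn.close()
    return total


print(total_distinct_size(ROWS))  # 570
```

The subquery collapses rows 1/2 and rows 6/7 into one row each, so the outer SUM adds 120 + 150 + 120 + 80 + 100.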

Your data

CREATE TABLE test(
   id         INTEGER  NOT NULL 
  ,account_id INTEGER  NOT NULL
  ,sha256     VARCHAR(40) NOT NULL
  ,size       INTEGER  NOT NULL
);
INSERT INTO test
(id,account_id,sha256,size) VALUES 
(1,1,'abc',120),
(2,1,'abc',120),
(3,1,'bcd',150),
(4,2,'abc',120),
(5,2,'def',80),
(6,3,'fed',100),
(7,3,'fed',100);

Use Row_number to single out the duplicate rows:

SELECT SUM(f.size) AS total
FROM   (SELECT id,
               account_id,
               sha256,
               size,
               Row_number ()
                 OVER (
                   partition BY account_id, sha256, size
                   ORDER BY id ASC ) rn
        FROM   test) f
WHERE  rn = 1  

dbfiddle
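The Row_number query can also be tried locally. The sketch below runs it against an in-memory SQLite database from Python and lists which ids survive the rn = 1 filter. It assumes the bundled SQLite is version 3.25 or newer, since window functions were added in that release, and the variable names kept and total are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, account_id INTEGER, sha256 TEXT, size INTEGER)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        (1, 1, "abc", 120),
        (2, 1, "abc", 120),
        (3, 1, "bcd", 150),
        (4, 2, "abc", 120),
        (5, 2, "def", 80),
        (6, 3, "fed", 100),
        (7, 3, "fed", 100),
    ],
)

# Number the rows inside each (account_id, sha256, size) group by ascending id
# and keep only the first row of every group.
kept = conn.execute(
    """
    SELECT id, size
    FROM (SELECT id,
                 size,
                 ROW_NUMBER() OVER (PARTITION BY account_id, sha256, size
                                    ORDER BY id) AS rn
          FROM test) f
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()
conn.close()

total = sum(size for _id, size in kept)
print([row_id for row_id, _size in kept])  # [1, 3, 4, 5, 6]
print(total)  # 570
```

Rows 2 and 7 get rn = 2 in their groups and are filtered out, which matches the rows the question expects to keep.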

It boils down to:

SELECT sum(size) AS total
FROM  (SELECT DISTINCT ON (account_id, sha256) size FROM files) sub;

About DISTINCT ON:

  • Select first row in each GROUP BY group?

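You did not address the case where size differs within the same (account_id, sha256). I take it that can never happen, for some undisclosed reason. If it can happen, you need to define exactly what to do ...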
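To make that caveat concrete, here is a small sketch, again using Python's sqlite3 module as a stand-in for PostgreSQL. It adds one hypothetical row (id 8, account 1, sha256 'abc', size 130) that is not in the question's data, then compares three possible rules. Keeping the largest or the smallest size per group are only example policies, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER, account_id INTEGER, sha256 TEXT, size INTEGER)"
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?)",
    [
        (1, 1, "abc", 120),
        (2, 1, "abc", 120),
        (3, 1, "bcd", 150),
        (4, 2, "abc", 120),
        (5, 2, "def", 80),
        (6, 3, "fed", 100),
        (7, 3, "fed", 100),
        (8, 1, "abc", 130),  # hypothetical: same account and sha256, different size
    ],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Rule 1: every distinct (account_id, sha256, size) triple counts once,
# so both 120 and 130 are added for account 1 / 'abc'.
by_triple = scalar(
    "SELECT SUM(size) FROM (SELECT DISTINCT account_id, sha256, size FROM files) f"
)

# Rule 2: one row per (account_id, sha256), keeping the largest size.
largest = scalar(
    "SELECT SUM(size) FROM "
    "(SELECT MAX(size) AS size FROM files GROUP BY account_id, sha256) f"
)

# Rule 3: one row per (account_id, sha256), keeping the smallest size.
smallest = scalar(
    "SELECT SUM(size) FROM "
    "(SELECT MIN(size) AS size FROM files GROUP BY account_id, sha256) f"
)

print(by_triple, largest, smallest)  # 700 580 570
```

The three rules give three different totals as soon as sizes disagree. In PostgreSQL, DISTINCT ON (account_id, sha256) without an ORDER BY keeps an unpredictable row per group, so the rule should be spelled out explicitly if such data can occur.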

Here is another one...

WITH
    files AS
        (
            Select 1 "ID",  1 "ACCOUNT_ID", 'abc' "SHA256", 120 "SZ" From Dual UNION ALL
            Select 2 "ID",  1 "ACCOUNT_ID", 'abc' "SHA256", 120 "SZ" From Dual UNION ALL
            Select 3 "ID",  1 "ACCOUNT_ID", 'bcd' "SHA256", 150 "SZ" From Dual UNION ALL
            Select 4 "ID",  2 "ACCOUNT_ID", 'abc' "SHA256", 120 "SZ" From Dual UNION ALL
            Select 5 "ID",  2 "ACCOUNT_ID", 'def' "SHA256", 80 "SZ" From Dual UNION ALL
            Select 6 "ID",  3 "ACCOUNT_ID", 'fed' "SHA256", 100 "SZ" From Dual UNION ALL
            Select 7 "ID",  3 "ACCOUNT_ID", 'fed' "SHA256", 100 "SZ" From Dual 
        )
SELECT DISTINCT
    Min(f.ID) OVER(PARTITION BY f.ACCOUNT_ID, f.SHA256 ORDER BY f.ACCOUNT_ID, f.SHA256) "ID",
    f.ACCOUNT_ID "ACCOUNT_ID", 
    f.SHA256 "SHA256",
    f.SZ "SIZE"      
FROM 
    files f
ORDER BY 
    f.ACCOUNT_ID, 
    f.SHA256
--
--  Result
-- ID   ACCOUNT_ID  SHA256     SIZE
--  1           1   abc         120
--  3           1   bcd         150
--  4           2   abc         120
--  5           2   def          80
--  6           3   fed         100

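Min(f.ID) OVER(.....) is the first ID of each (ACCOUNT_ID, SHA256) group, DISTINCT leaves only the distinct rows, and the remaining columns are the values you asked for. If you sum the sizes you get 570... Regards...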