Find the sum of values from distinct groups
I have a table like the one below.
id | account_id | sha256 | size |
---|---|---|---|
1 | 1 | abc | 120 |
2 | 1 | abc | 120 |
3 | 1 | bcd | 150 |
4 | 2 | abc | 120 |
5 | 2 | def | 80 |
6 | 3 | fed | 100 |
7 | 3 | fed | 100 |
I need the sum of the size column, but the same sha256 within one account should be counted only once. The rows to be summed are:
id | account_id | sha256 | size |
---|---|---|---|
1 | 1 | abc | 120 |
3 | 1 | bcd | 150 |
4 | 2 | abc | 120 |
5 | 2 | def | 80 |
6 | 3 | fed | 100 |
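Rows 2 and 7 are dropped because they repeat a sha256 value within the same account. Row 4 is kept even though it has the same sha256, because it belongs to a different account. The sum should be 570.
I tried the following query, but it gives a syntax error at or near "distinct".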
SELECT SUM(f.size) FROM
(SELECT account_id, DISTINCT sha256, size FROM files GROUP BY account_id, sha256, size) f
Use DISTINCT ON:
SELECT SUM(size) AS total
FROM
(
SELECT DISTINCT ON (account_id, sha256, size) size
FROM FILES
ORDER BY account_id, sha256, size, id
) t;
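The logic above keeps, for each group of (account_id, sha256, size) values, the single record with the lowest id value. It then sums size over the retained records to get the total.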
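To make the "keep the lowest id per group, then sum" step concrete, here is a minimal sketch in plain Python rather than SQL. It only mirrors the logic of the DISTINCT ON query on the question's sample rows; the function name `sum_first_per_group` is made up for illustration.

```python
# Sample rows from the question: (id, account_id, sha256, size)
rows = [
    (1, 1, "abc", 120),
    (2, 1, "abc", 120),
    (3, 1, "bcd", 150),
    (4, 2, "abc", 120),
    (5, 2, "def", 80),
    (6, 3, "fed", 100),
    (7, 3, "fed", 100),
]


def sum_first_per_group(rows):
    """Keep the lowest-id row per (account_id, sha256, size), then sum size."""
    first = {}
    # Tuples sort by id first, so the first row seen for a key has the lowest id.
    for id_, account_id, sha256, size in sorted(rows):
        first.setdefault((account_id, sha256, size), (id_, size))
    kept_ids = sorted(id_ for id_, _ in first.values())
    total = sum(size for _, size in first.values())
    return kept_ids, total


kept_ids, total = sum_first_per_group(rows)
print(kept_ids, total)  # [1, 3, 4, 5, 6] 570
```

The retained ids match the rows the question lists as the ones to be summed.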
You can sum over the distinct combinations of account_id, sha256 and size:
SELECT SUM(size) total_size
FROM (SELECT DISTINCT account_id, sha256, size FROM files) f;
See the demo.
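For a quick local check, the same query can be run against an in-memory SQLite database from Python. This is only a sketch: it assumes SQLite as a stand-in for the database used in the thread, and the column types are guesses based on the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER, account_id INTEGER, sha256 TEXT, size INTEGER)"
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?)",
    [
        (1, 1, "abc", 120),
        (2, 1, "abc", 120),
        (3, 1, "bcd", 150),
        (4, 2, "abc", 120),
        (5, 2, "def", 80),
        (6, 3, "fed", 100),
        (7, 3, "fed", 100),
    ],
)

# The inner SELECT DISTINCT collapses exact duplicates of
# (account_id, sha256, size); the outer query sums what is left.
total_size = conn.execute(
    "SELECT SUM(size) AS total_size "
    "FROM (SELECT DISTINCT account_id, sha256, size FROM files) f"
).fetchone()[0]

print(total_size)  # 570
```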
Your data:
CREATE TABLE test(
id INTEGER NOT NULL
,account_id INTEGER NOT NULL
,sha256 VARCHAR(40) NOT NULL
,size INTEGER NOT NULL
);
INSERT INTO test
(id,account_id,sha256,size) VALUES
(1,1,'abc',120),
(2,1,'abc',120),
(3,1,'bcd',150),
(4,2,'abc',120),
(5,2,'def',80),
(6,3,'fed',100),
(7,3,'fed',100);
Use ROW_NUMBER() to tell the duplicate values apart:
SELECT SUM(f.size) AS total
FROM (SELECT id,
             account_id,
             sha256,
             size,
             ROW_NUMBER() OVER (PARTITION BY account_id, sha256, size
                                ORDER BY id ASC) AS rn
      FROM test) f
WHERE rn = 1;
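The numbering that ROW_NUMBER() assigns is easier to see when the rn column is printed for every row. The sketch below does that with Python's sqlite3 module. It assumes SQLite 3.25 or newer, since older versions have no window functions, and it does the final filtering in Python only so that the intermediate rn values stay visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, account_id INTEGER, sha256 TEXT, size INTEGER)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        (1, 1, "abc", 120),
        (2, 1, "abc", 120),
        (3, 1, "bcd", 150),
        (4, 2, "abc", 120),
        (5, 2, "def", 80),
        (6, 3, "fed", 100),
        (7, 3, "fed", 100),
    ],
)

# Number the rows inside each (account_id, sha256, size) group by ascending id.
ranked = conn.execute(
    """
    SELECT id, size,
           ROW_NUMBER() OVER (PARTITION BY account_id, sha256, size
                              ORDER BY id) AS rn
    FROM test
    ORDER BY id
    """
).fetchall()

for id_, size, rn in ranked:
    print(id_, size, rn)  # ids 2 and 7 get rn = 2, every other row gets rn = 1

# Keeping only rn = 1 leaves one row per group, which is what gets summed.
total = sum(size for _, size, rn in ranked if rn == 1)
print(total)  # 570
```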
This boils down to:
SELECT sum(size) AS total
FROM (SELECT DISTINCT ON (account_id, sha256) size FROM files) sub;
About DISTINCT ON:
- Select first row in each GROUP BY group?
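You did not address the case where size differs for the same (account_id, sha256). I assume that never happens, for some undisclosed reason. If it can happen, you need to define exactly what to do ...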
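To illustrate why that case needs a rule, the sketch below adds one hypothetical row (id 8, not part of the question's data) whose size conflicts with an existing (account_id, sha256) pair, and runs it through SQLite from Python. Taking the largest size per pair is only one possible rule, chosen here as an assumption for the sake of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER, account_id INTEGER, sha256 TEXT, size INTEGER)"
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?)",
    [
        (1, 1, "abc", 120),
        (2, 1, "abc", 120),
        (3, 1, "bcd", 150),
        (4, 2, "abc", 120),
        (5, 2, "def", 80),
        (6, 3, "fed", 100),
        (7, 3, "fed", 100),
        (8, 1, "abc", 130),  # hypothetical conflicting row, not in the question
    ],
)

# A three-column DISTINCT treats (1, 'abc', 120) and (1, 'abc', 130) as
# different groups, so both sizes are counted.
distinct_total = conn.execute(
    "SELECT SUM(size) FROM (SELECT DISTINCT account_id, sha256, size FROM files)"
).fetchone()[0]

# One possible rule (an assumption): count the largest size per (account_id, sha256).
max_total = conn.execute(
    "SELECT SUM(size) FROM "
    "(SELECT MAX(size) AS size FROM files GROUP BY account_id, sha256)"
).fetchone()[0]

print(distinct_total, max_total)  # 700 580
```

The two totals differ, so which one is right depends on the rule chosen for conflicting sizes.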
Here is another one..
WITH
files AS
(
Select 1 "ID", 1 "ACCOUNT_ID", 'abc' "SHA256", 120 "SZ" From Dual UNION ALL
Select 2 "ID", 1 "ACCOUNT_ID", 'abc' "SHA256", 120 "SZ" From Dual UNION ALL
Select 3 "ID", 1 "ACCOUNT_ID", 'bcd' "SHA256", 150 "SZ" From Dual UNION ALL
Select 4 "ID", 2 "ACCOUNT_ID", 'abc' "SHA256", 120 "SZ" From Dual UNION ALL
Select 5 "ID", 2 "ACCOUNT_ID", 'def' "SHA256", 80 "SZ" From Dual UNION ALL
Select 6 "ID", 3 "ACCOUNT_ID", 'fed' "SHA256", 100 "SZ" From Dual UNION ALL
Select 7 "ID", 3 "ACCOUNT_ID", 'fed' "SHA256", 100 "SZ" From Dual
)
SELECT DISTINCT
Min(f.ID) OVER(PARTITION BY f.ACCOUNT_ID, f.SHA256 ORDER BY f.ACCOUNT_ID, f.SHA256) "ID",
f.ACCOUNT_ID "ACCOUNT_ID",
f.SHA256 "SHA256",
f.SZ "SIZE"
FROM
files f
ORDER BY
f.ACCOUNT_ID,
f.SHA256
--
-- Result
-- ID ACCOUNT_ID SHA256 SIZE
-- 1 1 abc 120
-- 3 1 bcd 150
-- 4 2 abc 120
-- 5 2 def 80
-- 6 3 fed 100
Min(f.ID) OVER(.....) is the first ID of the group (ACCOUNT_ID, SHA256), DISTINCT gives us only the distinct rows, and the rest is what you asked for. If you sum the sizes you get 570... Regards...