SQL / Presto SQL: sum by group within the same column
I am trying to solve the following problem.
I have a table like this:
logtime | name | seconds | flag |
---|---|---|---|
1629302433 | a | 30 | 1-1 |
1629302463 | a | 30 | 1-1 |
1629302483 | a | 20 | 0-1 |
1629302513 | a | 30 | 1-1 |
1629302533 | a | 20 | 0-1 |
1629302553 | a | 30 | 1-1 |
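The rows with flag = 0-1 act as separators that split the data into 3 parts. The seconds values within each part should be summed, as shown below (logtime is a timestamp):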
name | seconds |
---|---|
a | 60 |
a | 30 |
a | 30 |
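Calculate the group number each row belongs to as a running sum of the occurrences of the flag '0-1'. Then aggregate, grouping by name and group number.
Demo: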
with mytable as (
    SELECT * FROM (
        VALUES
            (1629302433, 'a', 30, '1-1'),
            (1629302463, 'a', 30, '1-1'),
            (1629302483, 'a', 20, '0-1'),
            (1629302513, 'a', 30, '1-1'),
            (1629302533, 'a', 20, '0-1'),
            (1629302553, 'a', 30, '1-1')
    ) AS t (logtime, name, seconds, flag)
)
select name,
       sum(seconds) seconds
from
(
    -- calculate group number as running sum of '0-1' occurrences
    select logtime, name, seconds, flag,
           sum(case when flag = '0-1' then 1 else 0 end) over (partition by name order by logtime) as group_nbr
    from mytable
) s
where flag = '1-1' -- do not sum '0-1' records
group by name, group_nbr
order by name, group_nbr -- remove ordering if not necessary
Result:
name | seconds |
---|---|
a | 60 |
a | 30 |
a | 30 |
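To see why the running sum produces the right groups, the same logic can be traced outside the database. The following is only a sketch in Python, not part of the query: the function name `sum_by_flag_groups` and the `separator` parameter are made up for illustration. Sorting by (name, logtime) stands in for `partition by name order by logtime`, and a per-name counter stands in for the running sum.

```python
from collections import defaultdict

# Same sample rows as in the SQL demo: (logtime, name, seconds, flag)
rows = [
    (1629302433, "a", 30, "1-1"),
    (1629302463, "a", 30, "1-1"),
    (1629302483, "a", 20, "0-1"),
    (1629302513, "a", 30, "1-1"),
    (1629302533, "a", 20, "0-1"),
    (1629302553, "a", 30, "1-1"),
]


def sum_by_flag_groups(rows, separator="0-1"):
    """Sum seconds per (name, group number); hypothetical helper for illustration.

    The group number of a row is the count of separator rows seen so far
    for the same name, i.e. the running sum used in the SQL query.
    """
    counters = defaultdict(int)  # name -> running count of separator rows
    totals = {}                  # (name, group_nbr) -> summed seconds
    # Sorting by (name, logtime) mimics "partition by name order by logtime".
    for logtime, name, seconds, flag in sorted(rows, key=lambda r: (r[1], r[0])):
        if flag == separator:
            counters[name] += 1  # a separator row starts the next group
            continue             # separator rows themselves are not summed
        key = (name, counters[name])
        totals[key] = totals.get(key, 0) + seconds
    # Dicts keep insertion order, so groups come out ordered by name, group_nbr.
    return [(name, total) for (name, _group_nbr), total in totals.items()]


print(sum_by_flag_groups(rows))  # [('a', 60), ('a', 30), ('a', 30)]
```

The group numbers assigned to the six rows are 0, 0, 1, 1, 2, 2; dropping the two separator rows leaves the groups 0, 1 and 2 with 60, 30 and 30 seconds.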
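You can use the lag() function to find where the flag value changes, then take a cumulative sum of those changes to assign groups, and finally sum within each group: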
WITH dataset AS (
    SELECT *
    FROM
    (
        VALUES
            (1629302433, 'a', 30, '1-1'),
            (1629302463, 'a', 30, '1-1'),
            (1629302483, 'a', 20, '0-1'),
            (1629302513, 'a', 30, '1-1'),
            (1629302533, 'a', 20, '0-1'),
            (1629302553, 'a', 30, '1-1')
    ) AS t (logtime, name, seconds, flag)
)
select name, sum(seconds) seconds
from (
    select *,
           sum(case when flag = prev_flag then 0 else 1 end) over (partition by name order by logtime) as grp
    from (
        select logtime,
               name,
               seconds,
               flag,
               lag(flag) over (partition by name order by logtime) as prev_flag
        from dataset
    )
)
where flag = '1-1'
group by name, grp
Output:
name | seconds |
---|---|
a | 60 |
a | 30 |
a | 30 |
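The lag() approach starts a new group whenever the flag differs from the previous row's flag, so each group is a run of consecutive rows with the same flag. As a rough illustration only (Python rather than SQL, with the hypothetical names `sum_by_flag_runs` and `keep_flag`), `itertools.groupby` groups consecutive equal keys in the same way:

```python
from itertools import groupby

# Same sample rows as in the SQL demo: (logtime, name, seconds, flag)
rows = [
    (1629302433, "a", 30, "1-1"),
    (1629302463, "a", 30, "1-1"),
    (1629302483, "a", 20, "0-1"),
    (1629302513, "a", 30, "1-1"),
    (1629302533, "a", 20, "0-1"),
    (1629302553, "a", 30, "1-1"),
]


def sum_by_flag_runs(rows, keep_flag="1-1"):
    """Sum seconds over each run of consecutive rows sharing a flag.

    Hypothetical helper for illustration: a new run begins whenever the
    (name, flag) pair changes, which is what comparing flag with
    lag(flag) and taking a cumulative sum achieves in the query.
    """
    # Sorting by (name, logtime) mimics "partition by name order by logtime".
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    result = []
    for (name, flag), run in groupby(ordered, key=lambda r: (r[1], r[3])):
        if flag == keep_flag:  # same filter as "where flag = '1-1'"
            result.append((name, sum(r[2] for r in run)))
    return result


print(sum_by_flag_runs(rows))  # [('a', 60), ('a', 30), ('a', 30)]
```

On this data both approaches agree. They can differ if a third flag value appears: a row with some other flag between two '1-1' rows splits the run here, whereas counting only '0-1' rows would keep the two '1-1' rows in the same group.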