Calculate sum of maximum and minimum value for repeated group of entries in MySQL8
LogDateAndTime | BatchDate | TagLetter | Totaliser | ExpectedResult |
---|---|---|---|---|
10-11-2020 09:06:14 | 10-11-2020 08:29:55 | A | 6319 | 31 |
10-11-2020 09:06:24 | 10-11-2020 08:29:55 | A | 6337 | 31 |
10-11-2020 09:08:14 | 10-11-2020 08:29:55 | B | 6355 | 31 |
10-11-2020 09:08:24 | 10-11-2020 08:29:55 | B | 6372 | 31 |
10-11-2020 09:08:34 | 10-11-2020 08:29:55 | B | 6378 | 31 |
10-11-2020 09:08:44 | 10-11-2020 08:29:55 | A | 6383 | 31 |
10-11-2020 09:09:14 | 10-11-2020 08:29:55 | A | 6388 | 31 |
10-11-2020 09:09:24 | 10-11-2020 08:29:55 | A | 6396 | 31 |
10-11-2020 09:09:34 | 10-11-2020 08:29:55 | B | 6409 | 31 |
10-11-2020 09:09:44 | 10-11-2020 08:29:55 | B | 6426 | 31 |
10-11-2020 09:10:24 | 10-11-2020 08:29:55 | B | 6442 | 31 |
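The table above has a LogDateAndTime column (the primary key) holding unique datetime entries. The BatchDate column holds the same datetime value for the whole batch. I need to calculate the sum of MAX(Totaliser)-MIN(Totaliser) over each consecutive run of TagLetter=A rows, ignoring the values where TagLetter=B. In this case my ExpectedResult would be SUM[(6337-6319)+(6396-6383)]=31. I tried the following query but did not get the expected result.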
SELECT SUM(
CASE
WHEN TagLetter='A' THEN MAX(Totaliser)-MIN(Totaliser)
ELSE 0.0
END
) OVER (PARTITION BY BatchDate) AS ExpectedResult
In this case it calculates 6396-6319=77, which is not the expected result. Can someone help me get the correct result?
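To make the gap between 77 and 31 concrete, here is a small Python sketch of the two calculations on the sample rows. It only illustrates the arithmetic and is not SQL; the list `rows` and the use of `itertools.groupby` are assumptions made for this example.

```python
from itertools import groupby

# (TagLetter, Totaliser) pairs from the sample table, in LogDateAndTime order.
rows = [
    ("A", 6319), ("A", 6337),
    ("B", 6355), ("B", 6372), ("B", 6378),
    ("A", 6383), ("A", 6388), ("A", 6396),
    ("B", 6409), ("B", 6426), ("B", 6442),
]

# The reported wrong result: one MAX - MIN taken over all 'A' rows at once.
a_values = [total for tag, total in rows if tag == "A"]
whole_range = max(a_values) - min(a_values)

# The wanted result: MAX - MIN inside each consecutive run of 'A', then the sum.
per_run = []
for tag, run in groupby(rows, key=lambda r: r[0]):
    if tag == "A":
        totals = [total for _, total in run]
        per_run.append(max(totals) - min(totals))

print(whole_range)   # 77
print(per_run)       # [18, 13]
print(sum(per_run))  # 31
```

The difference comes entirely from treating the two runs of 'A' as separate groups, so any SQL solution has to number those runs first.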
First use the window functions LAG() and SUM() to create groups of consecutive 'A' rows, then aggregate those groups:
WITH cte AS (
SELECT DISTINCT SUM(MAX(Totaliser) - MIN(Totaliser)) OVER () ExpectedResult
FROM (
SELECT *, SUM(flag) OVER (ORDER BY LogDateAndTime) grp
FROM (
SELECT *, LAG(TagLetter, 1, '') OVER (ORDER BY LogDateAndTime) <> 'A' flag
FROM tablename
) t
WHERE TagLetter = 'A'
) t
GROUP BY grp
)
SELECT t.*, c.ExpectedResult
FROM tablename t CROSS JOIN cte c
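To see what `flag` and `grp` hold before the aggregation, the sketch below runs the two inner subqueries against the sample data. It uses Python's built-in `sqlite3` module purely as a stand-in for MySQL 8 (an assumption; it needs SQLite 3.25 or later for window functions), stores the timestamps as plain strings, and reuses the placeholder table name `tablename`. The final total is computed with a plain outer `SUM` rather than `SELECT DISTINCT ... OVER ()`, which is a simplification chosen for this example.

```python
import sqlite3

batch = "10-11-2020 08:29:55"
rows = [
    ("10-11-2020 09:06:14", batch, "A", 6319),
    ("10-11-2020 09:06:24", batch, "A", 6337),
    ("10-11-2020 09:08:14", batch, "B", 6355),
    ("10-11-2020 09:08:24", batch, "B", 6372),
    ("10-11-2020 09:08:34", batch, "B", 6378),
    ("10-11-2020 09:08:44", batch, "A", 6383),
    ("10-11-2020 09:09:14", batch, "A", 6388),
    ("10-11-2020 09:09:24", batch, "A", 6396),
    ("10-11-2020 09:09:34", batch, "B", 6409),
    ("10-11-2020 09:09:44", batch, "B", 6426),
    ("10-11-2020 09:10:24", batch, "B", 6442),
]

conn = sqlite3.connect(":memory:")
# Timestamps are kept as text; string order matches time order here only
# because every sample row falls on the same day.
conn.execute(
    "CREATE TABLE tablename ("
    "LogDateAndTime TEXT PRIMARY KEY, BatchDate TEXT, "
    "TagLetter TEXT, Totaliser INTEGER)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", rows)

# flag is 1 on the first row of each run of 'A' (previous row is not 'A'),
# and the running SUM(flag) turns those markers into a run number.
GROUPS_SQL = """
    SELECT LogDateAndTime, Totaliser, flag,
           SUM(flag) OVER (ORDER BY LogDateAndTime) AS grp
    FROM (
        SELECT *, LAG(TagLetter, 1, '') OVER (ORDER BY LogDateAndTime) <> 'A' AS flag
        FROM tablename
    ) t
    WHERE TagLetter = 'A'
"""

groups = conn.execute(GROUPS_SQL + " ORDER BY LogDateAndTime").fetchall()
for row in groups:
    print(row)

# One MAX - MIN per run, then the sum over all runs.
total = conn.execute(
    "SELECT SUM(diff) FROM ("
    " SELECT MAX(Totaliser) - MIN(Totaliser) AS diff"
    f" FROM ({GROUPS_SQL}) g GROUP BY grp"
    ") d"
).fetchone()[0]
print(total)  # 31
```

The five 'A' rows come out with `grp` values 1, 1, 2, 2, 2, so grouping by `grp` yields the differences 18 and 13.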
Or, if you want results for each BatchDate:
WITH cte AS (
SELECT DISTINCT BatchDate,
SUM(MAX(Totaliser) - MIN(Totaliser)) OVER (PARTITION BY BatchDate) ExpectedResult
FROM (
SELECT *, SUM(flag) OVER (PARTITION BY BatchDate ORDER BY LogDateAndTime) grp
FROM (
SELECT *, LAG(TagLetter, 1, '') OVER (PARTITION BY BatchDate ORDER BY LogDateAndTime) <> 'A' flag
FROM tablename
) t
WHERE TagLetter = 'A'
) t
GROUP BY BatchDate, grp
)
SELECT t.*, c.ExpectedResult
FROM tablename t LEFT JOIN cte c
ON c.BatchDate = t.BatchDate
See the demo.
Results:
> LogDateAndTime | BatchDate | TagLetter | Totaliser | ExpectedResult
> :------------------ | :------------------ | :-------- | --------: | -------------:
> 10-11-2020 09:06:14 | 10-11-2020 08:29:55 | A | 6319 | 31
> 10-11-2020 09:06:24 | 10-11-2020 08:29:55 | A | 6337 | 31
> 10-11-2020 09:08:14 | 10-11-2020 08:29:55 | B | 6355 | 31
> 10-11-2020 09:08:24 | 10-11-2020 08:29:55 | B | 6372 | 31
> 10-11-2020 09:08:34 | 10-11-2020 08:29:55 | B | 6378 | 31
> 10-11-2020 09:08:44 | 10-11-2020 08:29:55 | A | 6383 | 31
> 10-11-2020 09:09:14 | 10-11-2020 08:29:55 | A | 6388 | 31
> 10-11-2020 09:09:24 | 10-11-2020 08:29:55 | A | 6396 | 31
> 10-11-2020 09:09:34 | 10-11-2020 08:29:55 | B | 6409 | 31
> 10-11-2020 09:09:44 | 10-11-2020 08:29:55 | B | 6426 | 31
> 10-11-2020 09:10:24 | 10-11-2020 08:29:55 | B | 6442 | 31
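With a single batch in the sample data, a per-batch total cannot be told apart from a grand total. The sketch below adds a second batch to check that each BatchDate gets its own sum. The second batch's rows are invented purely for illustration and do not come from the question; `sqlite3` again stands in for MySQL 8 (an assumption), and a plain outer `GROUP BY BatchDate` is used here as an alternative to a windowed `SUM`.

```python
import sqlite3

batch1 = "10-11-2020 08:29:55"
batch2 = "11-11-2020 09:55:00"  # hypothetical second batch, not in the question
rows = [
    ("10-11-2020 09:06:14", batch1, "A", 6319),
    ("10-11-2020 09:06:24", batch1, "A", 6337),
    ("10-11-2020 09:08:14", batch1, "B", 6355),
    ("10-11-2020 09:08:24", batch1, "B", 6372),
    ("10-11-2020 09:08:34", batch1, "B", 6378),
    ("10-11-2020 09:08:44", batch1, "A", 6383),
    ("10-11-2020 09:09:14", batch1, "A", 6388),
    ("10-11-2020 09:09:24", batch1, "A", 6396),
    ("10-11-2020 09:09:34", batch1, "B", 6409),
    ("10-11-2020 09:09:44", batch1, "B", 6426),
    ("10-11-2020 09:10:24", batch1, "B", 6442),
    # Invented rows, for illustration only: two runs of 'A' split by one 'B'.
    ("11-11-2020 10:00:00", batch2, "A", 100),
    ("11-11-2020 10:00:10", batch2, "A", 105),
    ("11-11-2020 10:00:20", batch2, "B", 110),
    ("11-11-2020 10:00:30", batch2, "A", 120),
    ("11-11-2020 10:00:40", batch2, "A", 127),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename ("
    "LogDateAndTime TEXT PRIMARY KEY, BatchDate TEXT, "
    "TagLetter TEXT, Totaliser INTEGER)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", rows)

# Runs of 'A' are numbered inside each batch, reduced to MAX - MIN per run,
# and then summed per BatchDate.
results = conn.execute("""
    SELECT BatchDate, SUM(diff) AS ExpectedResult
    FROM (
        SELECT BatchDate, MAX(Totaliser) - MIN(Totaliser) AS diff
        FROM (
            SELECT *, SUM(flag) OVER (PARTITION BY BatchDate ORDER BY LogDateAndTime) AS grp
            FROM (
                SELECT *, LAG(TagLetter, 1, '') OVER (PARTITION BY BatchDate ORDER BY LogDateAndTime) <> 'A' AS flag
                FROM tablename
            ) t
            WHERE TagLetter = 'A'
        ) g
        GROUP BY BatchDate, grp
    ) d
    GROUP BY BatchDate
    ORDER BY BatchDate
""").fetchall()

for batch_date, expected in results:
    print(batch_date, expected)
```

The first batch still gives 31, and the invented batch gives (105-100)+(127-120)=12, so the two totals stay separate.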
Since you want this calculation per BatchDate, below is a solution using the old approach (user-defined variables).
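-- Number each run of identical TagLetter values with user variables.
-- Caveat: this relies on the rows being read in LogDateAndTime order,
-- which is not guaranteed without an explicit ordering.
with cte as (
    select
        test.*,
        @rn := (if (@rt = TagLetter, @rn, @rn + 1)) rank_,
        @rt := TagLetter
    from test, (select @rn := 1, @rt := '') t
)
select t1.*, t2.ExpectedResult
from test t1
left join (
    -- One MAX - MIN per run of 'A' (runs of 'B' count as 0), summed per BatchDate.
    select distinct BatchDate,
        sum(CASE
            WHEN TagLetter = 'A' THEN MAX(Totaliser) - MIN(Totaliser)
            ELSE 0.0
        END) over (partition by BatchDate) AS ExpectedResult
    from cte
    group by BatchDate, TagLetter, rank_
) t2 on t1.BatchDate = t2.BatchDate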
LogDateAndTime | BatchDate | TagLetter | Totaliser | ExpectedResult |
---|---|---|---|---|
10-11-2020 09:06:14 | 10-11-2020 08:29:55 | A | 6319 | 31 |
10-11-2020 09:06:24 | 10-11-2020 08:29:55 | A | 6337 | 31 |
10-11-2020 09:08:14 | 10-11-2020 08:29:55 | B | 6355 | 31 |
10-11-2020 09:08:24 | 10-11-2020 08:29:55 | B | 6372 | 31 |
10-11-2020 09:08:34 | 10-11-2020 08:29:55 | B | 6378 | 31 |
10-11-2020 09:08:44 | 10-11-2020 08:29:55 | A | 6383 | 31 |
10-11-2020 09:09:14 | 10-11-2020 08:29:55 | A | 6388 | 31 |
10-11-2020 09:09:24 | 10-11-2020 08:29:55 | A | 6396 | 31 |
10-11-2020 09:09:34 | 10-11-2020 08:29:55 | B | 6409 | 31 |
10-11-2020 09:09:44 | 10-11-2020 08:29:55 | B | 6426 | 31 |
10-11-2020 09:10:24 | 10-11-2020 08:29:55 | B | 6442 | 31 |