Calculate sum of maximum and minimum value for repeated group of entries in MySQL8

LogDateAndTime      | BatchDate           | TagLetter | Totaliser | ExpectedResult
:------------------ | :------------------ | :-------- | --------: | -------------:
10-11-2020 09:06:14 | 10-11-2020 08:29:55 | A         |      6319 |             31
10-11-2020 09:06:24 | 10-11-2020 08:29:55 | A         |      6337 |             31
10-11-2020 09:08:14 | 10-11-2020 08:29:55 | B         |      6355 |             31
10-11-2020 09:08:24 | 10-11-2020 08:29:55 | B         |      6372 |             31
10-11-2020 09:08:34 | 10-11-2020 08:29:55 | B         |      6378 |             31
10-11-2020 09:08:44 | 10-11-2020 08:29:55 | A         |      6383 |             31
10-11-2020 09:09:14 | 10-11-2020 08:29:55 | A         |      6388 |             31
10-11-2020 09:09:24 | 10-11-2020 08:29:55 | A         |      6396 |             31
10-11-2020 09:09:34 | 10-11-2020 08:29:55 | B         |      6409 |             31
10-11-2020 09:09:44 | 10-11-2020 08:29:55 | B         |      6426 |             31
10-11-2020 09:10:24 | 10-11-2020 08:29:55 | B         |      6442 |             31

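The table above has a LogDateAndTime column (the primary key) that contains unique datetime entries. The BatchDate column holds the same datetime value for the whole batch. For each consecutive run of rows with TagLetter=A, I need MAX(Totaliser)-MIN(Totaliser), and then the sum of those differences; the values in TagLetter=B rows should be ignored. In this case my ExpectedResult would be SUM[(6337-6319)+(6396-6383)]=31. I tried the following query, but it did not give the expected result.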

SELECT SUM(
           CASE 
             WHEN TagLetter='A' THEN MAX(Totaliser)-MIN(Totaliser) 
             ELSE 0.0 
           END
          ) OVER (PARTITION BY BatchDate) AS ExpectedResult

In this case it computes 6396-6319=77, which is not the expected result. Can someone help me get the correct result?
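As a cross-check of the arithmetic (in Python rather than SQL, and with illustrative variable names that are not part of the original post), the sketch below contrasts treating all 'A' rows as a single group with treating each consecutive run of 'A' rows separately. It assumes the rows are already in LogDateAndTime order.

```python
from itertools import groupby

# (TagLetter, Totaliser) pairs from the table above, in LogDateAndTime order.
rows = [
    ("A", 6319), ("A", 6337),
    ("B", 6355), ("B", 6372), ("B", 6378),
    ("A", 6383), ("A", 6388), ("A", 6396),
    ("B", 6409), ("B", 6426), ("B", 6442),
]

# All 'A' rows as one group: a single MAX - MIN across the whole batch.
a_values = [total for tag, total in rows if tag == "A"]
single_group = max(a_values) - min(a_values)

# One MAX - MIN per consecutive run of 'A' rows, then summed.
per_run = []
for tag, run in groupby(rows, key=lambda row: row[0]):
    if tag == "A":
        totals = [total for _, total in run]
        per_run.append(max(totals) - min(totals))

print(single_group)           # 77
print(per_run, sum(per_run))  # [18, 13] 31
```

The first figure matches what the attempted query returns; the second is the expected result.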

First use the window functions LAG() and SUM() to create groups of consecutive 'A' rows, then aggregate over those groups:

WITH cte AS (
  SELECT DISTINCT SUM(MAX(Totaliser) - MIN(Totaliser)) OVER () ExpectedResult
  FROM (
    SELECT *, SUM(flag) OVER (ORDER BY LogDateAndTime) grp
    FROM (
      SELECT *, LAG(TagLetter, 1, '') OVER (ORDER BY LogDateAndTime) <> 'A' flag
      FROM tablename 
    ) t
    WHERE TagLetter = 'A'
  ) t
  GROUP BY grp
)
SELECT t.*, c.ExpectedResult
FROM tablename t CROSS JOIN cte c
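
To make the grouping step easier to follow, here is a rough Python mirror of the query above. It is only a sketch: `island_sum` and its variable names are hypothetical, and the input is assumed to be sorted by LogDateAndTime. Each line is annotated with the SQL fragment it imitates.

```python
# (TagLetter, Totaliser) pairs from the question, in LogDateAndTime order.
rows = [
    ("A", 6319), ("A", 6337),
    ("B", 6355), ("B", 6372), ("B", 6378),
    ("A", 6383), ("A", 6388), ("A", 6396),
    ("B", 6409), ("B", 6426), ("B", 6442),
]


def island_sum(rows, tag="A"):
    """Return the per-group values and the sum of MAX - MIN over each group."""
    groups = {}
    prev_letter = ""  # stands in for the '' default passed to LAG
    grp = 0
    for letter, total in rows:
        # LAG(TagLetter, 1, '') OVER (ORDER BY LogDateAndTime) <> 'A'
        flag = 1 if prev_letter != tag else 0
        prev_letter = letter
        if letter != tag:  # WHERE TagLetter = 'A'
            continue
        grp += flag  # SUM(flag) OVER (ORDER BY LogDateAndTime)
        groups.setdefault(grp, []).append(total)  # GROUP BY grp
    # SUM(MAX(Totaliser) - MIN(Totaliser)) OVER ()
    return groups, sum(max(v) - min(v) for v in groups.values())


groups, expected_result = island_sum(rows)
print(groups)           # {1: [6319, 6337], 2: [6383, 6388, 6396]}
print(expected_result)  # 31
```

The flag is 1 only on the first row of each run of 'A' rows, so the running sum hands every run its own group number.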

Or, if you want the result for each BatchDate:

WITH cte AS (
  SELECT DISTINCT BatchDate,
         SUM(MAX(Totaliser) - MIN(Totaliser)) OVER (PARTITION BY BatchDate) ExpectedResult
  FROM (
    SELECT *, SUM(flag) OVER (PARTITION BY BatchDate ORDER BY LogDateAndTime) grp
    FROM (
      SELECT *, LAG(TagLetter, 1, '') OVER (PARTITION BY BatchDate ORDER BY LogDateAndTime) <> 'A' flag
      FROM tablename 
    ) t
    WHERE TagLetter = 'A'
  ) t
  GROUP BY BatchDate, grp
)
SELECT t.*, c.ExpectedResult
FROM tablename t LEFT JOIN cte c
ON c.BatchDate = t.BatchDate

See the demo.
Results:

> LogDateAndTime      | BatchDate           | TagLetter | Totaliser | ExpectedResult
> :------------------ | :------------------ | :-------- | --------: | -------------:
> 10-11-2020 09:06:14 | 10-11-2020 08:29:55 | A         |      6319 |             31
> 10-11-2020 09:06:24 | 10-11-2020 08:29:55 | A         |      6337 |             31
> 10-11-2020 09:08:14 | 10-11-2020 08:29:55 | B         |      6355 |             31
> 10-11-2020 09:08:24 | 10-11-2020 08:29:55 | B         |      6372 |             31
> 10-11-2020 09:08:34 | 10-11-2020 08:29:55 | B         |      6378 |             31
> 10-11-2020 09:08:44 | 10-11-2020 08:29:55 | A         |      6383 |             31
> 10-11-2020 09:09:14 | 10-11-2020 08:29:55 | A         |      6388 |             31
> 10-11-2020 09:09:24 | 10-11-2020 08:29:55 | A         |      6396 |             31
> 10-11-2020 09:09:34 | 10-11-2020 08:29:55 | B         |      6409 |             31
> 10-11-2020 09:09:44 | 10-11-2020 08:29:55 | B         |      6426 |             31
> 10-11-2020 09:10:24 | 10-11-2020 08:29:55 | B         |      6442 |             31

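Since you want this calculation per BatchDate, here is a solution using the older user-variable approach. Note that assigning user variables inside a SELECT is deprecated as of MySQL 8.0.13, and this query has no ORDER BY, so it relies on the rows being read in LogDateAndTime order.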

WITH cte AS (
  SELECT test.*,
         -- rank_ increases by 1 whenever TagLetter differs from the previous row
         @rn := IF(@rt = TagLetter, @rn, @rn + 1) rank_,
         @rt := TagLetter
  FROM test, (SELECT @rn := 1, @rt := '') t
)
SELECT t1.*, t2.ExpectedResult
FROM test t1
LEFT JOIN (
  -- one MAX - MIN per run of 'A' rows, summed within each BatchDate
  SELECT DISTINCT BatchDate,
         SUM(CASE
               WHEN TagLetter = 'A' THEN MAX(Totaliser) - MIN(Totaliser)
               ELSE 0.0
             END) OVER (PARTITION BY BatchDate) AS ExpectedResult
  FROM cte
  GROUP BY BatchDate, TagLetter, rank_
) t2 ON t1.BatchDate = t2.BatchDate

DEMO