Select rows with Max(Column Value) for each unique combination of two other columns
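I'm working with a table like the example below. Each dataset has multiple groups, and each time the table is written to, the RunNumber for the dataset is incremented, along with the data and total for each group. There are typically several rows for each Dataset/Group combination, for example: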
RunNumber | Group | Dataset | Total |
---|---|---|---|
1 | Group1 | Dataset A | 10 |
1 | Group1 | Dataset A | 20 |
2 | Group1 | Dataset A | 30 |
2 | Group2 | Dataset A | 15 |
1 | Group1 | Dataset B | 5 |
1 | Group2 | Dataset B | 10 |
1 | Group3 | Dataset A | 30 |
2 | Group3 | Dataset A | 30 |
1 | Group1 | Dataset C | 15 |
1 | Group2 | Dataset C | 50 |
2 | Group2 | Dataset C | 70 |
2 | Group2 | Dataset C | 90 |
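What I want to do, essentially, is this: for each combination of Dataset and Group, return all data for the rows that have the max(RunNumber) for that Dataset/Group combination. So, for example, the sample above would return this: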
RunNumber | Group | Dataset | Total |
---|---|---|---|
2 | Group1 | Dataset A | 30 |
2 | Group2 | Dataset A | 15 |
1 | Group1 | Dataset B | 5 |
1 | Group2 | Dataset B | 10 |
2 | Group3 | Dataset A | 30 |
1 | Group1 | Dataset C | 15 |
2 | Group2 | Dataset C | 70 |
2 | Group2 | Dataset C | 90 |
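Where Dataset/Group match, all rows with the maximum RunNumber for that combination are kept.
Right now I have this split into 2 separate queries: I first query the max(RunNumber) for every distinct Dataset/Group combination, and then do a select * for all matches. Any help would be greatly appreciated, thanks in advance!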
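For reference, the two-step approach described here amounts to the following logic. This is only an illustrative Python sketch over the sample rows, not the actual queries in use; `ROWS` and `latest_run_rows` are made-up names.

```python
# Sample rows from the table above: (RunNumber, Group, Dataset, Total).
ROWS = [
    (1, "Group1", "Dataset A", 10),
    (1, "Group1", "Dataset A", 20),
    (2, "Group1", "Dataset A", 30),
    (2, "Group2", "Dataset A", 15),
    (1, "Group1", "Dataset B", 5),
    (1, "Group2", "Dataset B", 10),
    (1, "Group3", "Dataset A", 30),
    (2, "Group3", "Dataset A", 30),
    (1, "Group1", "Dataset C", 15),
    (1, "Group2", "Dataset C", 50),
    (2, "Group2", "Dataset C", 70),
    (2, "Group2", "Dataset C", 90),
]


def latest_run_rows(rows):
    """Return every row whose RunNumber is the maximum for its (Group, Dataset)."""
    # Step 1: max(RunNumber) for each distinct (Group, Dataset) combination.
    max_run = {}
    for run, group, dataset, _total in rows:
        key = (group, dataset)
        max_run[key] = max(run, max_run.get(key, run))
    # Step 2: keep all rows that match that maximum (ties are all kept).
    return [r for r in rows if r[0] == max_run[(r[1], r[2])]]


result = latest_run_rows(ROWS)
for row in result:
    print(row)
```

The goal is to get the same rows back from a single SQL statement instead of two round trips.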
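In MySQL 5.x you can use a subquery that returns the maximum RunNumber per Group/Dataset and match on all three columns.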
SELECT *
FROM your_table
WHERE (`Group`, Dataset, RunNumber) IN (
    SELECT `Group`, Dataset, MAX(RunNumber) AS MaxRunNumber
    FROM your_table
    GROUP BY `Group`, Dataset
);
Tested on db<>fiddle.
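The subquery form can also be checked locally against the question's sample data. Below is a minimal sketch, assuming Python's built-in `sqlite3` module as a stand-in for MySQL (SQLite 3.15 or later also accepts a row-value `IN` and backtick-quoted identifiers). The helper name `rows_with_max_run` is hypothetical, and the `ORDER BY` is added only to make the output order predictable.

```python
import sqlite3

# Sample rows from the question: (RunNumber, Group, Dataset, Total).
ROWS = [
    (1, "Group1", "Dataset A", 10),
    (1, "Group1", "Dataset A", 20),
    (2, "Group1", "Dataset A", 30),
    (2, "Group2", "Dataset A", 15),
    (1, "Group1", "Dataset B", 5),
    (1, "Group2", "Dataset B", 10),
    (1, "Group3", "Dataset A", 30),
    (2, "Group3", "Dataset A", 30),
    (1, "Group1", "Dataset C", 15),
    (1, "Group2", "Dataset C", 50),
    (2, "Group2", "Dataset C", 70),
    (2, "Group2", "Dataset C", 90),
]

QUERY = """
    SELECT RunNumber, `Group`, Dataset, Total
    FROM your_table
    WHERE (`Group`, Dataset, RunNumber) IN (
        SELECT `Group`, Dataset, MAX(RunNumber)
        FROM your_table
        GROUP BY `Group`, Dataset
    )
    ORDER BY `Group`, Dataset, Total
"""


def rows_with_max_run(rows):
    """Load rows into an in-memory SQLite table and run the subquery form."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table "
        "(RunNumber INTEGER, `Group` TEXT, Dataset TEXT, Total INTEGER)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = rows_with_max_run(ROWS)
for row in result:
    print(row)
```

Both `Group2 / Dataset C` rows with RunNumber 2 come back, since the `IN` match keeps every row that shares the maximum.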
Alternatives:
--
-- LEFT JOIN on a bigger RunNumber in the same Group/Dataset;
-- rows with no such match hold the maximum
--
SELECT t.*
FROM your_table t
LEFT JOIN your_table t2
  ON t2.`Group` = t.`Group`
 AND t2.Dataset = t.Dataset
 AND t2.RunNumber > t.RunNumber
WHERE t2.RunNumber IS NULL
ORDER BY t.`Group`, t.Dataset;
--
-- WHERE NOT EXISTS a bigger RunNumber in the same Group/Dataset
--
SELECT *
FROM your_table t
WHERE NOT EXISTS (
    SELECT 1
    FROM your_table t2
    WHERE t2.`Group` = t.`Group`
      AND t2.Dataset = t.Dataset
      AND t2.RunNumber > t.RunNumber
)
ORDER BY `Group`, Dataset;
--
-- Emulating DENSE_RANK = 1 with user variables
-- Also works in 5.x, but note that MySQL does not guarantee
-- the evaluation order of @var := assignments
--
SELECT RunNumber, `Group`, Dataset, Total
FROM
(
    SELECT
      @rnk := IF(@ds = Dataset AND @grp = `Group`, IF(@run = RunNumber, @rnk, @rnk + 1), 1) AS Rnk
    , @grp := `Group` AS `Group`
    , @ds := Dataset AS Dataset
    , @run := RunNumber AS RunNumber
    , Total
    FROM your_table t
    CROSS JOIN (SELECT @grp := NULL, @ds := NULL, @run := NULL, @rnk := 0) var
    ORDER BY `Group`, Dataset, RunNumber DESC
) q
WHERE Rnk = 1
ORDER BY `Group`, Dataset;
--
-- DENSE_RANK = 1
-- MySQL 8.0 and later.
--
SELECT *
FROM
(
    SELECT *
    , DENSE_RANK() OVER (PARTITION BY `Group`, Dataset ORDER BY RunNumber DESC) AS rnk
    FROM your_table
) q
WHERE rnk = 1
ORDER BY `Group`, Dataset;
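To see that the alternatives agree, including on ties, they can be run side by side. This is a rough sketch, again assuming Python's `sqlite3` as a stand-in for MySQL (the window-function variant needs SQLite 3.25 or later). The user-variable emulation is MySQL-specific and is left out. `ROWS` is a subset of the question's sample data, and `QUERIES` and `run_all` are hypothetical names.

```python
import sqlite3

# A subset of the question's sample rows: (RunNumber, Group, Dataset, Total).
ROWS = [
    (1, "Group1", "Dataset A", 10),
    (1, "Group1", "Dataset A", 20),
    (2, "Group1", "Dataset A", 30),
    (1, "Group2", "Dataset C", 50),
    (2, "Group2", "Dataset C", 70),
    (2, "Group2", "Dataset C", 90),
]

QUERIES = {
    "left_join": """
        SELECT t.RunNumber, t.`Group`, t.Dataset, t.Total
        FROM your_table t
        LEFT JOIN your_table t2
          ON t2.`Group` = t.`Group`
         AND t2.Dataset = t.Dataset
         AND t2.RunNumber > t.RunNumber
        WHERE t2.RunNumber IS NULL
    """,
    "not_exists": """
        SELECT RunNumber, `Group`, Dataset, Total
        FROM your_table t
        WHERE NOT EXISTS (
            SELECT 1
            FROM your_table t2
            WHERE t2.`Group` = t.`Group`
              AND t2.Dataset = t.Dataset
              AND t2.RunNumber > t.RunNumber
        )
    """,
    "dense_rank": """
        SELECT RunNumber, `Group`, Dataset, Total
        FROM (
            SELECT *, DENSE_RANK() OVER (
                PARTITION BY `Group`, Dataset ORDER BY RunNumber DESC
            ) AS rnk
            FROM your_table
        ) q
        WHERE rnk = 1
    """,
}


def run_all(rows):
    """Run each variant on an in-memory table; return sorted results by name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table "
        "(RunNumber INTEGER, `Group` TEXT, Dataset TEXT, Total INTEGER)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)
    results = {
        name: sorted(conn.execute(sql).fetchall())
        for name, sql in QUERIES.items()
    }
    conn.close()
    return results


results = run_all(ROWS)
for name, rows in results.items():
    print(name, rows)
```

All three variants keep both `Group2 / Dataset C` rows with RunNumber 2, which is the behaviour the question asks for: every row tied on the maximum RunNumber is returned.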