SQL: group by a field and return only one joined row for each group

Table data

+-----+----------------+--------+----------------+
| ID  |  Required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 1   |  7 August      |  cat   |  X             |
| 2   |  7 August      |  cat   |  Y             |
| 3   |  10 August     |  cat   |  Z             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+
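
For anyone who wants to try the queries in this thread, the table could be recreated roughly as follows. This is only a sketch: the column types and the year 2024 are assumptions (the question gives only day and month), and the dates are written as ISO strings so the script stays portable across most databases.

```sql
-- Hypothetical setup: column types and the year 2024 are assumptions,
-- since the question only shows day and month.
CREATE TABLE data (
  id            INTEGER PRIMARY KEY,
  required_by   DATE NOT NULL,
  name          VARCHAR(50) NOT NULL,
  another_field VARCHAR(50)
);

-- The four sample rows from the table above.
INSERT INTO data (id, required_by, name, another_field) VALUES
  (1, '2024-08-07', 'cat', 'X'),
  (2, '2024-08-07', 'cat', 'Y'),
  (3, '2024-08-10', 'cat', 'Z'),
  (4, '2024-08-11', 'dog', 'A');
```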

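What I want to do is group by name and then, for each group, select one of the rows with the earliest Required_by date.

For this data set, I would like to end up with either rows 1 and 4 or rows 2 and 4.

Expected result (either of the following):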

+-----+----------------+--------+----------------+
| ID  |  Required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 1   |  7 August      |  cat   |  X             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

+-----+----------------+--------+----------------+
| ID  |  Required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 2   |  7 August      |  cat   |  Y             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

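I have a query that returns rows 1, 2 and 4, but I'm not sure how to pick only one row from the first group to get the desired result. I join the aggregate back to the data table so that I can get ID and another_field after the grouping.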

SELECT d.id, d.name, d.required_by, d.another_field
FROM 
(
  SELECT min(required_by) as min_date, name
  FROM data
  GROUP BY name
) agg
INNER JOIN 
data d
on d.required_by = agg.min_date AND d.name = agg.name

This is typically solved using window functions:

select d.id, d.name, d.required_by, d.another_field
from (
  select id, name, required_by, another_field, 
         row_number() over (partition by name order by required_by) as rn
  from data
) d
where d.rn = 1;
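
Note that when several rows share the earliest date (rows 1 and 2 here), row_number() is free to pick either of them, and the choice may differ between runs. If a stable result is wanted, a tie-breaker can be added to the ordering. The following is a minimal sketch, assuming id is unique and that the lowest id should win; for the sample data it should return rows 1 and 4.

```sql
-- Same window-function approach, with id as an assumed tie-breaker
-- so that rows sharing the earliest required_by are ranked deterministically.
select d.id, d.name, d.required_by, d.another_field
from (
  select id, name, required_by, another_field,
         row_number() over (partition by name order by required_by, id) as rn
  from data
) d
where d.rn = 1;
```

The same idea carries over to any query that relies on an ORDER BY to choose the row: append id (or another unique column) as the last sort key.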

In Postgres, using distinct on () is usually faster:

select distinct on (name) *
from data
order by name, required_by


SELECT [id]
      ,[date]
      ,[name]
  FROM [test].[dbo].[data]  
  WHERE date IN (SELECT min(date) FROM data GROUP BY name)
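
As written, the IN subquery is not tied to the name of the outer row, and it still returns every row that shares the earliest date, so it would give rows 1, 2 and 4 for the question's data. A correlated variant that keeps only one row per name might look like the sketch below. It uses the column names from the question (required_by, another_field) rather than the ones in this answer, and it assumes id is unique and that the lowest id should win among ties.

```sql
-- Sketch: for each name, keep the row with the lowest id among the rows
-- that have that name's earliest required_by. Assumes id is unique.
SELECT d.id, d.required_by, d.name, d.another_field
FROM data d
WHERE d.id = (
  SELECT MIN(d2.id)
  FROM data d2
  WHERE d2.name = d.name
    AND d2.required_by = (
      SELECT MIN(d3.required_by)
      FROM data d3
      WHERE d3.name = d.name
    )
);
```

This sticks to plain subqueries, so it should also run on databases without window functions, although it will usually be slower than the row_number() approach on large tables.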
