Filtering using aggregation functions

I want to filter my table using the MIN() function while still keeping a column that cannot be part of the grouping.

I have this table:

+----+----------+----------------------+
| ID | distance |         geom         |
+----+----------+----------------------+
|  1 | 2        | DSDGSAsd23423DSFF    |
|  2 | 11.2     | SXSADVERG678BNDVS4   |
|  2 | 2        | XCZFETEFD567687SDF   |
|  3 | 24       | SADASDSVG3423FD      |
|  3 | 10       | SDFSDFSDF343DFDGF    |
|  4 | 34       | SFDHGHJ546GHJHJHJ    |
|  5 | 22       | SDFSGTHHGHGFHUKJYU45 |
|  6 | 78       | SDFDGDHKIKUI45       |
|  6 | 15       | DSGDHHJGHJKHGKHJKJ65 |
+----+----------+----------------------+

This is what I want to achieve:

+----+----------+----------------------+
| ID | distance |         geom         |
+----+----------+----------------------+
|  1 |        2 | DSDGSAsd23423DSFF    |
|  2 |        2 | XCZFETEFD567687SDF   |
|  3 |       10 | SDFSDFSDF343DFDGF    |
|  4 |       34 | SFDHGHJ546GHJHJHJ    |
|  5 |       22 | SDFSGTHHGHGFHUKJYU45 |
|  6 |       15 | DSGDHHJGHJKHGKHJKJ65 |
+----+----------+----------------------+

This is possible when I use MIN() on the distance column and group by ID, but then I lose my geom column, which is essential.

The query looks like this:

SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID" 

The result is:

+----+----------+
| ID | distance |
+----+----------+
|  1 |        2 |
|  2 |        2 |
|  3 |       10 |
|  4 |       34 |
|  5 |       22 |
|  6 |       15 |
+----+----------+

But this is not what I want.

Any suggestions?

You need a window function to do this:

SELECT "ID", distance, geom
FROM (
  SELECT "ID", distance, geom, rank() OVER (PARTITION BY "ID" ORDER BY distance) AS rnk
  FROM somefile) sub
WHERE rnk = 1;

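This effectively partitions the whole set of rows by "ID", orders the rows within each partition by distance, and returns the row with the smallest distance for each "ID". No GROUP BY is needed.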
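To try this out, the sample data from the question can be loaded into a scratch table. The following is a minimal sketch: the column types (integer, numeric, text) are assumptions, since the question does not state them. It also swaps rank() for row_number(), because rank() gives every row tied for the smallest distance a rank of 1 and therefore returns all of them, while row_number() returns exactly one row per "ID".

```sql
-- Assumed column types; the question does not specify them.
CREATE TABLE somefile (
    "ID"     integer,
    distance numeric,
    geom     text
);

INSERT INTO somefile ("ID", distance, geom) VALUES
    (1, 2,    'DSDGSAsd23423DSFF'),
    (2, 11.2, 'SXSADVERG678BNDVS4'),
    (2, 2,    'XCZFETEFD567687SDF'),
    (3, 24,   'SADASDSVG3423FD'),
    (3, 10,   'SDFSDFSDF343DFDGF'),
    (4, 34,   'SFDHGHJ546GHJHJHJ'),
    (5, 22,   'SDFSGTHHGHGFHUKJYU45'),
    (6, 78,   'SDFDGDHKIKUI45'),
    (6, 15,   'DSGDHHJGHJKHGKHJKJ65');

-- row_number() numbers the rows 1, 2, 3, ... within each "ID" partition,
-- so exactly one row per "ID" gets rn = 1. If several rows tie on
-- distance, which one is picked is arbitrary unless more sort keys are added.
SELECT "ID", distance, geom
FROM (
    SELECT "ID", distance, geom,
           row_number() OVER (PARTITION BY "ID" ORDER BY distance) AS rn
    FROM somefile
) sub
WHERE rn = 1
ORDER BY "ID";
```

With the sample data there are no ties, so rank() and row_number() give the same six rows.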

A common approach is to find the minimum in a derived table and join against it:

SELECT somefile."ID", somefile.distance, somefile.geom 
FROM somefile 
JOIN (
    SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID" 
) t ON t.distance = somefile.distance AND t."ID" = somefile."ID";
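One property of the join approach worth knowing: when several rows of the same "ID" share the minimal distance, the join matches all of them. The sketch below illustrates this with made-up rows (the values 'A', 'B', 'C' are hypothetical, not taken from the question), supplied inline through a CTE so that no table is needed.

```sql
-- Hypothetical data: "ID" 1 has two rows tied at the minimal distance.
WITH somefile ("ID", distance, geom) AS (
    VALUES (1, 2, 'A'),
           (1, 2, 'B'),
           (2, 5, 'C')
)
SELECT s."ID", s.distance, s.geom
FROM somefile s
JOIN (
    SELECT "ID", MIN(distance) AS distance
    FROM somefile
    GROUP BY "ID"
) t ON t."ID" = s."ID" AND t.distance = s.distance
ORDER BY s."ID", s.geom;
-- Both tied rows for "ID" 1 come back, plus the single row for "ID" 2.
```

If exactly one row per "ID" is required, an extra tie-breaking rule has to be added on top of the join.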


SELECT a.*, b.geom
FROM (SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID") AS a
INNER JOIN somefile AS b ON a."ID" = b."ID" AND a.distance = b.distance;

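You can use PostgreSQL's DISTINCT ON clause. The ORDER BY has to start with the DISTINCT ON expression; sorting by distance after it keeps the row with the smallest distance for each id.

select distinct on (id) id, distance, geom from table_name order by id, distance;

I think this is exactly what you are looking for.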

For more details on how DISTINCT ON works, see the documentation and the example.

Keep in mind, however, that DISTINCT ON is not part of the SQL standard.