Find the largest cluster of points within one kilometer

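I ran into a problem while working with spatial data in Postgres. I have a table with two columns: the object ID and its coordinates (latitude and longitude; the data type is geometry). I need to find the largest cluster of points within one kilometer. How can I do that?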

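You can use ST_ClusterDBSCAN to create clusters with a given distance and a minimum number of points. In the query below, the inner query builds clusters of points within 1000 meters of each other, with at least two points per cluster, and the outer query counts the points in each cluster: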

SELECT 
  cluster,count(*),ST_Union(geom)
FROM (
  SELECT 
    geom,
    ST_ClusterDBSCAN(geom,1000,2) OVER () AS cluster FROM t) j
WHERE cluster IS NOT NULL
GROUP BY 1
ORDER BY 2 DESC
FETCH FIRST ROW WITH TIES;

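Note that ST_ClusterDBSCAN expects the distance in the units of the underlying SRS! Because I'm using EPSG:26986, the unit is meters. Adjust this parameter according to your SRS. Also note that FETCH FIRST ROW WITH TIES (PostgreSQL 13+) returns every row that ties with the first one under the ORDER BY clause. In other words, if two or more clusters contain the same number of points, all of them are listed. If you use LIMIT 1 or FETCH FIRST ROW ONLY, you get only one (arbitrary) record.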
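If the points are stored in a geographic SRS such as EPSG:4326, a distance of 1000 would be read as degrees rather than meters. A minimal sketch of one way around this, assuming a hypothetical table t_wgs84 with a geometry(point,4326) column named geom, and assuming EPSG:26986 is a suitable projected SRS for the area the data covers (substitute a meter-based SRS that fits your region):

```sql
-- Hypothetical table: t_wgs84(geom geometry(point,4326)).
-- Reproject to a meter-based SRS before clustering so that the
-- eps argument (1000) is interpreted as meters, not degrees.
SELECT cluster, count(*), ST_Union(geom)
FROM (
  SELECT
    geom,
    ST_ClusterDBSCAN(ST_Transform(geom, 26986), 1000, 2) OVER () AS cluster
  FROM t_wgs84) j
WHERE cluster IS NOT NULL
GROUP BY cluster
ORDER BY 2 DESC
FETCH FIRST ROW WITH TIES;
```

Only the clustering step uses the reprojected geometry; ST_Union(geom) still returns the points in their original SRS.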

Demo: db<>fiddle

CREATE TABLE t (geom geometry(point,26986));
-- 100 random points over the given polygon
INSERT INTO t 
SELECT (ST_Dump(ST_GeneratePoints('SRID=26986;POLYGON((4216635.436744224 3906228.6933973604,4225112.992686871 3914713.9388259286,4237223.322966506 3902640.089526773,4228720.260658542 3894129.3145975843,4216635.436744224 3906228.6933973604))',100,42))).geom;

Select the largest cluster (including ties) that contains at least 2 points within a distance of 1 km of each other.

SELECT count(*),ST_Union(geom)
FROM (
  SELECT geom, ST_ClusterDBSCAN(geom,1000,2) OVER () AS cluster 
  FROM t) j
WHERE cluster IS NOT NULL
GROUP BY cluster
ORDER BY 1 DESC
FETCH FIRST ROW WITH TIES

 count | st_union
-------+----------
     6 | 01040000206A690000060000000101000000DEF0FA99831C5041605685812CC14D410101000000949040228D1C50418B6D583ACCC24D4101010000008441B6498F1C50414E04C49555C44D41010100000020AD7813131D50410179F7E239C24D41010100000015A4562E231D5041B07FE58158C04D4101010000009F78C907301D50414A726F118CC44D41
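On PostgreSQL versions older than 13, where WITH TIES is not available, a similar result can be sketched with the rank() window function. The aliases cnt, rnk and r are only illustrative names, not part of the original query:

```sql
-- rank() assigns 1 to every cluster that shares the highest point count,
-- so filtering on rnk = 1 keeps all tied clusters.
SELECT cnt, geom
FROM (
  SELECT
    count(*) AS cnt,
    ST_Union(geom) AS geom,
    rank() OVER (ORDER BY count(*) DESC) AS rnk
  FROM (
    SELECT geom, ST_ClusterDBSCAN(geom,1000,2) OVER () AS cluster
    FROM t) j
  WHERE cluster IS NOT NULL
  GROUP BY cluster) r
WHERE rnk = 1;
```

On the demo table t this should select the same cluster or clusters as the FETCH FIRST ROW WITH TIES query.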

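If we overlay the minimum bounding circle of the largest cluster on the original data set, we can better visualize the cluster area (for illustration only):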

WITH i AS (
  SELECT count(*),ST_MinimumBoundingCircle(ST_Union(geom)) AS geom
  FROM (
    SELECT geom, ST_ClusterDBSCAN(geom,1000,2) OVER () AS cluster 
    FROM t) j
  WHERE cluster IS NOT NULL
  GROUP BY cluster
  ORDER BY 1 DESC
  FETCH FIRST ROW WITH TIES
)
SELECT geom FROM i
UNION 
SELECT geom FROM t
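DBSCAN links points through chains of neighbours, so a cluster as a whole can extend further than the 1000-unit distance passed to it. If the overall extent matters, a minimal sketch such as the following, which assumes the same demo table t and uses the illustrative aliases points and radius, reports the radius of each winning cluster's minimum bounding circle next to its point count:

```sql
-- radius is expressed in the units of the SRS (meters for EPSG:26986).
SELECT
  count(*) AS points,
  (ST_MinimumBoundingRadius(ST_Union(geom))).radius AS radius
FROM (
  SELECT geom, ST_ClusterDBSCAN(geom,1000,2) OVER () AS cluster
  FROM t) j
WHERE cluster IS NOT NULL
GROUP BY cluster
ORDER BY 1 DESC
FETCH FIRST ROW WITH TIES;
```

A radius well above 500 suggests that the cluster is a chain of nearby points rather than a group that fits inside a circle one kilometer across.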

Further reading: