How to cluster a set of 3D points on a grid?

Question

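I have a set of 3D points on a grid, and I would like to know how to cluster them so that every point in a given cluster is at most a certain (Euclidean) distance from at least one other point in that cluster.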

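For example, suppose there are 5 points: {A, B, C, D, E}. If point A is at most 1.414 units from B, then they belong in the same cluster, and if point C is at most 1.414 units from A or B, then it also belongs to that cluster. If, on the other hand, D is more than 1.414 units from either A, B, or C, then it doesn't belong in that cluster, and so on.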

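The input data can be found here.
The output I need is the size of each cluster, the center (centroid) of each cluster, and the points in each cluster.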

Although this is not strictly a clustering problem, I tried the k-means and DBSCAN clustering methods but could not get good results.
Any ideas on how to proceed?

Answer 1

Using hierarchical clustering gives you a higher degree of control, although it can be slower than other methods on large datasets.

However, your clustering rules seem to be incompatible with each other:

If point A is at most 1.414 units from B, then they belong in the same cluster and if point C is at most 1.414 units from A or B then it also belongs to that cluster.

If on the other hand, D is more than 1.414 units from either A, B, or C, then it doesn't belong in that cluster and so on.

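For example, in 1D, by rule 1 the points [1,2,3] belong to the same cluster. The points [2,3,4] also belong to the same cluster.

However, by rule 2 the points [1,2,3,4] would not belong to the same cluster, because point 4 is more than 1.414 units from point 1, while points 2 and 3 belong to both clusters [1,2,3] and [2,3,4].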
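The conflict in this 1D example can be checked numerically. The following is only a small sketch: the variable names are my own, and the threshold of 1.414 is the one from the question.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

points = np.array([[1.0], [2.0], [3.0], [4.0]])  # the 1D points from the example
threshold = 1.414                                # distance limit from the question

dist = squareform(pdist(points))   # full 4x4 matrix of pairwise distances
adjacent = dist <= threshold       # True where two points are directly linked

# Points 1 and 4 are 3 units apart, so they are not directly linked ...
print(adjacent[0, 3])                                   # False
# ... but each neighbouring pair is, so the chain 1-2-3-4 connects them.
print(adjacent[0, 1], adjacent[1, 2], adjacent[2, 3])   # True True True
```

Whether 1 and 4 end up in the same cluster therefore depends on whether membership is allowed to pass along such a chain.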

If you rephrase the second rule as

If on the other hand, D is more than 1.414 units from all of A, B, and C, then it doesn't belong in that cluster and so on.

then you have specified what is called the single linkage method, which you can use like this:

>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]
>>> from scipy.cluster.hierarchy import fcluster, single
>>> from scipy.spatial.distance import pdist
>>> y = pdist(X)        # condensed matrix of pairwise Euclidean distances
>>> Z = single(y)       # single linkage hierarchy
>>> print(fcluster(Z, 1.414, criterion='distance'))   # cut the tree at 1.414
[3 3 3 4 4 4 2 2 2 1 1 1]

This means that the points at indices [0,1,2] form cluster 3, the points at indices [3,4,5] form cluster 4, and so on.
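To get the outputs asked for in the question (size, centroid and member points of each cluster), the labels returned by fcluster can be used to group the points. This is a minimal sketch and not a tuned implementation: the 3D points are made-up sample data, and the `clusters` dictionary layout is just one possible choice.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, single
from scipy.spatial.distance import pdist

# Hypothetical 3D grid points (not the data from the question):
# two groups that are far apart from each other.
points = np.array([
    [0, 0, 0], [0, 1, 0], [1, 1, 0],
    [5, 5, 5], [5, 5, 6],
])

# Single linkage, cut at the distance limit from the question.
labels = fcluster(single(pdist(points)), 1.414, criterion='distance')

# Collect size, centroid and members for every cluster label.
clusters = {}
for label in np.unique(labels):
    members = points[labels == label]
    clusters[int(label)] = {
        "size": len(members),
        "centroid": members.mean(axis=0),
        "points": members,
    }

for label, info in clusters.items():
    print(label, info["size"], info["centroid"])
```

Note that pdist builds all pairwise distances, so memory use grows quadratically with the number of points.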

python

numpy

cluster-analysis

scipy

scikit-learn