DBSCAN for plotting clusters of coordinate data
I have a set of coordinate data (in Web Mercator eastings and northings, so the units are metres) that looks like this:
array([[ -232372.201264, 6785082.61011 ],
[ -233396.451899, 6784865.49884 ],
[ -234045.110572, 6784642.2575 ],
...,
[ -234473.356653, 6778646.81953 ],
[ -234918.300657, 6778772.69366 ],
[ -230900.668915, 6778369.2902 ]])
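This array is stored in the variable 'coords'.
I am trying to use scikit-learn and DBSCAN to compute and plot the clusters in this dataset (thanks to this post for getting me this far).
The code I am using is taken from this tutorial, but I get an attribute error. The code and the error are shown below: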
db = DBSCAN(eps=0.2, min_samples=1, metric="precomputed")
cluster_labels = db.labels_
num_clusters = len(set(cluster_labels))
clusters = pd.Series([coords[cluster_labels == n] for n in range(num_clusters)])
print('Number of clusters: {}'.format(num_clusters))
...
AttributeError: 'DBSCAN' object has no attribute 'labels_'
Can anyone explain where I am going wrong?
You have to call it like this:
db=DBSCAN(eps=0.2, min_samples=1, metric="precomputed").fit(mymatrix)
(note the fit() call)
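To illustrate why the error appears: in scikit-learn, attributes with a trailing underscore such as labels_ are only set once fit() has run. The following is a minimal sketch, assuming a small set of made-up points and an illustrative eps (neither comes from the question), using the default Euclidean metric rather than "precomputed":

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical sample points in metres; not the asker's data.
points = np.array([[0.0, 0.0], [1.0, 0.0], [100.0, 100.0]])

# eps and min_samples are illustrative values only.
db = DBSCAN(eps=5.0, min_samples=1)

# Before fit(), the estimator only holds its parameters.
has_labels_before = hasattr(db, "labels_")

# fit() runs the clustering and sets labels_ on the estimator.
db.fit(points)
has_labels_after = hasattr(db, "labels_")
labels = db.labels_

print(has_labels_before, has_labels_after, labels)
```

Accessing db.labels_ before the fit() call is what raises the AttributeError shown in the question.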
You forgot to call fit:
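import pandas as pd
from sklearn.cluster import DBSCAN

db = DBSCAN(eps=0.2, min_samples=1, metric="precomputed")
# With metric="precomputed", `data` must be a square matrix of pairwise distances.
db.fit(data)
cluster_labels = db.labels_  # available only after fit()
num_clusters = len(set(cluster_labels))
clusters = pd.Series([coords[cluster_labels == n] for n in range(num_clusters)])
print('Number of clusters: {}'.format(num_clusters))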
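Since metric="precomputed" expects a distance matrix rather than raw coordinates, a fuller sketch might look like the following. The coordinates, the eps of 50 metres, and min_samples=2 are all assumptions chosen for illustration, and a plain dict is used in place of the pd.Series from the question. Note that eps is in the same units as the distances, and that DBSCAN labels noise points as -1, which is excluded from the cluster count here.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances

# Hypothetical Web Mercator-style coordinates in metres (not the asker's data).
coords = np.array([
    [-232372.2, 6785082.6],
    [-232380.0, 6785090.0],
    [-234045.1, 6784642.3],
    [-234050.0, 6784650.0],
    [-230900.7, 6778369.3],
])

# metric="precomputed" means fit() receives an n x n distance matrix.
dist_matrix = pairwise_distances(coords, metric="euclidean")

# eps is in the units of the distances (metres here); 50 is illustrative.
db = DBSCAN(eps=50.0, min_samples=2, metric="precomputed").fit(dist_matrix)
cluster_labels = db.labels_

# Label -1 marks noise, so leave it out when counting clusters.
num_clusters = len(set(cluster_labels)) - (1 if -1 in cluster_labels else 0)

# Group the original coordinates by cluster label.
clusters = {n: coords[cluster_labels == n] for n in range(num_clusters)}
print('Number of clusters: {}'.format(num_clusters))
```

An alternative, under the same assumptions, is to drop metric="precomputed" and pass coords straight to fit(), letting DBSCAN compute Euclidean distances itself.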