How to choose the value of k in k-medoids using the elbow method?

I am trying this code: https://gist.github.com/jaganadhg/9a25fb531df47beb13e3

import pylab as plt
import numpy as np
from scipy.spatial.distance import cdist, pdist
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()

k = range(1,11)

clusters = [KMeans(n_clusters = c,init = 'k-means++').fit(iris.data) for c in k]
centr_lst = [cc.cluster_centers_ for cc in clusters]

k_distance = [cdist(iris.data, cent, 'euclidean') for cent in centr_lst]
clust_indx = [np.argmin(kd,axis=1) for kd in k_distance]
distances = [np.min(kd,axis=1) for kd in k_distance]
avg_within = [np.sum(dist)/iris.data.shape[0] for dist in distances]

with_in_sum_square = [np.sum(dist ** 2) for dist in distances]
to_sum_square = np.sum(pdist(iris.data) ** 2)/iris.data.shape[0]
bet_sum_square = to_sum_square - with_in_sum_square

kidx = 2

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(k, avg_within, 'g*-')
ax.plot(k[kidx], avg_within[kidx], marker='o', markersize=12,
        markeredgewidth=2, markeredgecolor='r', markerfacecolor='None')
plt.grid(True)
plt.xlabel('Number of clusters')
plt.ylabel('Average within-cluster distance to centroid')
plt.title('Elbow for KMeans clustering (IRIS Data)')

and it works without any problem for k-means. But when I change k-means to k-medoids

clusters = [KMedoids(n_clusters = c).fit(iris.data) for c in k]

(I am using the pyclust k-medoids: https://github.com/mirjalil/pyclust/blob/master/pyclust/_kmedoids.py)

I get an array of None values:

[None, None, None, None, None, None, None]

What is wrong? Can anyone help?

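Because the fit method of the code you downloaded, but did not understand, has no return value. A Python method without a return statement returns None, so the list comprehension collects one None per value of k instead of the fitted models.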

Add a return self (or return self._clusters, or similar) to the fit method, and it will probably behave as expected, at least for this step.
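To illustrate the point, here is a minimal sketch with toy classes. FitReturnsNone and FitReturnsSelf are hypothetical stand-ins invented for this example; they are not pyclust's KMedoids and do no clustering. The sketch only shows how a fit without a return statement produces the list of None values, how returning self avoids it, and how you could keep the fitted objects without editing the library at all.

```python
class FitReturnsNone:
    """Hypothetical stand-in for an estimator whose fit() has no return statement."""

    def __init__(self, n_clusters):
        self.n_clusters = n_clusters
        self.fitted = False

    def fit(self, data):
        # State is updated on the object, but nothing is returned,
        # so the call expression evaluates to None.
        self.fitted = True


class FitReturnsSelf(FitReturnsNone):
    """Same stand-in with fit() patched to return self, as scikit-learn estimators do."""

    def fit(self, data):
        super().fit(data)
        return self


def fit_all(model_cls, data, ks):
    """Workaround that does not rely on fit()'s return value."""
    models = []
    for c in ks:
        model = model_cls(n_clusters=c)
        model.fit(data)        # ignore whatever fit() returns
        models.append(model)   # keep the fitted object itself
    return models


data = [[0.0], [1.0], [5.0]]   # placeholder data, not iris
ks = range(1, 4)

chained_none = [FitReturnsNone(n_clusters=c).fit(data) for c in ks]
chained_self = [FitReturnsSelf(n_clusters=c).fit(data) for c in ks]
kept = fit_all(FitReturnsNone, data, ks)

print(chained_none)  # [None, None, None]
```

With the second or third variant the list holds fitted objects. The rest of the elbow code would still need whichever attribute of the real k-medoids class holds the medoids in place of cluster_centers_, which this sketch does not cover.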