Color of the cluster centers does not match the color of their data points
I have a working example of Mean Shift clustering using Pandas and scikit-learn. I am new to Python, so I think I am missing something basic here. This is my working code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import MeanShift
from matplotlib import style
style.use("ggplot")
filepath = "./Probes1.xlsx"
X = pd.read_excel(filepath, usecols="B:I", header=1)
df=pd.DataFrame(data=X)
np_array = df.values
print(np_array)
ms=MeanShift()
ms.fit(np_array)
labels= ms.labels_
cluster_centers = ms.cluster_centers_
print("cluster centers:")
print(cluster_centers)
labels_unique = np.unique(labels)
n_clusters_=len(labels_unique)
print("number of estimated clusters : %d" % n_clusters_)
#colors = 10*['r.','g.','b.','c.','k.','y.','m.']
for i in range(len(np_array)):
    plt.scatter(np_array[i][0], np_array[i][1], edgecolors='face')
plt.scatter(cluster_centers[:,0], cluster_centers[:,1], c='b',
            marker="x", s=20, linewidths=5, zorder=10)
plt.show()
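This is the plot I get from this code:
However, the colors of the cluster centers do not match their data points. Any help would be appreciated. Currently I have set the color of my centers to blue ('b'). Thank you!
Edit:
I was able to create this!
Edit 2: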
from itertools import cycle
import numpy as np
import pandas as pd
from sklearn.cluster import MeanShift
from sklearn.datasets import make_blobs
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
filepath = "./Probes1.xlsx"
X = pd.read_excel(filepath, usecols="B:I", header=1) #import excel data
df=pd.DataFrame(data=X) #excel to dataframe to use in ML
np_array = df.values #dataframe
print(np_array) #printing dataframe
ms = MeanShift()
ms.fit(X) #Clustering
labels=ms.labels_
cluster_centers = ms.cluster_centers_ #coordinates of cluster centers
print("cluster centers:")
print(cluster_centers)
labels_unique = np.unique(labels)
n_clusters_=len(labels_unique) #no. of clusters
print("number of estimated clusters : %d" % n_clusters_)
# ################################# Plotting
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
colors=cycle('bgrkmycbgrkmycbgrkmycbgrkyc')
for k, col in zip(range(n_clusters_), colors):
    my_members = labels == k
    cluster_center = cluster_centers[k]
    ax.scatter(np_array[my_members, 0], np_array[my_members, 1], np_array[my_members, 2], col + '.')
    ax.scatter(cluster_centers[:,0], cluster_centers[:,1], cluster_centers[:,2], marker='o', s=300, linewidth=1, zorder=0)
    print(col) #prints b g r k in the respective iterations
plt.title('Estimated number of clusters: %d' % n_clusters_)
plt.grid()
plt.show()
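This produces the following plot:
Again the colors do not match. Is there an alternative in scatter to plt.plot's 'markerfacecolor', so that I can match the color of each cluster center to its data points?
Edit 3: Got the desired result: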
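You are using c='b', which sets the color of the cluster centers to blue:
plt.scatter(cluster_centers[:,0], cluster_centers[:,1], c='b', marker='x', s=20, linewidths=5, zorder=10)
To match the colors of the two scatter plots, you have to specify the color explicitly for both.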
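As a rough illustration of specifying the color for both scatter calls, here is a minimal sketch. The helper name plot_clusters, the "tab10" colormap, and the tiny hand-written arrays are assumptions made for this example and are not part of the question; with the question's data you would pass np_array, ms.labels_ and ms.cluster_centers_ instead. It also assumes each label is a valid index into the centers array, which would not hold for points labeled -1.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove to view interactively
import matplotlib.pyplot as plt
import numpy as np


def plot_clusters(points, labels, centers, ax):
    """Draw each cluster and its center in the same color; return the colors used."""
    cmap = plt.get_cmap("tab10")  # assumed palette: 10 distinct colors, reused after that
    used = {}
    for k in np.unique(labels):
        color = cmap(int(k) % 10)  # one RGBA tuple per cluster label
        members = labels == k      # boolean mask selecting this cluster's points
        # Pass the same color to both scatter calls so points and center match.
        ax.scatter(points[members, 0], points[members, 1], color=color, s=15)
        ax.scatter(centers[k, 0], centers[k, 1], color=color,
                   marker="x", s=100, linewidths=3, zorder=10)
        used[int(k)] = color
    return used


# Hypothetical stand-in data; in the question these would be
# np_array, ms.labels_ and ms.cluster_centers_.
points = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
labels = np.array([0, 0, 1, 1])
centers = np.array([[0.1, 0.05], [5.1, 5.0]])

fig, ax = plt.subplots()
cluster_colors = plot_clusters(points, labels, centers, ax)
```

The same idea should carry over to the 3D plot from Edit 2 by adding the third coordinate to each ax.scatter call and keeping color=color in both calls, rather than passing a format string such as col + '.', which scatter does not interpret the way plt.plot does.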