Spectral clustering with a similarity matrix constructed from the Jaccard coefficient
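I have a categorical dataset on which I am performing spectral clustering, but I am not getting very good output. I use the eigenvectors corresponding to the largest eigenvalues as my k-means centroids.
Please find below the procedure I follow: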
1. Create a symmetric similarity matrix (m × m) using the Jaccard coefficient.
For example, for a data set with the two records
a,b,c,d
a,b,x,y
the similarity matrix I compute would look like:
|1 0.33|
|0.33 1 |
2. Compute the first k eigenvectors, corresponding to the largest eigenvalues, where k is the number of clusters.
3. Normalize the symmetric similarity matrix.
4. Perform the clustering on the normalized similarity matrix, using the eigenvectors as initial centroids for k-means.
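To make step 1 concrete, here is a minimal sketch in plain Python of how such a Jaccard similarity matrix can be computed. The function names `jaccard_similarity` and `similarity_matrix` are only illustrative, and treating each record as a set of category labels is an assumption about the data.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections of labels."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty records count as identical
    return len(a & b) / len(union)


def similarity_matrix(records):
    """Symmetric m x m matrix of pairwise Jaccard similarities."""
    m = len(records)
    sim = [[1.0] * m for _ in range(m)]  # diagonal stays at 1
    for i in range(m):
        for j in range(i + 1, m):
            sim[i][j] = sim[j][i] = jaccard_similarity(records[i], records[j])
    return sim


records = [["a", "b", "c", "d"], ["a", "b", "x", "y"]]
S = similarity_matrix(records)
for row in S:
    print(" ".join(f"{v:.2f}" for v in row))
# prints:
# 1.00 0.33
# 0.33 1.00
```

The two records share {a, b} out of six distinct labels, so the off-diagonal entry is 2/6 ≈ 0.33.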
My questions are:
Is computing a Jaccard similarity matrix the right choice for spectral clustering?
Is selecting eigenvectors as cluster centroids the right approach for spectral clustering? I don't see other options for a categorical dataset.
Is there anything wrong with the procedure I follow?
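As far as I can tell, you have mixed and mashed together several approaches. No wonder it doesn't work...
- You could simply use Jaccard distance (one minus the Jaccard similarity) + hierarchical clustering
- You could do MDS to project your data, followed by k-means (probably what you are trying to do)
- Affinity propagation etc. are worth a try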
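As a rough illustration of the first suggestion, here is a small pure-Python sketch of agglomerative clustering on Jaccard distance. Average linkage is an assumption (the suggestion above does not name a linkage), the function names and the toy records are made up for the example, and in practice a library routine would normally replace the naive merge loop.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; two empty records are assumed to be at distance 0."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(records, k):
    """Merge the two closest clusters until k remain; returns lists of record indices."""
    m = len(records)
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and the number of records")
    dist = [[jaccard_distance(records[i], records[j]) for j in range(m)]
            for i in range(m)]
    clusters = [[i] for i in range(m)]  # start with every record on its own

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical toy data: three records sharing a/b, two sharing p/q/r.
records = [
    ["a", "b", "c", "d"],
    ["a", "b", "x", "y"],
    ["a", "b", "c", "y"],
    ["p", "q", "r", "s"],
    ["p", "q", "r", "t"],
]
clusters = average_linkage(records, k=2)
print(sorted(sorted(c) for c in clusters))
# prints: [[0, 1, 2], [3, 4]]
```

No eigenvectors or centroids are involved here: the clustering works directly on the pairwise distances, which is why it suits categorical data.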