Is it possible to find similar orders by multi-dimensional scaling in scikit learn?

I have several files, each containing the 3D positions of 10 points (as plotted in the corresponding pictures). I would like to use multi-dimensional scaling to find similar orderings and to print out the ones that differ. For example, here the orderings from files 1, 2 and 4 are exactly the same, but different from 3.

import numpy as np

from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection

from sklearn import manifold
from sklearn.metrics import euclidean_distances
from sklearn.decomposition import PCA

A1=[[0.000, 0.000, 0.5],
[0.250, 0.000, 0.5],
[0.125, 0.250, 0.5],
[0.375, 0.250, 0.5],
[0.250, 0.500, 0.5],
[0.500, 0.500, 0.5],
[0.125, 0.750, 0.5],
[0.375, 0.750, 0.5],
[0.000, 1.000, 0.5],
[0.250, 1.000, 0.5]]
A2=[[0.500, 0.000, 0.5],
[0.750, 0.000, 0.5],
[0.375, 0.250, 0.5],
[0.625, 0.250, 0.5],
[0.250, 0.500, 0.5],
[0.500, 0.500, 0.5],
[0.375, 0.750, 0.5],
[0.625, 0.750, 0.5],
[0.500, 1.000, 0.5],
[0.750, 1.000, 0.5]]
A3=[[0.500, 0.000, 0.5],
[0.750, 0.000, 0.5],
[0.625, 0.250, 0.5],
[0.875, 0.250, 0.5],
[0.250, 0.500, 0.5],
[0.500, 0.500, 0.5],
[0.375, 0.750, 0.5],
[0.625, 0.750, 0.5],
[0.500, 1.000, 0.5],
[0.750, 1.000, 0.5]]
A4=[[0.250, 0.000, 0.5],
[0.500, 0.000, 0.5],
[0.375, 0.250, 0.5],
[0.625, 0.250, 0.5],
[0.500, 0.500, 0.5],
[0.750, 0.500, 0.5],
[0.375, 0.750, 0.5],
[0.625, 0.750, 0.5],
[0.250, 1.000, 0.5],
[0.500, 1.000, 0.5]]

print(len(A1), len(A2), len(A3), len(A4))
a1=euclidean_distances(A1)
a2=euclidean_distances(A2)
a3=euclidean_distances(A3)
a4=euclidean_distances(A4)
print(a1)

The desired output would be something like:

Number of different orders: 2
A1
A3

Set up the libraries and data:

import numpy as np
import pandas as pd

from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection

from sklearn import manifold
from sklearn.metrics import euclidean_distances
from sklearn.decomposition import PCA

A1=[[0.000, 0.000, 0.5],
[0.250, 0.000, 0.5],
[0.125, 0.250, 0.5],
[0.375, 0.250, 0.5],
[0.250, 0.500, 0.5],
[0.500, 0.500, 0.5],
[0.125, 0.750, 0.5],
[0.375, 0.750, 0.5],
[0.000, 1.000, 0.5],
[0.250, 1.000, 0.5]]
A2=[[0.500, 0.000, 0.5],
[0.750, 0.000, 0.5],
[0.375, 0.250, 0.5],
[0.625, 0.250, 0.5],
[0.250, 0.500, 0.5],
[0.500, 0.500, 0.5],
[0.375, 0.750, 0.5],
[0.625, 0.750, 0.5],
[0.500, 1.000, 0.5],
[0.750, 1.000, 0.5]]
A3=[[0.500, 0.000, 0.5],
[0.750, 0.000, 0.5],
[0.625, 0.250, 0.5],
[0.875, 0.250, 0.5],
[0.250, 0.500, 0.5],
[0.500, 0.500, 0.5],
[0.375, 0.750, 0.5],
[0.625, 0.750, 0.5],
[0.500, 1.000, 0.5],
[0.750, 1.000, 0.5]]
A4=[[0.250, 0.000, 0.5],
[0.500, 0.000, 0.5],
[0.375, 0.250, 0.5],
[0.625, 0.250, 0.5],
[0.500, 0.500, 0.5],
[0.750, 0.500, 0.5],
[0.375, 0.750, 0.5],
[0.625, 0.750, 0.5],
[0.250, 1.000, 0.5],
[0.500, 1.000, 0.5]]

Let's put the data into a convenient structure and compute the distances.

""""""
# Number of different elemnts
segments_dic = {'A1': A1,
    'A2': A2,
    'A3': A3,
    'A4': A4,}

# To clasify the elements
segments_distances = []
for i in segments_dic.keys():
    segments_distances.append(round(euclidean_distances(segments_dic[i]).sum(),3))

Now let's check which groups of points are the closest to each other:

"""Number of different elements / orders
I will round the results to make them comparable"""
different_elements = np.unique(segments_distances)
print("number of different orders: ",np.unique(segments_distances).__len__())
print("different orders: ", different_elements)
np.unique(segments_distances).__len__()

for i in different_elements:
    print("For element distance ",i," corresponding groups are: ")
    for j in segments_dic.keys():
        if i == round(euclidean_distances(segments_dic[j]).sum(),3):
            print(j)

The output is:

number of different orders:  2
different orders:  [46.952 48.496]
For element distance  46.952  corresponding groups are: 
A1
A2
A4
For element distance  48.496  corresponding groups are: 
A3
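
Note that the summed distance is only a coarse fingerprint: two different arrangements could in principle add up to the same total. A stricter, still label-independent check is to compare the sorted pairwise distances of each group. Below is a minimal sketch, assuming the segments_dic and imports from above are available:

"""Stricter check: compare the sorted pairwise distances of each group
instead of only their sum (a sketch, reusing segments_dic from above)."""
import itertools

fingerprints = {}
for name, points in segments_dic.items():
    d = euclidean_distances(points)
    # Keep each pair once (upper triangle) and sort, so the point labelling does not matter
    fingerprints[name] = np.sort(d[np.triu_indices(len(points), k=1)])

for a, b in itertools.combinations(segments_dic.keys(), 2):
    if np.allclose(fingerprints[a], fingerprints[b]):
        print(a, "and", b, "have the same ordering")
    else:
        print(a, "and", b, "differ")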

Let's see whether we can verify this result with plots.

2D plots:

"""Plots"""
for i in segments_dic.keys():
    # Rotate the data
    clf = PCA(n_components=2)
    X = clf.fit_transform(segments_dic[i])
    aux = pd.DataFrame(X)
    fig = plt.figure()
    plt.scatter(aux.iloc[:,0],aux.iloc[:,1])
    plt.title('{}'.format(i))
    fig.savefig('{}_representation.svg'.format(i))

The saved plots confirm the result: A1, A2 and A4 show the same arrangement of points, while A3 differs.
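
Since the question asks specifically about multi-dimensional scaling, manifold.MDS can also be used to embed each precomputed distance matrix in 2D instead of PCA. This is only a sketch under the same setup as above (the *_mds.svg file names are just for illustration); groups with the same ordering should produce the same point cloud up to rotation and reflection:

"""MDS embedding from the precomputed distance matrices (a sketch)."""
for name, points in segments_dic.items():
    D = euclidean_distances(points)
    mds = manifold.MDS(n_components=2, dissimilarity='precomputed', random_state=0)
    emb = mds.fit_transform(D)
    fig = plt.figure()
    plt.scatter(emb[:, 0], emb[:, 1])
    plt.title('{} (MDS)'.format(name))
    fig.savefig('{}_mds.svg'.format(name))  # illustrative output file name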