Calculate distance between 1D array and nD array using Python
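I'm a beginner in Python and hope you can help me solve my problem.
I have two files, library.csv (9 columns) and cases.csv (8 columns), which I read with np.loadtxt. I take every column of the library except the last one and put them in the array base, and I put cases.csv in the array problems. I want to compute the Mahalanobis distance between each row of problems and every row of base, and store the minimum distances in a table.
Here is my code: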
# Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sn
from keras.models import load_model
from scipy.spatial import distance
# [1] Get the library.csv and cases.csv
library = np.loadtxt("library.csv", delimiter=",")
cases = np.loadtxt("cases.csv", delimiter=",")
problems = np.loadtxt("cases.csv", delimiter=",")  # copy of cases
# Select columns from library to use as base cases, except solutions
base = library[:, range(library.shape[1] - 1)] # Exclude last column (solution)
# Move through all problem cases
for i in range(problems.shape[0]):
    # [3.1] Get inverse covariance matrix for the base cases
    covariance_matrix = np.cov(base)  # Covariance
    inverse_covariance_matrix = np.linalg.pinv(covariance_matrix)  # Inverse
    # [3.2] Get case row to evaluate
    case_row = problems[i, :]
    # Empty distances array to store mahalanobis distances obtained comparing each library cases
    distances = np.zeros(base.shape[0])
    # [3.3] For each base cases rows
    for j in range(base.shape[0]):
        # Get base case row
        base_row = base[j, :]
        # [3.4] Calculate mahalanobis distance between case row and base cases, and store it
        distances[j] = distance.mahalanobis(case_row, base_row, inverse_covariance_matrix)
    # [3.5] Returns the index (row) of the minimum value in distances calculated
    min_distance_row = np.argmin(distances)
But I get this error:
Using TensorFlow backend.
Traceback (most recent call last):
  File "C:\Users\HP\Desktop\MyAlgo\mainAlgo.py", line 45, in <module>
    distances[j] = distance.mahalanobis(case_row, base_row, inverse_covariance_matrix)
  File "C:\Users\HP\AppData\Local\Programs\Python\Python38\lib\site-packages\scipy\spatial\distance.py", line 1083, in mahalanobis
    m = np.dot(np.dot(delta, VI), delta)
  File "<__array_function__ internals>", line 5, in dot
ValueError: shapes (8,) and (384,384) not aligned: 8 (dim 0) != 384 (dim 0)
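Your problem seems to be that base_row and case_row have length 8, while covariance_matrix is built for 384 variables. These two numbers should be the same, so the matrix multiplication cannot be performed.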
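To make the mismatch concrete, here is a small sketch. It assumes base has 384 rows and 8 columns (the sizes that appear in the error message) and uses random numbers as a stand-in for the real library.csv data.

```python
import numpy as np

# Synthetic stand-in for the real data: 384 base cases, 8 features each
# (sizes taken from the error message, values are random).
rng = np.random.default_rng(0)
base = rng.normal(size=(384, 8))

# By default (rowvar=True) np.cov treats each ROW as one variable,
# so 384 rows produce a 384 x 384 covariance matrix.
cov_rows_as_variables = np.cov(base)

# With rowvar=False each COLUMN is one variable, which gives the 8 x 8
# matrix that matches the length-8 rows. Same result as np.cov(base.T).
cov_columns_as_variables = np.cov(base, rowvar=False)

print(cov_rows_as_variables.shape)     # (384, 384)
print(cov_columns_as_variables.shape)  # (8, 8)
```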
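I don't know your data or its statistical properties, but my guess is that you need to transpose base before computing the covariance matrix. In the call np.cov(base), each row of base should contain all the observations of a single variable.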
https://numpy.org/devdocs/reference/generated/numpy.cov.html
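A minimal sketch of how the whole computation could look with that change, assuming the columns of base really are the variables. The helper name nearest_base_cases is hypothetical, the random arrays stand in for the two CSV files, and scipy's cdist is swapped in for the explicit double loop from the question. The covariance matrix is also computed once, since it does not depend on the problem row.

```python
import numpy as np
from scipy.spatial import distance


def nearest_base_cases(base, problems):
    """For each row of problems, return the index of the closest row of
    base (Mahalanobis distance) and that minimum distance."""
    # Columns are the variables, so this matrix is (n_features, n_features).
    inverse_covariance = np.linalg.pinv(np.cov(base, rowvar=False))
    # All pairwise distances at once: shape (n_problems, n_base).
    all_distances = distance.cdist(
        problems, base, metric="mahalanobis", VI=inverse_covariance
    )
    nearest_index = np.argmin(all_distances, axis=1)
    nearest_distance = all_distances[np.arange(len(problems)), nearest_index]
    return nearest_index, nearest_distance


# Random stand-ins for library.csv (without its last column) and cases.csv.
rng = np.random.default_rng(0)
base = rng.normal(size=(384, 8))
problems = rng.normal(size=(5, 8))
problems[0] = base[7]  # an exact copy of a base case, as a sanity check

nearest_index, nearest_distance = nearest_base_cases(base, problems)
print(nearest_index)     # one base row index per problem row
print(nearest_distance)  # the matching minimum distances
```

The two returned arrays together form the table of minimum distances the question asks for. If you prefer to keep the loops, changing np.cov(base) to np.cov(base, rowvar=False) in the original code should be enough to get past the shape error.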