Compute L2 distance with numpy using matrix multiplication
I am trying to work through the assignments of the Stanford CS231n 2017 CNN course on my own.
I am trying to compute L2 distances with NumPy using only matrix multiplication and broadcast sums. The L2 distance is:
I figured I could do it using this formula:
The following code shows three ways of computing the L2 distance. If I compare the output of the method compute_distances_two_loops with the output of the method compute_distances_one_loop, the two are equal. But when I compare the output of the method compute_distances_two_loops with the output of the method compute_distances_no_loops, where I implemented the L2 distance using only matrix multiplication and broadcast sums, they are different.
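The formula images from the original post did not carry over. For reference, the standard L2 (Euclidean) distance, and the expansion that the no-loops version presumably relies on, are:

```latex
d(p, q) = \sqrt{\sum_{k} (p_k - q_k)^2},
\qquad
\sum_{k} (p_k - q_k)^2 = \sum_{k} p_k^2 \;-\; 2 \sum_{k} p_k q_k \;+\; \sum_{k} q_k^2
```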
def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
        for j in xrange(num_train):
            #####################################################################
            # TODO:                                                             #
            # Compute the l2 distance between the ith test point and the jth    #
            # training point, and store the result in dists[i, j]. You should   #
            # not use a loop over dimension.                                    #
            #####################################################################
            #dists[i, j] = np.sqrt(np.sum((X[i, :] - self.X_train[j, :]) ** 2))
            dists[i, j] = np.sqrt(np.sum(np.square(X[i, :] - self.X_train[j, :])))
            #####################################################################
            #                        END OF YOUR CODE                           #
            #####################################################################
    return dists
def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
        #######################################################################
        # TODO:                                                               #
        # Compute the l2 distance between the ith test point and all training #
        # points, and store the result in dists[i, :].                        #
        #######################################################################
        dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
    print(dists.shape)
    return dists
def compute_distances_no_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #########################################################################
    # TODO:                                                                 #
    # Compute the l2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    # and two broadcast sums.                                               #
    #########################################################################
    dists = np.sqrt(-2 * np.dot(X, self.X_train.T) +
                    np.sum(np.square(self.X_train), axis=1) +
                    np.sum(np.square(X), axis=1)[:, np.newaxis])
    print(dists.shape)
    #########################################################################
    #                          END OF YOUR CODE                             #
    #########################################################################
    return dists
You can find the full, testable code here.
Do you know what I am doing wrong in compute_distances_no_loops, or anywhere else?
UPDATE:
The code that produces the error message is:
dists_two = classifier.compute_distances_no_loops(X_test)
# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')
The error message:
Difference was: 372100.327569
Uh-oh! The distance matrices are different
I believe you are looking for pairwise distances.
There is an amazing trick to do this in a few lines; you just have to play cleverly with broadcasting:
X_test = np.expand_dims(X, 0) # shape: [1, num_tests, D]
X_train = np.expand_dims(self.X_train, 1) # shape: [num_train, 1, D]
dists = np.square(X_train - X_test) # Thanks to broadcast [num_train, num_tests, D]
dists = np.sqrt(np.sum(dists, axis=-1)) # [num_train, num_tests]
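To sanity-check the broadcasting trick above, here is a minimal, self-contained sketch on small made-up arrays (the shapes and data are for illustration only); it compares the broadcast result against an explicit double loop:

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.standard_normal((5, 3))  # 5 hypothetical training points, D=3
X = rng.standard_normal((4, 3))        # 4 hypothetical test points

# (5, 1, 3) - (1, 4, 3) broadcasts to (5, 4, 3)
diff = np.expand_dims(X_train, 1) - np.expand_dims(X, 0)
dists = np.sqrt(np.sum(np.square(diff), axis=-1))  # shape (5, 4): [num_train, num_test]

# Reference: explicit double loop over both sets
ref = np.array([[np.linalg.norm(t - x) for x in X] for t in X_train])
assert np.allclose(dists, ref)
```

Note that this layout is [num_train, num_test], i.e. the transpose of what compute_distances_two_loops returns, so you would transpose before comparing.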
Here is how you can compute pairwise distances between the rows of X and Y without creating any 3-dimensional matrices:
def dist(X, Y):
    sx = np.sum(X**2, axis=1, keepdims=True)
    sy = np.sum(Y**2, axis=1, keepdims=True)
    return np.sqrt(-2 * X.dot(Y.T) + sx + sy.T)
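One caveat worth hedging: with this identity, rounding can make the argument of sqrt dip slightly below zero for near-identical rows. A small self-contained sketch of the same identity with a clamp added, checked against a direct broadcast computation on made-up data:

```python
import numpy as np

def dist(X, Y):
    sx = np.sum(X**2, axis=1, keepdims=True)  # (n, 1) squared row norms of X
    sy = np.sum(Y**2, axis=1, keepdims=True)  # (m, 1) squared row norms of Y
    # Clamp tiny negative values that rounding can produce before the sqrt.
    return np.sqrt(np.maximum(-2 * X.dot(Y.T) + sx + sy.T, 0.0))

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))
Y = rng.standard_normal((7, 4))
ref = np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1))
assert np.allclose(dist(X, Y), ref)
```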
I think the problem is inconsistent array shapes.
# a^2 matrix (500, 1)
alpha = np.sum(np.square(X), axis=1)
alpha = alpha.reshape(len(alpha), 1)
print(alpha.shape)

# b^2 matrix (1, 5000)
beta = np.sum(np.square(self.X_train.T), axis=0)
beta = beta.reshape(1, len(beta))
print(beta.shape)

# ab matrix (500, 5000)
alphabeta = np.dot(X, self.X_train.T)
print(alphabeta.shape)

dists = np.sqrt(-2 * alphabeta + alpha + beta)
This is a late reply, but I solved it in a different way and want to post it. When I worked through this problem, I was not aware that NumPy subtracts column and row vectors from matrices. As it turns out, we can subtract an nx1 or 1xm vector from an nxm matrix, and when we do, it is subtracted from every row or column vector. If you work with a library that does not support this behavior, you can use mine. For this case I did the math, and the result is the following:
sum_x_train = np.sum(self.X_train**2, axis=1, keepdims=True)
sum_x_test = np.sum(X**2, axis=1, keepdims=True)
sum_2x_tr_te = np.dot(self.X_train, X.T) * 2
sum_x_train = np.dot(sum_x_train, np.ones((1, X.shape[0])))
sum_x_test = np.dot(sum_x_test, np.ones((1, self.X_train.shape[0])))
dists = np.sqrt(sum_x_test.T + sum_x_train - sum_2x_tr_te).T
The downside of this approach is that it uses more memory.
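The idea of emulating broadcasting with a matrix product can be seen in isolation: multiplying an nx1 column by a 1xm row of ones tiles the column into a full nxm matrix. A minimal sketch with toy values (not the assignment data):

```python
import numpy as np

col = np.array([[1.0], [2.0], [3.0]])  # (3, 1) column vector
tiled = np.dot(col, np.ones((1, 4)))   # (3, 4): each entry repeated across its row
# This reproduces exactly what broadcasting would do implicitly.
assert np.allclose(tiled, np.broadcast_to(col, (3, 4)))
```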
Here is my solution for the compute_distances_no_loops() function that the OP asked about. I don't use the sqrt() function, for performance reasons:
def compute_distances_no_loops(self, X):
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #---------------
    # Get squared row norms of X and X_train
    X_sq = np.sum(X**2, axis=1, keepdims=True)
    Xtrain_sq = np.sum(self.X_train**2, axis=1, keepdims=True)
    # Calculate (squared) dists as (X_train - X)**2 = X_train**2 - 2*X_train*X + X**2
    dists = -2 * X.dot(self.X_train.T) + X_sq + Xtrain_sq.T
    #---------------
    return dists
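Dropping sqrt() is safe for nearest-neighbour classification because sqrt is monotonic, so the ranking of distances is unchanged. A small self-contained check of this on made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
X_train = rng.standard_normal((8, 5))  # hypothetical training set
X = rng.standard_normal((3, 5))        # hypothetical test set

X_sq = np.sum(X**2, axis=1, keepdims=True)
Xtrain_sq = np.sum(X_train**2, axis=1, keepdims=True)
sq_dists = -2 * X.dot(X_train.T) + X_sq + Xtrain_sq.T  # squared distances, (3, 8)

true_dists = np.sqrt(((X[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1))
# Same nearest neighbour per test point, with or without the sqrt.
assert np.array_equal(np.argmin(sq_dists, axis=1), np.argmin(true_dists, axis=1))
```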