是否可以在没有循环的情况下找到矩阵中行之间的相似性?
Is it possible to find similarities between rows in a matrix without loop?
我有一个二维 numpy 数组。我正在尝试计算行之间的相似性并将其放入 similarities
数组中。这可能没有循环吗?感谢您的宝贵时间!
# ratings.shape = (943, 1682)
arri = np.zeros(943)
arri = np.where(arri == 0)[0]
arrj = np.zeros(943)
arrj = np.where(arrj ==0)[0]
similarities = np.zeros((ratings.shape[0], ratings.shape[0]))
similarities[arri, arrj] = np.abs(ratings[arri]-ratings[arrj])
我想制作一个二维数组相似度,相似度[i, j] 是评分中第 i 行和第 j 行之间的差异
[ValueError: 形状不匹配:形状 (943,1682) 的值数组无法广播到形状 (943,) 的索引结果]
[1][1]: https://i.stack.imgur.com/gtst9.png
问题是 numpy 在用两个数组索引二维数组时如何遍历数组。
首先进行一些设置:
import numpy;
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
ratings
: [1 2 3 4 5]
indicesX
: [[0][1][2][3][4]]
indicesY
: [[0][1][2][3][4]]
现在让我们看看您的程序产生了什么:
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[0])
similarities
:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
如您所见,numpy 迭代 similarities
基本上如下所示:
for i in range(5):
similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i]-ratings[0])
similarities
:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
现在我们需要像下面这样的索引来遍历整个数组:
indecesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indecesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]
我们执行以下操作:
# Reshape indicesX from (x,1) to (x,). Thats important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
indicesX
: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
indicesY
: [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]
完美!现在再次调用 similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
我们会看到:
similarities
:
[[0. 1. 2. 3. 4.]
[1. 0. 1. 2. 3.]
[2. 1. 0. 1. 2.]
[3. 2. 1. 0. 1.]
[4. 3. 2. 1. 0.]]
这里是整个代码:
import numpy;
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
print(similarities)
PS
您评论了自己的 post 以改进它。当你想改进它时,你应该编辑你的问题而不是评论它。
我有一个二维 numpy 数组。我正在尝试计算行之间的相似性并将其放入 similarities
数组中。这可能没有循环吗?感谢您的宝贵时间!
# ratings.shape = (943, 1682)
arri = np.zeros(943)
arri = np.where(arri == 0)[0]
arrj = np.zeros(943)
arrj = np.where(arrj ==0)[0]
similarities = np.zeros((ratings.shape[0], ratings.shape[0]))
similarities[arri, arrj] = np.abs(ratings[arri]-ratings[arrj])
我想制作一个二维数组相似度,相似度[i, j] 是评分中第 i 行和第 j 行之间的差异
[ValueError: 形状不匹配:形状 (943,1682) 的值数组无法广播到形状 (943,) 的索引结果] [1][1]: https://i.stack.imgur.com/gtst9.png
问题是 numpy 在用两个数组索引二维数组时如何遍历数组。
首先进行一些设置:
import numpy;
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
ratings
: [1 2 3 4 5]
indicesX
: [[0][1][2][3][4]]
indicesY
: [[0][1][2][3][4]]
现在让我们看看您的程序产生了什么:
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[0])
similarities
:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
如您所见,numpy 迭代 similarities
基本上如下所示:
for i in range(5):
similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i]-ratings[0])
similarities
:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
现在我们需要像下面这样的索引来遍历整个数组:
indecesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indecesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]
我们执行以下操作:
# Reshape indicesX from (x,1) to (x,). Thats important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
indicesX
: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
indicesY
: [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]
完美!现在再次调用 similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
我们会看到:
similarities
:
[[0. 1. 2. 3. 4.]
[1. 0. 1. 2. 3.]
[2. 1. 0. 1. 2.]
[3. 2. 1. 0. 1.]
[4. 3. 2. 1. 0.]]
这里是整个代码:
import numpy;
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
print(similarities)
PS
您评论了自己的 post 以改进它。当你想改进它时,你应该编辑你的问题而不是评论它。