Comparing multidimensional lists and returning the intersected index
I tried to write and edit some code based on examples from the internet, as shown below:
from math import sqrt

# Calculate the Euclidean distance between two vectors
# (the last element of each row is the class label, so it is skipped)
def euclidean_distance(row1, row2):
    distance = 0.0
    for i in range(len(row1)-1):
        distance += (row1[i] - row2[i])**2
    return sqrt(distance)

# Locate the closest neighbors
def get_neighbors(train, test_row, num_neighbors):
    distances = list()
    for train_row in train:
        dist = euclidean_distance(test_row, train_row)
        distances.append((train_row, dist))
    distances.sort(key=lambda tup: tup[1])
    neighbors = list()
    for i in range(num_neighbors):
        neighbors.append(distances[i][0])
    return neighbors

# Test distance function
dataset = [[2.7810836,2.550537003,0],
           [1.465489372,2.362125076,0],
           [3.396561688,4.400293529,0],
           [1.38807019,1.850220317,0],
           [3.06407232,3.005305973,0],
           [7.627531214,2.759262235,1],
           [5.332441248,2.088626775,1],
           [6.922596716,1.77106367,1],
           [8.675418651,-0.242068655,1],
           [7.673756466,3.508563011,1]]
neighbors = get_neighbors(dataset, dataset[0], 3)
#set(dataset) & set(neighbors)
#type(neighbors) is int
#set(dataset).intersection(neighbors)
for neighbor in neighbors:
    print(neighbor)
What I want to do is:
- Get the 3 nearest neighbors,
- Compare the nearest neighbors with 'dataset',
- Return the indices of the matching data points.
For example, the code above prints:
[2.7810836, 2.550537003, 0]
[3.06407232, 3.005305973, 0]
[1.465489372, 2.362125076, 0]
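The final result I want is:
result = [1, 5, 2]
Assuming the data indices start at 1 rather than 0, these are the dataset indices of the 3 neighbors nearest to the selected data point (including the point itself).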
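The mistake in your code is that your get_neighbors function returns the rows themselves. To fix this, change the line:
neighbors.append(distances[i][0])
to
neighbors.append(train.index(distances[i][0]) + 1)
This finds the index of the row in the full list of rows and adds 1, since you want the indices to start at 1.
The result is now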
[1, 5, 2]
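As an alternative to looking each row up again with list.index (which returns the first matching row, so duplicate rows would all map to the same index), the position can be carried along with the distance from the start. The following is only a sketch of that idea: the name get_neighbor_indices and the start parameter are made up for illustration and are not part of the original code.

```python
from math import sqrt

def euclidean_distance(row1, row2):
    # The last element of each row is the class label, so it is skipped.
    return sqrt(sum((a - b) ** 2 for a, b in zip(row1[:-1], row2[:-1])))

def get_neighbor_indices(train, test_row, num_neighbors, start=1):
    # Hypothetical helper: pair each row's position (counted from `start`)
    # with its distance to test_row, then sort the pairs by distance.
    distances = [(i, euclidean_distance(test_row, row))
                 for i, row in enumerate(train, start=start)]
    distances.sort(key=lambda pair: pair[1])
    # Keep only the positions of the closest rows.
    return [i for i, _ in distances[:num_neighbors]]

dataset = [[2.7810836, 2.550537003, 0],
           [1.465489372, 2.362125076, 0],
           [3.396561688, 4.400293529, 0],
           [1.38807019, 1.850220317, 0],
           [3.06407232, 3.005305973, 0],
           [7.627531214, 2.759262235, 1],
           [5.332441248, 2.088626775, 1],
           [6.922596716, 1.77106367, 1],
           [8.675418651, -0.242068655, 1],
           [7.673756466, 3.508563011, 1]]

print(get_neighbor_indices(dataset, dataset[0], 3))  # [1, 5, 2]
```

Passing start=0 gives ordinary zero-based indices instead. Because list.sort is stable, rows at equal distance keep their original order.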