Form a list of pairwise metrics (distances) using a matrix lookup from a Python pandas DataFrame
I have a distance matrix stored as a DataFrame:
import pandas as pd

data_map = {
'startNode':["0","0","0","0","0","455","455","455","455","455","10","10","10","10","10","30","30","30","30","30","2","2","2","2","2"],
'EndNode':["0","455","30","10","2","0","455","30","10","2","0","455","30","10","2","0","455","10","2","30","0","455","30","10","2"],
'Dmeters':["0","19481","94","90","10","19481","0","750","75","20","90","75","1013","0","200","94","750","1013","50","0","10","20","50","200","0"]
}
df_map_mat = pd.DataFrame.from_dict(data_map)
Input DataFrame:
df_map_mat
Out[141]:
startNode EndNode Dmeters
0 0 0 0
1 0 455 19481
2 0 30 94
3 0 10 90
4 0 2 10
5 455 0 19481
6 455 455 0
7 455 30 750
8 455 10 75
9 455 2 20
10 10 0 90
11 10 455 75
12 10 30 1013
13 10 10 0
14 10 2 200
15 30 0 94
16 30 455 750
17 30 10 1013
18 30 2 50
19 30 30 0
20 2 0 10
21 2 455 20
22 2 30 50
23 2 10 200
24 2 2 0
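I need to query the df_map_mat DataFrame and populate the list column shown below.
The list column is formed by looking up each Nid value in df_map_mat.
For example, the distance from start node 0 to end node 0 is 0; likewise 10 -> 0 is 90, and 30 -> 455 is 750 meters.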
df_dist_mat = {
'Nid':["0","10","2","30","455"],
'NName':["Q-CH","ANGC","AmOR","ANAGER","RPURAM"],
'D_list':[ "[0,90,10,94,19481]","[90,0,200,1013,75]","[10,200,0,50,20]","[94,1013,50,0,750]","[19481,75,20,750,0]"]
}
df_dist_mat = pd.DataFrame.from_dict(df_dist_mat)
Expected DataFrame:
df_dist_mat
Out[142]:
Nid NName D_list
0 0 Q-CH [0,90,10,94,19481]
1 10 ANGC [90,0,200,1013,75]
2 2 AmOR [10,200,0,50,20]
3 30 ANAGER [94,1013,50,0,750]
4 455 RPURAM [19481,75,20,750,0]
I have hard-coded the node ID column in two NumPy arrays. This may not be an efficient solution, but it does produce the answer.
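import numpy as np

x = np.array([[0],[10],[2], [30],[455]])
y = np.array([[0],[10],[2], [30],[455]])

def calc_dist(x, y):
    # df_map_mat stores every column as strings, so cast to int once
    # before comparing against the integer node ids
    lookup = df_map_mat.astype(int)
    d_list = []
    for i in x:
        start = int(i[0])
        d_inner_list = []
        for j in y:
            end = int(j[0])
            match = lookup[(lookup["startNode"] == start) & (lookup["EndNode"] == end)]
            # each (start, end) pair is expected to match exactly one row
            dist = int(match["Dmeters"].iloc[0])
            d_inner_list.append(dist)
        d_list.append(d_inner_list)
    print(d_list)

calc_dist(x, y)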
Solution:
calc_dist(x,y)
[[0, 90, 10, 94, 19481], [90, 0, 200, 1013, 75], [10, 200, 0, 50, 20], [94, 1013, 50, 0, 750], [19481, 75, 20, 750, 0]]
You can use DataFrame.pivot with DataFrame.reindex:
arr = np.array([0,10,2,30,455])
# cast all three string columns to int; pivot takes keyword arguments in pandas 2.x
df = (df_map_mat.astype(int)
                .pivot(index='startNode', columns='EndNode', values='Dmeters')
                .reindex(index=arr, columns=arr))
print (df)
EndNode        0    10    2    30    455
startNode
0              0    90   10    94  19481
10            90     0  200  1013     75
2             10   200    0    50     20
30            94  1013   50     0    750
455        19481    75   20   750      0
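To spot-check a pivoted table against the examples given in the question (0 -> 0 is 0, 30 -> 455 is 750), single cells can be read with .loc. The snippet below is only a sketch: `sample` and `wide` are illustrative names, and `sample` is a reduced copy of the question's data restricted to nodes 0, 30 and 455.

```python
import pandas as pd

# Reduced copy of the question's data (nodes 0, 30 and 455 only), kept as strings
sample = pd.DataFrame({
    "startNode": ["0", "0", "0", "30", "30", "30", "455", "455", "455"],
    "EndNode":   ["0", "30", "455", "0", "30", "455", "0", "30", "455"],
    "Dmeters":   ["0", "94", "19481", "94", "0", "750", "19481", "750", "0"],
})

# Cast to int, then reshape so rows are start nodes and columns are end nodes
wide = sample.astype(int).pivot(index="startNode", columns="EndNode", values="Dmeters")

# .loc[start, end] reads one distance from the pivoted table
print(wide.loc[0, 0])      # 0
print(wide.loc[30, 455])   # 750
```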
To get a list, use:
out = df.to_numpy().tolist()
print (out)
[[0, 90, 10, 94, 19481], [90, 0, 200, 1013, 75],
[10, 200, 0, 50, 20], [94, 1013, 50, 0, 750],
[19481, 75, 20, 750, 0]]
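If the goal is the expected DataFrame from the question, the nested list could be attached as a column next to the node ids and names. This is a minimal sketch that assumes the names are listed in the same order as arr; `nids` and `names` are illustrative variables copied from the question, and `out` is written out literally so the snippet runs on its own. Unlike the question's mock-up, D_list here holds real Python lists rather than strings.

```python
import pandas as pd

# Node ids and names copied from the question, in the same order as `arr`
nids = [0, 10, 2, 30, 455]
names = ["Q-CH", "ANGC", "AmOR", "ANAGER", "RPURAM"]

# Nested list as produced by df.to_numpy().tolist()
out = [[0, 90, 10, 94, 19481], [90, 0, 200, 1013, 75],
       [10, 200, 0, 50, 20], [94, 1013, 50, 0, 750],
       [19481, 75, 20, 750, 0]]

# Each inner list becomes one cell of the object-dtype D_list column
df_dist_mat = pd.DataFrame({"Nid": nids, "NName": names, "D_list": out})
print(df_dist_mat)
```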