Python DataFrame: calculate distance via an intermediate point

I have a pandas DataFrame that lists distances as follows:

import pandas as pd

# Named `data` rather than `dict` so the built-in is not shadowed.
data = {'from': ['A', 'A', 'A', 'B', 'B', 'D', 'D', 'D'],
        'to': ['B', 'C', 'D', 'C', 'E', 'B', 'C', 'E'],
        'distances': [4, 3, 1, 1, 3, 4, 2, 9]}
df = pd.DataFrame(data)

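I would like to enumerate all of the distances:

From point1 ==> point2

where point1 ==> point2 = (point1 ==> B) + (B ==> point2) and are included in a ...

How can I do this efficiently in Python? I assume some kind of pd.merge?

I would then like to reformat the data frame to the following: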

columns = ['From','To','Distance','Distance via B']

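If you are looking for routes of length 3 (three points, i.e. two legs), here is a solution. Note that in some cases the direct route is shorter than the route through an intermediate point: A to C directly is 3, while A-B-C is 4 + 1 = 5.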

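# Self-join: pair each leg with every leg that starts where it ends.
three_route = pd.merge(df, df, left_on="to", right_on="from")
# Total length of the two-leg route.
three_route["distance"] = three_route.distances_x + three_route.distances_y
# Keep origin, intermediate point, destination and total distance.
three_route = three_route[["from_x", "to_x", "to_y", "distance"]].rename(
    columns={"from_x": "from", "to_x": "via", "to_y": "to"})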

The result is:

  from via to  distance
0    A   B  C         5
1    A   B  E         7
2    D   B  C         5
3    D   B  E         7
4    A   D  B         5
5    A   D  C         3
6    A   D  E        10
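
To get from the table above to the layout the question asks for (From, To, Distance, Distance via B), one option is to keep only the routes whose intermediate point is B and join them back onto the direct distances. The following is a minimal sketch under a few assumptions that are not in the original posts: the names `legs`, `via_b`, `direct` and `result` are made up for illustration, the `_1`/`_2` suffixes are just a readability choice, and a left join is used so that only pairs with a direct distance in the original data are kept.

```python
import pandas as pd

data = {'from': ['A', 'A', 'A', 'B', 'B', 'D', 'D', 'D'],
        'to': ['B', 'C', 'D', 'C', 'E', 'B', 'C', 'E'],
        'distances': [4, 3, 1, 1, 3, 4, 2, 9]}
df = pd.DataFrame(data)

# Self-join: the first leg's destination is the second leg's origin.
# All three column names overlap, so each gets a _1 / _2 suffix.
legs = pd.merge(df, df, left_on="to", right_on="from", suffixes=("_1", "_2"))

# Keep only the two-leg routes that pass through B and add up both legs.
via_b = legs[legs["to_1"] == "B"].copy()
via_b["Distance via B"] = via_b["distances_1"] + via_b["distances_2"]
via_b = via_b.rename(columns={"from_1": "From", "to_2": "To"})
via_b = via_b[["From", "To", "Distance via B"]]

# Direct distances, renamed to the requested column names.
direct = df.rename(columns={"from": "From", "to": "To", "distances": "Distance"})

# Left join (assumption): keep every direct pair; pairs with no route
# through B get NaN in "Distance via B".
result = direct.merge(via_b, on=["From", "To"], how="left")
print(result)
```

With this data, A to C has a direct distance of 3 and a distance via B of 5, while D to E is shorter via B (4 + 3 = 7) than directly (9). A to E has no direct entry in the original data, so the left join leaves it out; passing `how="outer"` instead would keep it with a missing Distance.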