How to calculate the sum of Euclidean distances from one data point to all other data points in a pandas DataFrame?
I have the following pandas DataFrame:
import pandas as pd
import math
df = pd.DataFrame()
df['x'] = [2, 1, 3]
df['y'] = [2, 5, 6]
df['weight'] = [11, 12, 13]
print(df)
   x  y  weight
0  2  2      11
1  1  5      12
2  3  6      13
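Suppose these 3 nodes are called {a, b, c}. I want to calculate the total Euclidean distance from each node to all other nodes, multiplied by that node's weight, as follows: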
Sum = 11(d(a,b)+d(a,c)) + 12(d(b,a)+d(b,c)) + 13(d(c,a)+d(c,b))
In [72]: from scipy.spatial.distance import cdist
In [73]: a = df[['x','y']].values
In [74]: w = df.weight.values
In [100]: cdist(a,a).sum(1) * w
Out[100]: array([ 80.13921614, 64.78014765, 82.66925684])
We could also use a combination of pdist and squareform from the same SciPy module (scipy.spatial.distance) in place of cdist there.
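A minimal sketch of that pdist/squareform variant, assuming the same df as in the question (the names dist_matrix and weighted are introduced here for illustration only). pdist computes each pairwise distance once in condensed form, and squareform expands it into the full symmetric matrix that cdist(a, a) would return:

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Same data as in the question.
df = pd.DataFrame({'x': [2, 1, 3], 'y': [2, 5, 6], 'weight': [11, 12, 13]})
a = df[['x', 'y']].to_numpy()
w = df['weight'].to_numpy()

# pdist returns the condensed (upper-triangle) pairwise distances;
# squareform turns them into an n x n matrix with zeros on the diagonal.
dist_matrix = squareform(pdist(a))

# Row sums give each node's total distance to all others; scale by its weight.
weighted = dist_matrix.sum(axis=1) * w
print(weighted)
```

This should print the same three values as the cdist version. Note that the full n x n matrix is still materialized, so memory use grows quadratically with the number of rows either way.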
Verifying against the actual values:
In [76]: from scipy.spatial.distance import euclidean
In [77]: euclidean([2,2],[1,5])*11 + euclidean([2,2],[3,6])*11
Out[77]: 80.139216143646451
In [78]: euclidean([1,5],[2,2])*12 + euclidean([1,5],[3,6])*12
Out[78]: 64.78014765201803
In [80]: euclidean([3,6],[2,2])*13 + euclidean([3,6],[1,5])*13
Out[80]: 82.669256840526856
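The three values verified above are the per-node terms of the formula in the question; adding them together gives the single Sum. A short sketch, again assuming the question's df (the names per_node and total are hypothetical, chosen for this example):

```python
import pandas as pd
from scipy.spatial.distance import cdist

# Same data as in the question.
df = pd.DataFrame({'x': [2, 1, 3], 'y': [2, 5, 6], 'weight': [11, 12, 13]})
a = df[['x', 'y']].to_numpy()
w = df['weight'].to_numpy()

# Weighted total distance for each node (one term of the Sum per row).
per_node = cdist(a, a).sum(axis=1) * w

# Adding the per-node terms gives the overall Sum from the question.
total = per_node.sum()
print(total)  # approximately 227.5886
```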