How to measure pairwise distances between two sets of points?
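I have two datasets (CSV files). Each contains the latitude and longitude of a set of points (220 in one, 4400 in the other). I want to measure the pairwise distances, in miles, between the two sets (220 x 4400). How can I do this in Python? It is similar to this problem: https://gist.github.com/rochacbruno/2883505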
The best option is sklearn, which does exactly what you want.
Suppose we have some sample data:
import numpy as np
import pandas as pd

towns = pd.DataFrame({
    "name": ["Merry Hill", "Spring Valley", "Nesconset"],
    "lat": [36.01, 41.32, 40.84],
    "long": [-76.7, -89.20, -73.15]
})
museum = pd.DataFrame({
    "name": [
        "Motte Historical Car Museum, Menifee",
        "Crocker Art Museum, Sacramento",
        "World Chess Hall Of Fame, St.Louis",
        "National Atomic Testing Museum, Las",
        "National Air and Space Museum, Washington",
        "The Metropolitan Museum of Art",
        "Museum of the American Military Family & Learning Center"
    ],
    "lat": [33.743511, 38.576942, 38.644302, 36.114269, 38.887806, 40.778965, 35.083359],
    "long": [-117.165161, -121.504997, -90.261154, -115.148315, -77.019844, -73.962311, -106.381531]
})
You can use sklearn's DistanceMetric, which implements the haversine distance:
# scikit-learn >= 1.0; on older versions import from sklearn.neighbors instead
from sklearn.metrics import DistanceMetric
dist = DistanceMetric.get_metric('haversine')
After extracting the NumPy array values with
places_gps = towns[["lat", "long"]].values
museum_gps = museum[["lat", "long"]].values
you simply do
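EARTH_RADIUS = 6371.009  # mean Earth radius in km
# haversine expects [lat, long] pairs in radians and returns angles in radians
haversine_distances = dist.pairwise(np.radians(places_gps), np.radians(museum_gps))
haversine_distances *= EARTH_RADIUS  # radians -> km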
to get the distances in km. If you need miles, multiply by a constant (1 km ≈ 0.621371 miles).
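As a cross-check on the km-to-miles step, here is a minimal sketch that computes the same haversine distances with plain NumPy broadcasting and returns miles directly. It does not use sklearn; the function name `haversine_miles` and the constant `KM_PER_MILE` are illustrative assumptions, not part of the answer above.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.009   # mean Earth radius in km, same value as above
KM_PER_MILE = 1.609344       # definition of the international mile


def haversine_miles(a_deg, b_deg):
    """Pairwise great-circle distances in miles (hypothetical helper).

    a_deg: (n, 2) array-like of [lat, long] in degrees
    b_deg: (m, 2) array-like of [lat, long] in degrees
    returns: (n, m) array of distances in miles
    """
    a = np.radians(np.asarray(a_deg, dtype=float))
    b = np.radians(np.asarray(b_deg, dtype=float))
    lat1, lon1 = a[:, 0][:, None], a[:, 1][:, None]   # column vectors, shape (n, 1)
    lat2, lon2 = b[:, 0][None, :], b[:, 1][None, :]   # row vectors, shape (1, m)
    # Haversine formula; broadcasting yields an (n, m) matrix
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    central_angle = 2 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))
    return central_angle * EARTH_RADIUS_KM / KM_PER_MILE


towns_gps = np.array([[36.01, -76.7], [41.32, -89.20], [40.84, -73.15]])
museum_gps = np.array([[38.887806, -77.019844], [40.778965, -73.962311]])
miles = haversine_miles(towns_gps, museum_gps)   # shape (3, 2)
```

Dividing by `KM_PER_MILE` is equivalent to multiplying the km matrix by roughly 0.621371.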
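If you are only interested in the nearest few points, or in all points within a given radius, look at sklearn's BallTree algorithm, which also implements haversine. It is much faster for those queries.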
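A minimal sketch of the BallTree approach, reusing the sample coordinates from above. The choices `k=2` and a 500 km radius are arbitrary values picked for illustration. Because the haversine metric works in radians, a radius in km has to be divided by the Earth radius before querying.

```python
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_KM = 6371.009  # mean Earth radius in km

# [lat, long] in degrees, same sample points as above
towns_gps = np.array([[36.01, -76.7], [41.32, -89.20], [40.84, -73.15]])
museum_gps = np.array([
    [33.743511, -117.165161],
    [38.576942, -121.504997],
    [38.644302, -90.261154],
    [36.114269, -115.148315],
    [38.887806, -77.019844],
    [40.778965, -73.962311],
    [35.083359, -106.381531],
])

# Build the tree on the larger set (museums); haversine needs radians
tree = BallTree(np.radians(museum_gps), metric="haversine")

# The 2 nearest museums for each town: distances come back in radians
dist_rad, idx = tree.query(np.radians(towns_gps), k=2)
nearest_km = dist_rad * EARTH_RADIUS_KM

# All museums within 500 km of each town (radius converted to radians)
radius_km = 500.0
within = tree.query_radius(np.radians(towns_gps), r=radius_km / EARTH_RADIUS_KM)
```

`idx` holds row positions in `museum_gps`, so `museum.name.iloc[idx[:, 0]]` would give the name of the closest museum for each town.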
Edit: to convert the output to a DataFrame, use for example
pd_distances = pd.DataFrame(haversine_distances, columns=museum.name, index=towns.name)
pd_distances
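Putting the steps together for the CSV case in the question, a rough end-to-end sketch could look like this. It uses `sklearn.metrics.pairwise.haversine_distances` rather than `DistanceMetric`, and the in-memory CSV text and the column names `name`, `lat` and `long` are assumptions standing in for the real files.

```python
import io

import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import haversine_distances

EARTH_RADIUS_KM = 6371.009  # mean Earth radius in km
KM_PER_MILE = 1.609344

# Stand-ins for the two CSV files; with real data, pass the file paths
# to pd.read_csv instead. The column names are assumptions.
towns_csv = io.StringIO(
    "name,lat,long\n"
    "Merry Hill,36.01,-76.7\n"
    "Nesconset,40.84,-73.15\n"
)
museum_csv = io.StringIO(
    "name,lat,long\n"
    "National Air and Space Museum,38.887806,-77.019844\n"
    "The Metropolitan Museum of Art,40.778965,-73.962311\n"
)
towns = pd.read_csv(towns_csv)
museum = pd.read_csv(museum_csv)

# haversine_distances expects [lat, long] in radians and returns radians
radians_matrix = haversine_distances(
    np.radians(towns[["lat", "long"]].to_numpy()),
    np.radians(museum[["lat", "long"]].to_numpy()),
)

# Rows are towns, columns are museums, values are miles
miles = pd.DataFrame(
    radians_matrix * EARTH_RADIUS_KM / KM_PER_MILE,
    index=towns["name"],
    columns=museum["name"],
)
```

With the 220 and 4400 point files from the question, the same steps would yield a 220 x 4400 DataFrame.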