How to identify coordinates that are within a specific distance of each other
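I'm trying to determine which coordinates fall within a specific distance of each other. Currently, my code lumps all of the points together, when they should form two separate groups.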
from sklearn.neighbors import DistanceMetric
from math import radians
import pandas as pd
import numpy as np
from collections import Counter
data = {'Lat': [38.42447, 38.424474, 38.424493, 38.424394, 38.424457, 38.424434],
        'Long': [-77.402199, -77.402228, -77.402186, -77.398625, -77.398602, -77.398459],
        'Name': ['Truck', 'Truck1', 'Truck2', 'Truck3', 'Truck4', 'Truck5']}
df = pd.DataFrame(data)
df['Lat'] = np.radians(df['Lat'])
df['Long'] = np.radians(df['Long'])
dist = DistanceMetric.get_metric('haversine')
df[['Lat','Long']].to_numpy()
dist.pairwise(df[['Lat','Long']].to_numpy())*6371000
final_df = pd.DataFrame(dist.pairwise(df[['Lat','Long']].to_numpy())*6371000, columns=df.Name.unique(), index=df.Name.unique())
potential_grouping = []
for row, col in final_df.items():
    for item in col:
        if int(item) < 15:
            potential_grouping.append(row)
outside_features = [k for k, v in Counter(potential_grouping).items() if v == 1]
acceptable_features = [k for k, v in Counter(potential_grouping).items() if v > 1]
print(acceptable_features)
current output: ['Truck', 'Truck1', 'Truck2', 'Truck3', 'Truck4', 'Truck5']
desired output: [['Truck', 'Truck1', 'Truck2'],['Truck3', 'Truck4', 'Truck5']]
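Here's a crude picture of what's happening...
The 6 small circles are currently being grouped together (large red circle), but they should be split into two groups (the 2 green circles). This happens because each coordinate (small brown circle) is less than 15 meters from another one. How can I make sure I get the output I want?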
Here is one way to do it using DBSCAN:
from sklearn.cluster import DBSCAN
# here Lat and Long are already in radians
X = df[['Lat', 'Long']].to_numpy()
# eps is your max distance (15 m) divided by the earth radius in meters,
# using the same 6371000 m radius as the question's distance matrix
clustering = DBSCAN(eps=15/6371000, min_samples=1, metric='haversine').fit(X)
# see groups
print(clustering.labels_)
# [0 0 0 1 1 1]
# collect the names per cluster label to get the desired nested list
acceptable_features = df['Name'].groupby(clustering.labels_).agg(list).tolist()
print(acceptable_features)
# [['Truck', 'Truck1', 'Truck2'], ['Truck3', 'Truck4', 'Truck5']]
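If it helps to see what that grouping does under the hood, here is a rough sketch using only the standard library. With `min_samples=1`, every point counts as a core point, so the clusters amount to the connected components of the "within 15 m" graph. The helper names `haversine_m` and `group_within` are made up for this illustration, and the input is in degrees here rather than radians. It is a quadratic pairwise loop, so treat it as an explanation for small inputs, not a replacement for DBSCAN.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius, same value the question uses


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def group_within(points, max_dist_m):
    """Group (name, lat, lon) tuples whose members are linked by hops <= max_dist_m."""
    parent = list(range(len(points)))  # union-find forest, one node per point

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points that are close enough.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            _, lat1, lon1 = points[i]
            _, lat2, lon2 = points[j]
            if haversine_m(lat1, lon1, lat2, lon2) <= max_dist_m:
                parent[find(j)] = find(i)

    # Collect names by root, in order of first appearance.
    groups = {}
    for i, (name, _, _) in enumerate(points):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())


points = [
    ('Truck', 38.42447, -77.402199),
    ('Truck1', 38.424474, -77.402228),
    ('Truck2', 38.424493, -77.402186),
    ('Truck3', 38.424394, -77.398625),
    ('Truck4', 38.424457, -77.398602),
    ('Truck5', 38.424434, -77.398459),
]

groups = group_within(points, 15)
print(groups)
# [['Truck', 'Truck1', 'Truck2'], ['Truck3', 'Truck4', 'Truck5']]
```

Note that the grouping is transitive: two points land in the same group whenever a chain of hops of at most 15 m connects them, even if they are a bit farther apart directly. In this data, Truck3 and Truck5 are slightly more than 15 m apart by this formula but are linked through Truck4. If you need every pair in a group to be within 15 m, you would want a different criterion (for example complete-linkage clustering).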