Changing the geodesic datatype to an integer

Question

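With the code below I want to create a distance matrix, and it works! I used the geopy package and its geodesic distance method to calculate the distances between coordinates stored in a pandas DataFrame.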

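import numpy as np
import pandas as pd
from geopy.distance import geodesic

def get_distance(col):
    # Coordinate belonging to the row currently being processed
    end = RD1.loc[col.name, 'Eindlocatie_Coord']
    # Distance from every coordinate to `end`; each element is a geopy distance object
    return RD1['Eindlocatie_Coord'].apply(geodesic, args=(end,), ellipsoid='WGS-84')

def get_totaldistance(matrix):
    # Zero-filled square frame, used only to drive apply() over RD1's index
    square = pd.DataFrame(np.zeros(len(RD1)**2).reshape(len(RD1), len(RD1)), index=RD1.index, columns=RD1.index)
    distances = square.apply(get_distance, axis=1).T
    # Sum the distances between consecutive rows (first superdiagonal)
    totaldist = np.diag(distances,k=1).sum()
    return totaldist

distances = get_totaldistance(RD1)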

However, these distances are of the geodesic datatype, and I would like them to be floats, because that would make my further calculations easier.

I know that print(geodesic(newport_ri, cleveland_oh).miles) (an example from the geopy documentation) returns a float, but I am not sure how to apply that to an entire pandas DataFrame column.

So, how can I change my code so that floats are returned?

Answer 1

You can use map():

Apply a function to a DataFrame column

df['distance'] = df['distance'].map(lambda x: geodesic(x,other_distance).miles)

Adapt it to your own code.

Answer 2

I added an extra helper function inside my function to convert the output, and it does exactly what I wanted. This is the solution:

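import numpy as np
import pandas as pd
from geopy.distance import geodesic

def get_distance(col):
    end = RD1.loc[col.name, 'Eindlocatie_Coord']
    return RD1['Eindlocatie_Coord'].apply(geodesic, args=(end,), ellipsoid='WGS-84')

def get_totaldistance(matrix):
    square = pd.DataFrame(np.zeros(len(RD1)**2).reshape(len(RD1), len(RD1)), index=RD1.index, columns=RD1.index)
    distances = square.apply(get_distance, axis=1).T
    
    def units(input_instance):
        # Convert a geopy distance object to a float in kilometres
        return input_instance.km
    
    # Element-wise conversion; applymap is deprecated since pandas 2.1,
    # where DataFrame.map is the replacement
    distances_km = distances.applymap(units)
    
    # Sum the distances between consecutive rows (first superdiagonal)
    totaldist = np.diag(distances_km,k=1).sum()
    return totaldist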

The helper function units(input_instance) is what solved my problem.

python

geopy

pandas