Plotting a scatter plot in python3 where x axis is latitude/longitude in km and y axis is depth

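I am trying to find the best way to plot some data. Basically, I have a data file with the columns latitude, longitude, depth, sample_ID and Group_ID. I would like to generate a 2-D scatter plot where y is depth and x is the distance in km from north to south (or the transect distance calculated relative to the first station sampled in a specified direction), similar to the one below the ODV-style map: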

Update

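I wanted to add some more information to my original question. After more searching and testing, I found a possible solution in R: use the geosphere package and its distGeo function to convert my coordinates to distances in km, which can then be plotted. (https://www.rdocumentation.org/packages/geosphere/versions/1.5-10/topics/distGeo)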

If anyone knows of a Python approach, that would be great!

Update

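ODV does not allow the customization I need. I want to generate a plot like this in which I can specify a metadata variable to colour the points by, more specifically the Group_ID column in my data file, as seen in the file example below.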

Latitude    Longitude   Depth_m Sample_ID   Group_ID
49.7225 -42.4467    10  S1  1
49.7225 -42.4467    50  S2  1
49.7225 -42.4467    75  S3  1
49.7225 -42.4467    101 S4  1
49.7225 -42.4467    152 S5  1
49.7225 -42.4467    199 S6  1
46.312  -39.658 10  S7  2
46.312  -39.658 49  S8  2
46.312  -39.658 73  S9  2
46.312  -39.658 100 S10 2
46.312  -39.658 153 S11 2
46.312  -39.658 198 S12 2

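I have been trying to figure this out, but it is giving me a lot of trouble. I have calculated the distances between the coordinates using the haversine formula, but once there I am not sure how to incorporate those distances into a scatter plot. This is what I have so far: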

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
#import haversine as hs
from math import radians
from sklearn.neighbors import DistanceMetric
df=pd.read_csv("locations.csv",sep="\t",index_col="Sample_ID")
#plt.scatter(df['Latitude'], df['Depth_m'])
#plt.show()
df['Latitude'] = np.radians(df['Latitude'])
df['Longitude'] = np.radians(df['Longitude'])
dist = DistanceMetric.get_metric('haversine')
x = dist.pairwise(np.unique(df[['Latitude','Longitude']].to_numpy(),axis=0))*6373
print(x)

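This code gives me a distance matrix for the coordinates, but honestly I can't figure out how to take it and pull it into a scatter plot whose x axis runs from north to south, especially since multiple depths sharing the same coordinates have to be accounted for. Any help with the plotting is much appreciated!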

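I am not sure what you want to do with the distances, but conceptually you need to put your x output into the dataframe as a new column, as I have done. In terms of using different colours for the groups, I would use seaborn for this, since its plotting functions take a hue parameter. Have a look at the output of the first scatter plot below, and at what I attempted with the second: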

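import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.neighbors import DistanceMetric
import seaborn as sns

# Load the tab-separated sample file from the question
df = pd.read_csv("locations.csv", sep="\t")

fig, ax = plt.subplots(nrows=2)
# First plot: depth against latitude in degrees, coloured by group
sns.scatterplot(data=df, x='Latitude', y='Depth_m', hue='Group_ID', ax=ax[0])
# The haversine metric expects [latitude, longitude] in radians
df['Latitude'] = np.radians(df['Latitude'])
df['Longitude'] = np.radians(df['Longitude'])
dist = DistanceMetric.get_metric('haversine')
# Row 0 of the pairwise matrix holds the distance (km) from the first sample to every sample
df['Distance'] = (dist.pairwise(df[['Latitude','Longitude']].to_numpy())*6373)[0]
# Second plot: depth against distance from the first sample, coloured by group
sns.scatterplot(data=df, x='Distance' , y='Depth_m', hue='Group_ID', ax=ax[1])
plt.show()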
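
The `[0]` in the snippet above keeps only the first row of the pairwise matrix, that is, the distance from the first sample to every sample. The same idea can be sketched without scikit-learn. This is a minimal illustration, assuming a spherical Earth with a mean radius of 6371 km; the helper name `haversine_km` is made up for this example and does not come from any library:

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# (latitude, longitude, depth_m) rows taken from the question's sample table
samples = [
    (49.7225, -42.4467, 10),
    (49.7225, -42.4467, 50),
    (46.312, -39.658, 10),
    (46.312, -39.658, 49),
]

# Distance from the first sample to every sample, one value per row
ref_lat, ref_lon = samples[0][0], samples[0][1]
distances_km = [haversine_km(ref_lat, ref_lon, lat, lon) for lat, lon, _ in samples]
```

Samples taken at the same station get the same distance, so they stack vertically in the plot and differ only in depth.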

For the distance calculation you can use the geopy package, specifically geopy.distance.geodesic(), which computes the distance along an arc on an assumed ellipsoid (e.g. WGS84).

To generate a plot like the one you describe, you can use the scatter plot functionality of the matplotlib library, specifically matplotlib.pyplot.scatter().

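The code example below walks you through the distance calculation (the distance from some reference lat/long to another lat/long; this is not necessarily the N-S component, but it is easy to calculate) and shows two ways of generating a scatter plot whose points are coloured by the Group_ID field.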

import matplotlib.pyplot as plt
import geopy.distance  # import the submodule explicitly so geopy.distance.geodesic resolves
import pandas as pd

# Load your sample data to a Pandas DataFrame where each column corresponds to
# 'Latitude', 'Longitude', 'Depth_m', 'Sample_ID', 'Group_ID'
datafile = r'<path to a file containing your data>'
df = pd.read_csv(datafile)

# Defining one end of our arc to calculate distance along (arbitrarily taking 
# the first point in the example data as the reference point).
ref_point = (df['Latitude'].iloc[0], df['Longitude'].iloc[0])

# Loop over each sample location, calculating the distance along the arc using
# the geopy.distance.geodesic function.
dist = []
for i in range(len(df)):
    cur_point = (df['Latitude'].iloc[i], df['Longitude'].iloc[i])
    cur_geodesic = geopy.distance.geodesic(ref_point, cur_point)
    cur_dist = cur_geodesic.km
    dist.append(cur_dist)

# Add computed distances to the df DataFrame as column 'Distance_km'
df['Distance_km'] = dist

# Create a matplotlib figure and add two axes for plotting
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)

# Example 1: creating a scatter plot using the calculated distance field and
# colouring the points using a numeric field (i.e. Group_ID in this case is numeric)
pts = ax1.scatter(df['Distance_km'], df['Depth_m'], s=30, c=df['Group_ID'], cmap=plt.cm.jet)
plt.colorbar(pts, ax=ax1)

ax1.set_xlabel('Arc Distance from Reference Point (km)')
ax1.set_ylabel('Depth (m)')
ax1.set_title('Colouring Points by a Numeric Field')
ax1.invert_yaxis()
ax1.grid(True)

# Example 2: creating basically the same scatter plot as above but handling the
# case of non-numeric values in the field to be used for colour (e.g. imagine
# wanting to use the Sample_ID field instead)
groups = list(set(df['Group_ID'])) # get a list of the unique Group_ID values
for gid in groups:
    df_tmp = df[df['Group_ID'] == gid]
    ax2.scatter(df_tmp['Distance_km'], df_tmp['Depth_m'], s=30, label=gid)
    
ax2.legend(loc='upper center', title='Legend')
ax2.set_xlabel('Arc Distance from Reference Point (km)')
ax2.set_ylabel('Depth (m)')
ax2.set_title('Colouring Points Using Categorical Values')
ax2.invert_yaxis()
ax2.grid(True)

fig.tight_layout()
plt.show()

And the output plot...
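
If only the north-south component is wanted instead of the arc distance used above, one rough option is to convert the latitude difference to kilometres directly. The following is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km (about 111.2 km per degree of latitude); the function name `north_south_km` is hypothetical and chosen only for this illustration:

```python
from math import radians

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)

def north_south_km(ref_lat, lat):
    """Distance in km that `lat` lies south of `ref_lat` (negative if north)."""
    return EARTH_RADIUS_KM * radians(ref_lat - lat)

# (latitude, depth_m, group_id) rows taken from the question's sample table
samples = [
    (49.7225, 10, 1),
    (49.7225, 199, 1),
    (46.312, 10, 2),
    (46.312, 198, 2),
]

# Use the northernmost station as the origin so x grows from north to south
ref_lat = max(lat for lat, _, _ in samples)

# One (x_km, depth_m, group_id) tuple per sample; samples from the same
# station share an x value and differ only in depth
points = [(north_south_km(ref_lat, lat), depth, gid) for lat, depth, gid in samples]
```

The x values in `points` could then be passed to the scatter call in place of the `Distance_km` column. On an ellipsoid the result would differ slightly, so treat the numbers as an approximation.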