Python - 坐标密度均值
Python - coordinates density mean
我有几分 (X,Y)
id,X,Y
1,11.60388710,48.10862135
2,11.60420372,48.10865659
3,11.60424066,48.10864778
4,11.60425121,48.10867597
5,11.60471031,48.10872354
6,11.60428551,48.10917455
7,11.60563380,48.10921331
8,11.60422219,48.10867773
9,11.60434356,48.10870064
10,11.60460214,48.10843284
并希望找到一个受彼此靠近的点影响很大的中心点(不是质心)。
例如,我可以在 QGIS 中创建一个热图,并且有这样的东西:
也许有人知道如何写一个Python脚本来计算这个X,Y"density"中心?
感谢您的帮助!
您可以尝试使用 scipy
的核密度估计。例如(受 code here 启发):
from scipy.stats import gaussian_kde
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline
names = ["id","X","Y"]
data = [[1,11.60388710, 48.10862135],
[2,11.60420372, 48.10865659],
[3,11.60424066, 48.10864778],
[4,11.60425121, 48.10867597],
[5,11.60471031, 48.10872354],
[6,11.60428551, 48.10917455],
[7,11.60563380, 48.10921331],
[8,11.60422219, 48.10867773],
[9,11.60434356, 48.10870064],
[10,11.60460214, 48.10843284]]
df = pd.DataFrame(data, columns=names).set_index("id")
min_x = df["X"].min()
max_x = df["X"].max()
min_y = df["Y"].min()
max_y = df["Y"].max()
kernel = gaussian_kde(df.values.T) # get gaussian kernel density estimate
# set up grid
grid_xs, grid_ys = np.mgrid[min_x:max_x:50j, min_y:max_y:50j]
positions = np.vstack([grid_xs.ravel(),
grid_ys.ravel()]) # 2-dim array with X, Y-coordinates from xs, ys
Z = np.reshape(kernel(positions).T, grid_xs.shape) # get densities
fig, ax = plt.subplots(1)
ax.imshow(np.rot90(Z), cmap=plt.cm.jet, extent=[min_x, max_x, min_y, max_y])
ax.set_xlim(min_x, max_x)
ax.set_ylim(min_y, max_y)
ax.scatter(df["X"], df["Y"], color="white", marker="x")
当然,color maps 和格式设置选项在 matplotlib
中可用,可以根据需要调整输出。
获取中心的位置(定义中心为估计密度最高的网格位置):
idx = np.unravel_index(np.argmax(Z), Z.shape)
center = grid_xs[idx], grid_ys[idx] # (11.604243569387755, 48.108671759387754)
我有几分 (X,Y)
id,X,Y
1,11.60388710,48.10862135
2,11.60420372,48.10865659
3,11.60424066,48.10864778
4,11.60425121,48.10867597
5,11.60471031,48.10872354
6,11.60428551,48.10917455
7,11.60563380,48.10921331
8,11.60422219,48.10867773
9,11.60434356,48.10870064
10,11.60460214,48.10843284
并希望找到一个受彼此靠近的点影响很大的中心点(不是质心)。
例如,我可以在 QGIS 中创建一个热图,并且有这样的东西:
也许有人知道如何写一个Python脚本来计算这个X,Y"density"中心?
感谢您的帮助!
您可以尝试使用 scipy
的核密度估计。例如(受 code here 启发):
from scipy.stats import gaussian_kde
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline
names = ["id","X","Y"]
data = [[1,11.60388710, 48.10862135],
[2,11.60420372, 48.10865659],
[3,11.60424066, 48.10864778],
[4,11.60425121, 48.10867597],
[5,11.60471031, 48.10872354],
[6,11.60428551, 48.10917455],
[7,11.60563380, 48.10921331],
[8,11.60422219, 48.10867773],
[9,11.60434356, 48.10870064],
[10,11.60460214, 48.10843284]]
df = pd.DataFrame(data, columns=names).set_index("id")
min_x = df["X"].min()
max_x = df["X"].max()
min_y = df["Y"].min()
max_y = df["Y"].max()
kernel = gaussian_kde(df.values.T) # get gaussian kernel density estimate
# set up grid
grid_xs, grid_ys = np.mgrid[min_x:max_x:50j, min_y:max_y:50j]
positions = np.vstack([grid_xs.ravel(),
grid_ys.ravel()]) # 2-dim array with X, Y-coordinates from xs, ys
Z = np.reshape(kernel(positions).T, grid_xs.shape) # get densities
fig, ax = plt.subplots(1)
ax.imshow(np.rot90(Z), cmap=plt.cm.jet, extent=[min_x, max_x, min_y, max_y])
ax.set_xlim(min_x, max_x)
ax.set_ylim(min_y, max_y)
ax.scatter(df["X"], df["Y"], color="white", marker="x")
当然,color maps 和格式设置选项在 matplotlib
中可用,可以根据需要调整输出。
获取中心的位置(定义中心为估计密度最高的网格位置):
idx = np.unravel_index(np.argmax(Z), Z.shape)
center = grid_xs[idx], grid_ys[idx] # (11.604243569387755, 48.108671759387754)