Reproducing a 2D histogram in Python

Question

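I am working with a very large dataset in Python, so I am trying to use histograms instead of arrays (the arrays get far too large for saving/loading/mapping). I am crawling through a bunch of files and extracting information from them, and I would then like to take that information and remake the histograms afterwards. I can do this with a 1D histogram as follows: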

counter, bins = np.histogram(nSigmaProtonHisto, bins=1000, range=(-10000, 10000))
nSigmaProtonPico[0] += counter
nSigmaProtonPico[1] = bins[:-1]

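nSigmaProtonPico is a 2D array that stores the bin edges and the final counts of the histogram. nSigmaProtonHisto is a 1D array for a particular event, and I loop over millions of events. Once the script has run through all the events, I have a 2D array holding the histogram values and their positions. I can simply plot it, like this: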

plt.plot(nSigmaProtonPico[1], nSigmaProtonPico[0])

When I try to do the same for a 2D histogram, it falls apart. What am I missing? Here is what I have:

counter, bins1, bins2 = np.histogram2d(dEdX, pG, bins=1000, range=((0, 20), (-5, 5)))
dEdXpQPRIME[0] += counter[0]
dEdXpQPRIME[1] += counter[1]
dEdXpQPRIME[2] = bins1[:-1]
dEdXpQPRIME[3] = bins2[:-1]

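This gets me something, but I cannot figure out how to plot it so that it reproduces the histogram from all the data. I thought it would be as simple as x, y and z coordinates, but there are 4 coordinates instead of 3.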

What am I missing?

Answer 1

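counter is a 2D array, so counter[0] and counter[1] are only its first two rows. If you use the same binning in every call to histogram2d, you will get arrays of the same size, so you can simply add the full counter arrays together. Consider: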

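import numpy as np
import matplotlib.pyplot as plt

# Two sample datasets: one centred at (0, 0), one at (3, 3)
x1, y1 = np.random.normal(loc=0, scale=1, size=(2, 10000))
x2, y2 = np.random.normal(loc=3, scale=1, size=(2, 10000))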

x_bins = np.linspace(-5,5,100)
y_bins = np.linspace(-5,5,100)

H1, xedges, yedges = np.histogram2d(x1, y1, bins=(x_bins, y_bins))
H2, xedges, yedges = np.histogram2d(x2, y2, bins=(x_bins, y_bins))

H1 and H2 both have shape (99, 99), since 100 edges in each dimension define 99 bins.

X, Y = np.meshgrid(xedges, yedges)
H = H1+H2

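fig, axs = plt.subplots(1, 3, figsize=(9, 3))
# histogram2d puts x along the first axis, while pcolormesh expects
# rows to follow y, so the counts are transposed before plotting
axs[0].pcolormesh(X, Y, H1.T)
axs[1].pcolormesh(X, Y, H2.T)
axs[2].pcolormesh(X, Y, H.T)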
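
Applied to the loop over events described in the question, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the helper name accumulate_hist2d, the reduced bin count, and the randomly generated events are all hypothetical stand-ins for the real per-event dEdX and momentum arrays. The ranges are the ones used in the question.

```python
import numpy as np

# Fixed binning, reused for every event so the count arrays always align.
# The ranges mirror the question; the bin count is reduced to keep the sketch light.
N_BINS = 200
RANGE = ((0.0, 20.0), (-5.0, 5.0))


def accumulate_hist2d(events, n_bins=N_BINS, value_range=RANGE):
    """Sum per-event 2D histograms into one running (n_bins, n_bins) array."""
    total = np.zeros((n_bins, n_bins))
    xedges = yedges = None
    for x, y in events:
        counts, xedges, yedges = np.histogram2d(
            x, y, bins=n_bins, range=value_range
        )
        total += counts  # add the whole 2D array, not individual rows
    return total, xedges, yedges


rng = np.random.default_rng(0)
# Hypothetical stand-in data for the per-event arrays.
events = [
    (rng.uniform(0, 20, 500), rng.normal(0, 1.5, 500)) for _ in range(20)
]

total, xedges, yedges = accumulate_hist2d(events)

# The running sum matches one histogram computed over all the data at once.
all_x = np.concatenate([x for x, _ in events])
all_y = np.concatenate([y for _, y in events])
direct, _, _ = np.histogram2d(all_x, all_y, bins=N_BINS, range=RANGE)
assert np.array_equal(total, direct)
```

Only total, xedges and yedges need to be kept (for example with np.savez), which is three arrays rather than four rows. To redraw the histogram later, pass the edges and total.T to pcolormesh.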
