Filling under a histogram up to an exact point with fill_between in Python

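I am trying to use the fill_between function in Python to fill under a histogram up to the 10th percentile and from the 90th percentile of the raw data. The problem is that the histogram curve is not a continuous function but a series of discrete values spaced one bin width apart, so I cannot fill exactly up to the 10th or 90th percentile. I have tried several times without success. The code below is what I tried: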

S1 = [0.34804491  0.18036933  0.41111951  0.31947523 .........

0.46212255  0.39229157  0.28937502  0.22095423  0.52415083]
N, bins = np.histogram(S1, bins=np.linspace(0.1,0.7,20), normed=False)
bincenters   = 0.5*(bins[1:]+bins[:-1])    
ax.fill_between(bincenters,N,0,where=bincenters<=np.percentile(S1,10),interpolate=True,facecolor='r', alpha=0.5)
ax.fill_between(bincenters,N,0,where=bincenters>=np.percentile(S1,90),interpolate=True, facecolor='r', alpha=0.5,label = "Summer 10 P")

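It seems to fill only up to the bin center just before or just after the given percentile, not up to the percentile itself.

Any ideas or help would be appreciated. Isaac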

Try changing the last two lines to:

ax.fill_between(bincenters, 0, N, interpolate=True,
                where=((bincenters>=np.percentile(bincenters, 10)) &
                       (bincenters<=np.percentile(bincenters, 90))))

I believe you want to call np.percentile on bincenters, since that is your effective x-axis.

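The other difference is that you want to fill the region between the 10th and 90th percentiles, which requires combining the two conditions with & in the where argument.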
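One thing worth checking before choosing between the two calls: np.percentile(bincenters, 10) and np.percentile(S1, 10) generally return different numbers. The first is a position 10% of the way along the bin-center axis, the second is the value below which 10% of the samples fall. A quick check, using a made-up normal sample in place of S1 (the sample and its parameters are assumptions, not the OP's data):

```python
import numpy as np

# Made-up sample standing in for S1 (an assumption, not the OP's data)
rng = np.random.default_rng(0)
sample = rng.normal(0.4, 0.1, 1000)

# Same binning as in the question
bins = np.linspace(0.1, 0.7, 20)
bincenters = 0.5 * (bins[1:] + bins[:-1])

# Position 10% of the way along the evenly spaced bin-center axis
axis_p10 = np.percentile(bincenters, 10)
# Value below which 10% of the samples fall
data_p10 = np.percentile(sample, 10)

print(f"percentile of bincenters: {axis_p10:.3f}")
print(f"percentile of the data:   {data_p10:.3f}")
```

Which of the two is wanted depends on whether the cut-off should follow the axis or the distribution of the data.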

Edit based on the OP's comment:

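I think that to get what you want you will have to do a little interpolation yourself. See the example below, which uses a random normal distribution and interp1d from scipy.interpolate to interpolate over bincenters.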
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

# create normally distributed random data
n = 10000
data = np.random.normal(0, 1, n)
bins = np.linspace(-data.max(), data.max(), 20)
hist = np.histogram(data, bins=bins)[0]
bincenters = 0.5 * (bins[1:] + bins[:-1])

# create interpolation function and dense x-axis to interpolate over
f = interp1d(bincenters, hist, kind='cubic')
x = np.linspace(bincenters.min(), bincenters.max(), n)

plt.plot(bincenters, hist, '-o')
# calculate greatest bincenter < 10th percentile
bincenter_under10thPerc = bincenters[bincenters < np.percentile(bincenters, 10)].max()
bincenter_10thPerc = np.percentile(bincenters, 10)

bincenter_90thPerc = np.percentile(bincenters, 90)
# calculate smallest bincenter > 90th percentile
bincenter_above90thPerc = bincenters[bincenters > np.percentile(bincenters, 90)].min()

# fill between 10th percentile region using dense x-axis array, x
plt.fill_between(x, 0, f(x), interpolate=True,
                 where=((x>=bincenter_under10thPerc) &
                        (x<=bincenter_10thPerc)))

# fill between 90th percentile region using dense x-axis array, x
plt.fill_between(x, 0, f(x), interpolate=True,
                 where=((x>=bincenter_90thPerc) &
                        (x<=bincenter_above90thPerc)))

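For the plot I generated, note that I changed the percentiles from 10/90% to 30/70% so that the filled regions show up better. Again, I hope this is what you were trying to do.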
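If straight lines between bin centers are good enough, a SciPy-free variant is to insert the cut-off point itself into the x array and let np.interp supply the matching height. The sketch below is only one way to do that; tail_polygon is a hypothetical helper name, and it assumes the bin centers are sorted in increasing order.

```python
import numpy as np

def tail_polygon(centers, counts, lo, hi):
    """Return x, y of the line through (centers, counts) restricted to [lo, hi].

    The two end points are linearly interpolated, so the region stops
    exactly at lo and hi instead of at the nearest bin center.
    Hypothetical helper; assumes centers is sorted in increasing order.
    """
    centers = np.asarray(centers, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Never extend beyond the first or last bin center
    lo = max(lo, centers[0])
    hi = min(hi, centers[-1])
    if hi <= lo:
        return np.array([]), np.array([])
    # Keep the bin centers strictly inside the interval, then add both ends
    inside = (centers > lo) & (centers < hi)
    x = np.concatenate(([lo], centers[inside], [hi]))
    y = np.interp(x, centers, counts)
    return x, y

# Small check with hand-made numbers: the cut at 1.5 lies between two centers
x, y = tail_polygon([0, 1, 2, 3], [0, 10, 20, 10], 0, 1.5)
print(x, y)
```

With the question's variables this could be used as x, y = tail_polygon(bincenters, N, bincenters[0], np.percentile(S1, 10)) followed by ax.fill_between(x, y, 0, facecolor='r', alpha=0.5), and likewise from np.percentile(S1, 90) to bincenters[-1] for the upper tail.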

I have a version of this that uses axvspan to draw a rectangle and then uses the histogram itself as the clip_path:

import numpy as np
import matplotlib.pyplot as plt

def hist(sample, low=None, high=None):
    # draw the histogram
    options = dict(alpha=0.5, color='C0')
    xs, ys, patches = plt.hist(sample,
                               density=True,
                               histtype='step', 
                               linewidth=3,
                               **options)

    # fill in the histogram, if desired
    if low is not None:
        x1 = low
        if high is not None:
            x2 = high
        else:
            x2 = np.max(sample)

        fill = plt.axvspan(x1, x2, 
                           clip_path=patches[0],
                           **options)

Would something like that work for you?
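For a bar-style histogram, a similar exact cut-off can probably also be had without clipping, by trimming the bin edges to the interval before drawing. This is a rough sketch rather than a tested recipe; clipped_bars is a hypothetical helper name, and counts and edges are assumed to be the two arrays returned by np.histogram.

```python
import numpy as np

def clipped_bars(counts, edges, low, high):
    """Left edges, widths and heights of the bar pieces lying inside [low, high].

    Hypothetical helper: counts and edges are assumed to come from
    np.histogram, so len(edges) == len(counts) + 1.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)
    # Trim every bin to the interval; bins fully outside collapse to zero width
    left = np.clip(edges[:-1], low, high)
    right = np.clip(edges[1:], low, high)
    keep = right > left
    return left[keep], (right - left)[keep], counts[keep]

# Small check with hand-made numbers: only half of the first two bins remains
left, widths, heights = clipped_bars([1, 2, 3], [0, 1, 2, 3], 0.5, 1.5)
print(left, widths, heights)
```

The result can be drawn with ax.bar(left, heights, width=widths, align='edge', alpha=0.5), with low and high set to, for example, the sample minimum and np.percentile(sample, 10).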