Creating a Probability/Frequency Axis Grid (Irregularly Spaced) with Matplotlib

Question

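I'm trying to create a frequency curve plot, but I'm having trouble manipulating the axes to get the plot I want.

Here is an example of the grid/plot I'm trying to create:

Here is what I have managed to create with matplotlib:

To create the grid in this plot, I used the following code: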

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter

fig, axes = plt.subplots()

# Minor and major tick positions for the x-axis (exceedance probability, %)
m1 = np.arange(.2, 1, .1)
m2 = np.arange(1, 2, .2)
m3 = np.arange(2, 10, 2)
m4 = np.arange(2, 20, 1)
m5 = np.arange(20, 80, 2)
m6 = np.arange(80, 98, 1)
xTick_minor = np.concatenate((m1,m2,m3,m4, m5, m6))
xTick_major = np.array([.2,.5,1,2,5,10,20,30,40,50,60,70,80,90,95,98])

# Minor and major tick positions for the y-axis (discharge)
m1 = range(0, 250, 50)
m2 = range(250, 500, 10)
m3 = range(500, 1000, 20)
m4 = range(1000, 5000, 100)
m5 = range(5000, 10000, 200)
m6 = range(10000, 50000, 1000)

yTick_minor = np.concatenate((m1,m2,m3,m4,m5,m6))
yTick_major = np.array([250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 6000, 7000, 8000, 9000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000, 50000])

axes.invert_xaxis()
axes.set_ylabel('Discharge in CFS')
axes.set_xlabel('Exceedance Probability')
axes.xaxis.set_major_formatter(FormatStrFormatter('%3.1f'))
axes.set_xticks(xTick_major)
axes.set_xticks(xTick_minor, minor=True)
axes.grid(which='major', alpha=0.7)
axes.grid(which='minor', alpha=0.4)

axes.set_yticks(yTick_major)
axes.set_yticks(yTick_minor, minor=True)

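The grid is correct, but what I want now is for the low-probability range to be displayed with wider spacing, and likewise for the low discharge values (y-axis). Essentially, I want to control the spacing between the ticks, not the tick interval itself, so that the range from .2 to .5 is displayed with a width similar to the range from 40 to 50 on the x-axis, as shown in the desired grid.

Can this be done in matplotlib? I have read through the documentation for tick_params and the locators, but none of these seem to address this kind of axis formatting.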

Answer 1

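You can define a custom scale for the x-axis and use it in place of 'log'. Unfortunately, it is complicated: you need to work out a function that transforms the numbers you give for the x-axis into something more linear. See http://matplotlib.org/examples/api/custom_scale_example.html.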
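
As a rough illustration of what such a function could look like, here is a minimal sketch that assumes piecewise-linear interpolation between the major tick values from the question is good enough. The names forward, inverse and positions are made up for this example and are not part of matplotlib.

```python
import numpy as np

# Major tick values from the question (exceedance probability, in percent).
points = np.array([.2, .5, 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 98])
# Target positions: one unit of axis length per major tick interval.
positions = np.arange(len(points), dtype=float)

def forward(values):
    """Map data values to axis positions (piecewise linear between ticks)."""
    return np.interp(values, points, positions)

def inverse(pos):
    """Map axis positions back to data values."""
    return np.interp(pos, positions, points)

print(forward(0.5) - forward(0.2))  # width of the 0.2 to 0.5 interval
print(forward(50) - forward(40))    # width of the 40 to 50 interval
```

Both intervals come out one unit wide, which is the "equal spacing between ticks" the question asks for. In Matplotlib 3.1 or later, a pair like this could probably be passed to ax.set_xscale('function', functions=(forward, inverse)); note that np.interp clamps anything outside 0.2 to 98, so the axis limits would have to stay inside that range.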

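Edited to add:

This problem was so interesting that I decided to see whether I could build the custom axis myself. I changed the code from the link to work with your example. I'd be curious to know whether it works the way you want.

Edit: New and improved (?) code! The spacing is not as even as before, but it is now computed automatically when you pass a list of points to plt.gca().set_xscale (see the end of the code for an example). It uses curve fitting to fit those points to a logistic function and uses the resulting parameters as the basis for the transform. When I run this code I get a warning (Warning: converting a masked element to nan). I still haven't figured out what is going on there, but it doesn't seem to cause any problems. Here is the figure I generated: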

import numpy as np
from numpy import ma
from matplotlib import scale as mscale
from matplotlib import transforms as mtransforms
from matplotlib.ticker import Formatter, FixedLocator
from scipy.optimize import curve_fit

def logistic(x, L, k, x0):
    """Logistic function (s-curve)."""
    return L / (1 + np.exp(-k * (x - x0)))

class ProbabilityScale(mscale.ScaleBase):
    """
    Scales data so that points along a logistic curve become evenly spaced.
    """

    # The scale class must have a member ``name`` that defines the
    # string used to select the scale.  For example,
    # ``gca().set_yscale("probability")`` would be used to select this
    # scale.
    name = 'probability'


    def __init__(self, axis, **kwargs):
        """
        Any keyword arguments passed to ``set_xscale`` and
        ``set_yscale`` will be passed along to the scale's
        constructor.

        lower_bound: Minimum value of x. Defaults to .01.
        upper_bound_dist: L - upper_bound_dist is the maximum value
        of x. Defaults to lower_bound.

        """
        mscale.ScaleBase.__init__(self, axis)
        lower_bound = kwargs.pop("lower_bound", .01)
        if lower_bound <= 0:
            raise ValueError("lower_bound must be greater than 0")
        self.lower_bound = lower_bound
        upper_bound_dist = kwargs.pop("upper_bound_dist", lower_bound)
        self.points = kwargs['points']
        #determine parameters of logistic function with curve fitting
        x = np.linspace(0, 1, len(self.points))
        #initial guess for parameters
        p0 = [max(self.points), 1, .5]
        popt, pcov = curve_fit(logistic, x, self.points, p0 = p0)
        [self.L, self.k, self.x0] = popt
        self.upper_bound = self.L - upper_bound_dist

    def get_transform(self):
        """
        Override this method to return a new instance that does the
        actual transformation of the data.

        The ProbabilityTransform class is defined below as a
        nested class of this one.
        """
        return self.ProbabilityTransform(self.lower_bound, self.upper_bound, self.L, self.k, self.x0)

    def set_default_locators_and_formatters(self, axis):
        """
        Override to set up the locators and formatters to use with the
        scale.  This is only required if the scale requires custom
        locators and formatters.  Writing custom locators and
        formatters is rather outside the scope of this example, but
        there are many helpful examples in ``ticker.py``.
        """


        axis.set_major_locator(FixedLocator(self.points))

    def limit_range_for_scale(self, vmin, vmax, minpos):
        """
        Override to limit the bounds of the axis to the domain of the
        transform.  In this case, the bounds should be
        limited to the threshold that was passed in.  Unlike the
        autoscaling provided by the tick locators, this range limiting
        will always be adhered to, whether the axis range is set
        manually, determined automatically or changed through panning
        and zooming.
        """
        return max(vmin, self.lower_bound), min(vmax, self.upper_bound)

    class ProbabilityTransform(mtransforms.Transform):
        # There are two value members that must be defined.
        # ``input_dims`` and ``output_dims`` specify number of input
        # dimensions and output dimensions to the transformation.
        # These are used by the transformation framework to do some
        # error checking and prevent incompatible transformations from
        # being connected together.  When defining transforms for a
        # scale, which are, by definition, separable and have only one
        # dimension, these members should always be set to 1.
        input_dims = 1
        output_dims = 1
        is_separable = True

        def __init__(self, lower_bound, upper_bound, L, k, x0):
            mtransforms.Transform.__init__(self)
            self.lower_bound = lower_bound
            self.L = L
            self.k = k
            self.x0 = x0
            self.upper_bound = upper_bound
        def transform_non_affine(self, a):
            """
            This transform takes an Nx1 ``numpy`` array and returns a
            transformed copy.  Since the range of the scale
            is limited by the user-specified threshold, the input
            array must be masked to contain only valid values.
            ``matplotlib`` will handle masked arrays and remove the
            out-of-range data from the plot.  Importantly, the
            ``transform`` method *must* return an array that is the
            same shape as the input array, since these values need to
            remain synchronized with values in the other dimension.
            """
            masked = ma.masked_where((a < self.lower_bound) | (a > self.upper_bound), a)
            return ma.log((self.L - masked) / masked) / -self.k + self.x0

        def inverted(self):
            """
            Override this method so matplotlib knows how to get the
            inverse transform for this transform.
            """
            return ProbabilityScale.InvertedProbabilityTransform(self.lower_bound, self.upper_bound, self.L, self.k, self.x0)

    class InvertedProbabilityTransform(mtransforms.Transform):
        input_dims = 1
        output_dims = 1
        is_separable = True

        def __init__(self, lower_bound, upper_bound, L, k, x0):
            mtransforms.Transform.__init__(self)
            self.lower_bound = lower_bound
            self.L = L
            self.k = k
            self.x0 = x0
            self.upper_bound = upper_bound

        def transform_non_affine(self, a):
            return self.L / (1 + np.exp(-self.k * (a - self.x0)))
        def inverted(self):
            return ProbabilityScale.ProbabilityTransform(self.lower_bound, self.upper_bound, self.L, self.k, self.x0)

# Now that the Scale class has been defined, it must be registered so
# that ``matplotlib`` can find it.
mscale.register_scale(ProbabilityScale)


if __name__ == '__main__':
    import matplotlib.pyplot as plt
    x = np.linspace(.1, 100, 1000)
    points = np.array([.2,.5,1,2,5,10,20,30,40,50,60,70,80,90,95,98])

    plt.plot(x, x)
    plt.gca().set_xscale('probability', points = points, lower_bound = .01)
    plt.grid(True)

    plt.show()
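
The curve-fitting step inside ProbabilityScale.__init__ can be tried on its own, outside matplotlib. The sketch below fits the same logistic function to the tick values and then applies the inverse logistic, which is what the forward transform does. The starting guess p0 and the helper name inverse_logistic are assumptions chosen for this example, not values taken from the code above.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, L, k, x0):
    """Logistic function (s-curve)."""
    return L / (1 + np.exp(-k * (x - x0)))

def inverse_logistic(y, L, k, x0):
    """Inverse of logistic(); only defined for 0 < y < L."""
    return x0 - np.log((L - y) / y) / k

points = np.array([.2, .5, 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 98])
# Pretend the tick values sit at evenly spaced positions between 0 and 1.
x = np.linspace(0, 1, len(points))

p0 = [100.0, 7.0, 0.6]  # assumed starting guess for L, k, x0
(L, k, x0), _ = curve_fit(logistic, x, points, p0=p0)

# Only values strictly between 0 and L can be inverted.
valid = points[(points > 0) & (points < L)]
axis_pos = inverse_logistic(valid, L, k, x0)
print(np.round(axis_pos, 2))  # where each tick lands on the transformed axis
```

The printed positions show how close the fit gets to even spacing; any tick value at or above the fitted L has no position at all, which is why the scale keeps an upper bound below L.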

Answer 2

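I finally came up with the right solution, thanks to @Amy Teegarden for pointing me in the right direction. I thought I would share the final solution here for others to reference! Here is the final result:

The following is the actual probability axis scale, which uses the normal CDF and its inverse, the PPF, both parameterized by mu and sigma.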

import numpy as np
from matplotlib import scale as mscale
from matplotlib import transforms as mtransforms
from matplotlib.ticker import FormatStrFormatter, FixedLocator
from scipy.stats import norm

class ProbScale(mscale.ScaleBase):
    """
    Scales data in the range 0 to 100 (percent) using the CDF of a normal
    distribution. This scale attempts to replicate "probability paper" scaling.

    The scale function:
        100 * norm.cdf(x, mu, sigma)

    The inverse scale function:
        norm.ppf(x / 100, mu, sigma)

    Since probabilities at 0 and 100 are not represented,
    there is a user-defined upper and lower limit, above and below which nothing
    will be plotted.  This defaults to .2 and 98 for lower and upper, respectively.

    """

    # The scale class must have a member ``name`` that defines the
    # string used to select the scale.  For example,
    # ``gca().set_xscale("prob_scale")`` would be used to select this
    # scale.
    name = 'prob_scale'


    def __init__(self, axis, **kwargs):
        """
        Any keyword arguments passed to ``set_xscale`` and
        ``set_yscale`` will be passed along to the scale's
        constructor.

        upper: The probability above which to crop the data.
        lower: The probability below which to crop the data.
        """
        mscale.ScaleBase.__init__(self, axis)
        upper = kwargs.pop("upper", 98) #Default to an upper bound of 98%
        if upper <= 0 or upper >= 100:
            raise ValueError("upper must be between 0 and 100.")
        lower = kwargs.pop("lower", 0.2) #Default to a lower bound of .2%
        if lower <= 0 or lower >= 100:
            raise ValueError("lower must be between 0 and 100.")
        if lower >= upper:
            raise ValueError("lower must be strictly less than upper!.")
        self.lower = lower
        self.upper = upper

        #This scale is best described by the CDF of the normal distribution
        #This distribution is parameterized by mu and sigma; these default values
        #are provided to work generally well, but can be adjusted by the user if desired
        mu = kwargs.pop("mu", 15)
        sigma = kwargs.pop("sigma", 40)
        self.mu = mu
        self.sigma = sigma
        #Need to enforce the upper and lower limits on the axes initially
        axis.axes.set_xlim(lower,upper)

    def get_transform(self):
        """
        Override this method to return a new instance that does the
        actual transformation of the data.

        The ProbTransform class is defined below as a
        nested class of this one.
        """
        return self.ProbTransform(self.lower, self.upper, self.mu, self.sigma)

    def set_default_locators_and_formatters(self, axis):
        """
        Override to set up the locators and formatters to use with the
        scale.  This is only required if the scale requires custom
        locators and formatters.  For writing custom locators and
        formatters, there are many helpful examples in ``ticker.py``.

        In this case, prob_scale uses fixed locators for the major and
        minor ticks and sets no custom formatter.

        This builds both the major and minor locators, and cuts off any values
        above or below the user-defined thresholds: upper, lower.
        """
        #major_ticks = np.asarray([.2,.5,1,2,5,10,20,30,40,50,60,70,80,90,95,98])
        major_ticks = np.asarray([.2,1,2,5,10,20,30,40,50,60,70,80,90,98]) #removed a couple ticks to make it look nicer
        major_ticks = major_ticks[np.where( (major_ticks >= self.lower) & (major_ticks <= self.upper) )]

        minor_ticks = np.concatenate( [np.arange(.2, 1, .1), np.arange(1, 2, .2), np.arange(2,20,1), np.arange(20, 80, 2), np.arange(80, 98, 1)] )
        minor_ticks = minor_ticks[np.where( (minor_ticks >= self.lower) & (minor_ticks <= self.upper) )]
        axis.set_major_locator(FixedLocator(major_ticks))
        axis.set_minor_locator(FixedLocator(minor_ticks))


    def limit_range_for_scale(self, vmin, vmax, minpos):
        """
        Override to limit the bounds of the axis to the domain of the
        transform.  In the case of Probability, the bounds should be
        limited to the user bounds that were passed in.  Unlike the
        autoscaling provided by the tick locators, this range limiting
        will always be adhered to, whether the axis range is set
        manually, determined automatically or changed through panning
        and zooming.
        """
        return max(self.lower, vmin), min(self.upper, vmax)

    class ProbTransform(mtransforms.Transform):
        # There are two value members that must be defined.
        # ``input_dims`` and ``output_dims`` specify number of input
        # dimensions and output dimensions to the transformation.
        # These are used by the transformation framework to do some
        # error checking and prevent incompatible transformations from
        # being connected together.  When defining transforms for a
        # scale, which are, by definition, separable and have only one
        # dimension, these members should always be set to 1.
        input_dims = 1
        output_dims = 1
        is_separable = True

        def __init__(self, lower, upper, mu, sigma):
            mtransforms.Transform.__init__(self)
            self.lower = lower
            self.upper = upper
            self.mu = mu
            self.sigma = sigma

        def transform_non_affine(self, a):
            """
            This transform takes an Nx1 ``numpy`` array and returns a
            transformed copy.  Since the range of the Probability scale
            is limited by the user-specified threshold, the input
            array must be masked to contain only valid values.
            ``matplotlib`` will handle masked arrays and remove the
            out-of-range data from the plot.  Importantly, the
            ``transform`` method *must* return an array that is the
            same shape as the input array, since these values need to
            remain synchronized with values in the other dimension.
            """

            #Mask values that fall outside the user-defined [lower, upper] range
            masked = np.ma.masked_where( (a < self.lower) | (a > self.upper) , a)
            #Get the CDF of the normal distribution located at mu and scaled by sigma
            #Multiply these by 100 to put it into a percent scale
            cdf = norm.cdf(masked, self.mu, self.sigma)*100
            return cdf

        def inverted(self):
            """
            Override this method so matplotlib knows how to get the
            inverse transform for this transform.
            """
            return ProbScale.InvertedProbTransform(self.lower, self.upper, self.mu, self.sigma)

    class InvertedProbTransform(mtransforms.Transform):
        input_dims = 1
        output_dims = 1
        is_separable = True

        def __init__(self, lower, upper, mu, sigma):
            mtransforms.Transform.__init__(self)
            self.lower = lower
            self.upper = upper
            self.mu = mu
            self.sigma = sigma

        def transform_non_affine(self, a):
            #Need to get the PPF value for a, which is in a percent scale [0,100], so move back to probability range [0,1]
            inverse = norm.ppf(a/100, self.mu, self.sigma)
            return inverse

        def inverted(self):
            return ProbScale.ProbTransform(self.lower, self.upper, self.mu, self.sigma)

# Now that the Scale class has been defined, it must be registered so
# that ``matplotlib`` can find it.
mscale.register_scale(ProbScale)
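
The transform pair used by ProbScale can also be checked in isolation, which is handy when tuning mu and sigma. This is only a sketch: the helper names to_axis and from_axis are made up for the example, and the second sigma value is an arbitrary choice for comparison.

```python
import numpy as np
from scipy.stats import norm

MU, SIGMA = 15, 40  # the default mu and sigma from ProbScale

def to_axis(p, mu=MU, sigma=SIGMA):
    """Forward transform: percent value -> axis position (0 to 100)."""
    return norm.cdf(p, mu, sigma) * 100

def from_axis(pos, mu=MU, sigma=SIGMA):
    """Inverse transform: axis position -> percent value."""
    return norm.ppf(pos / 100, mu, sigma)

ticks = np.array([.2, 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 98])

# The two functions should undo each other on the tick values.
print(np.allclose(from_axis(to_axis(ticks)), ticks))

# Axis length taken up by each tick interval for two choices of sigma.
for sigma in (40, 20):
    print(sigma, np.round(np.diff(to_axis(ticks, sigma=sigma)), 2))
```

Comparing the printed interval widths for different mu and sigma values is a quick way to see how the spacing changes before committing to a plot.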

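In addition, to get the desired result on the y-axis, I found that a log scale gives a reasonable plot with a few extra tweaks. This is the code used to force the log scale to have the appropriate minor ticks:

# Matplotlib 3.3+ names these arguments base/subs; older releases used basey/subsy
axes.set_yscale('log', base=10, subs=[2,3,4,5,6,7,8,9])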
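
To make the effect of that list of multiples concrete: on a base-10 log axis, each entry places a minor tick at that multiple of every power of ten. A small sketch with NumPy only (the range of decades is picked arbitrarily for illustration):

```python
import numpy as np

subs = [2, 3, 4, 5, 6, 7, 8, 9]      # multiples passed to the log scale
decades = 10.0 ** np.arange(2, 5)    # 100, 1000, 10000 (illustrative range)

# Each minor tick sits at (power of ten) x (multiple).
minor_ticks = np.outer(decades, subs).ravel()
print(minor_ticks[:8])  # the minor ticks between 100 and 1000
```

So between 100 and 1000 the minor ticks land on 200, 300, and so on up to 900, and the same pattern repeats in every decade.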

You can then adjust the labels using locators and formatters:

#Adjust the yaxis labels and format
axes.yaxis.set_minor_locator(FixedLocator([200, 500, 1500, 2500, 3500, 4500, 5000, 6000, 7000, 8000, 9000, 15000, 20000, 25000, 30000, 35000, 40000, 45000, 50000]))
axes.yaxis.set_minor_formatter(FormatStrFormatter('%d'))
axes.yaxis.set_major_formatter(FormatStrFormatter('%d'))

So the complete manual axis setup looks like this:

axes.set_ylabel('Discharge in CFS')
axes.set_xlabel('Exceedance Probability')
plt.setp(plt.xticks()[1], rotation=45)
#Adjust the scales of the x and y axis
axes.set_yscale('log', base=10, subs=[2,3,4,5,6,7,8,9])
axes.set_xscale('prob_scale', upper=98, lower=.2)
#Adjust the yaxis labels and format
axes.yaxis.set_minor_locator(FixedLocator([200, 500, 1500, 2500, 3500, 4500, 5000, 6000, 7000, 8000, 9000, 15000, 20000, 25000, 30000, 35000, 40000, 45000, 50000]))
axes.yaxis.set_minor_formatter(FormatStrFormatter('%d'))
axes.yaxis.set_major_formatter(FormatStrFormatter('%d'))

#Finally set the upper y-limit of the plot to be reasonable (a log axis cannot start at 0)
axes.set_ylim(top=2*pp['Q'].max())
#Invert the x-axis
axes.invert_xaxis()
#Turn on major and minor grid lines
axes.grid(which='both', alpha=.9)

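This gives a semi-log probability paper plot! A nice property is that any straight line drawn on these axes indicates that the data come from a normal distribution!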
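
For comparison, conventional normal probability paper places a percentage p at the standard normal quantile of p/100, so the PPF is the forward transform and the CDF is the inverse. Here is a minimal sketch of that variant. It assumes Matplotlib 3.1 or later for the built-in 'function' scale. The helper names forward and inverse and the clipping constant are choices made for this example, not part of the answer above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from scipy.stats import norm

def forward(p):
    """Percent -> standard normal quantile (clipped away from 0 and 100)."""
    p = np.clip(np.asarray(p, dtype=float), 1e-6, 100 - 1e-6)
    return norm.ppf(p / 100)

def inverse(z):
    """Standard normal quantile -> percent."""
    return norm.cdf(z) * 100

ticks = [.2, .5, 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 98]

fig, ax = plt.subplots()
ax.plot(ticks, ticks)  # placeholder data, as in the first answer's demo
ax.set_xscale('function', functions=(forward, inverse))
ax.set_xlim(0.2, 98)   # stay inside the open interval (0, 100)
ax.set_xticks(ticks)
ax.invert_xaxis()
ax.grid(True)
fig.savefig("probit_axis.png")
```

With this transform the 0.2 to 0.5 interval and the 40 to 50 interval end up with similar widths, which is what the question asked for. The tick positions are fixed by hand here; a reusable scale class like the ones above would still be the way to go if you need locators that adapt to user-supplied bounds.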
