Non-standard distribution parameters for KS testing?

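Can you use kstest from scipy.stats with non-standard distribution functions (i.e. vary the DOF for Student's t, or vary gamma for Cauchy)? My end goal is to find the maximum p-value and the corresponding parameters for my distribution fit, but that is not the issue here.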

Edit:

"

The Cauchy pdf in scipy.stats is:

cauchy.pdf(x) = 1 / (pi * (1 + x**2))

which implies x_0 = 0 for the location parameter and Y = 1 for gamma. I actually need it to look like this:

cauchy.pdf(x, x_0, Y) = Y**2 / [(Y * pi) * ((x - x_0)**2 + Y**2)]

"

Q1) Student's t, at least, could perhaps be used like this

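import numpy as np
import scipy.stats

stuff = []
for dof in range(1, 100):  # dof must be positive, so start at 1
    # kstest returns only the KS statistic and the p-value
    d, p = scipy.stats.kstest(data, "t", args=(dof,))
    stuff.append(np.hstack((d, p, dof)))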

since it seems to have an option to vary the parameter?

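Q2) What would you do if you needed the full normal distribution equation (varying sigma) and the Cauchy as written above (varying gamma)? Edit: rather than searching scipy.stats for non-standard distributions, is it actually possible to pass a function I wrote to kstest to get the p-values?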

Thanks

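It seems that what you really want to do is parameter estimation. Using the KS test this way is not really what it is meant for. You should use the .fit method of the corresponding distribution.
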
>>> import numpy as np, scipy.stats as stats
>>> arr = stats.norm.rvs(loc=10, scale=3, size=10) # generate 10 random samples from a normal distribution
>>> arr
array([ 11.54239861,  15.76348509,  12.65427353,  13.32551871,
        10.5756376 ,   7.98128118,  14.39058752,  15.08548683,
         9.21976924,  13.1020294 ])
>>> stats.norm.fit(arr)
(12.364046769964004, 2.3998164726918607)
>>> stats.cauchy.fit(arr)
(12.921113834451496, 1.5012714431045815)
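For the Cauchy case, the two numbers returned by fit are loc and scale, which play the roles of x_0 and Y (gamma) in the question's formula. The following is a minimal sketch that checks this numerically; the helper name `cauchy_pdf` and the values of `x0` and `gamma` are made up for illustration.

```python
import numpy as np
from scipy import stats

def cauchy_pdf(x, x0, gamma):
    """Cauchy density written out exactly as in the question (hypothetical helper)."""
    return gamma**2 / ((gamma * np.pi) * ((x - x0)**2 + gamma**2))

x = np.linspace(-20.0, 40.0, 13)
x0, gamma = 12.9, 1.5  # illustrative values, not fitted results

# scipy's loc/scale keywords shift and rescale the standard Cauchy density
pdf_matches = np.allclose(
    stats.cauchy.pdf(x, loc=x0, scale=gamma),
    cauchy_pdf(x, x0, gamma),
)
print(pdf_matches)
```

The same loc/scale convention applies to stats.norm, where loc is the mean and scale is sigma.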

Now a quick look at the documentation:

>>> help(stats.cauchy.fit)

Help on method fit in module scipy.stats._distn_infrastructure:

fit(data, *args, **kwds) method of scipy.stats._continuous_distns.cauchy_gen instance
    Return MLEs for shape, location, and scale parameters from data.

    MLE stands for Maximum Likelihood Estimate.  Starting estimates for
    the fit are given by input arguments; for any arguments not provided
    with starting estimates, ``self._fitstart(data)`` is called to generate
    such.

    One can hold some parameters fixed to specific values by passing in
    keyword arguments ``f0``, ``f1``, ..., ``fn`` (for shape parameters)
    and ``floc`` and ``fscale`` (for location and scale parameters,
    respectively).

...

Returns
-------
shape, loc, scale : tuple of floats
    MLEs for any shape statistics, followed by those for location and
    scale.

Notes
-----
This fit is computed by maximizing a log-likelihood function, with
penalty applied for samples outside of range of the distribution. The
returned answer is not guaranteed to be the globally optimal MLE, it
may only be locally optimal, or the optimization may fail altogether.

So, if you want to hold one of the parameters fixed, you can easily do:

>>> stats.cauchy.fit(arr, floc=10)
(10, 2.4905786982353786)
>>> stats.norm.fit(arr, floc=10)
(10, 3.3686549590571668)
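If you still want a KS statistic afterwards, kstest accepts extra distribution parameters through args, and its cdf argument may also be a callable. Here is a rough sketch under a few assumptions: the sample is synthetic, and `my_cauchy_cdf` is a hypothetical hand-written CDF, not part of scipy. Keep in mind that a p-value computed with parameters estimated from the same data tends to be too optimistic, so treat it as a rough indication only.

```python
import numpy as np
from scipy import stats

# Synthetic Cauchy sample (x_0 = 10, gamma = 3), for illustration only
rng = np.random.default_rng(0)
data = 10.0 + 3.0 * rng.standard_cauchy(200)

# Estimate location and scale by maximum likelihood
loc, scale = stats.cauchy.fit(data)

# Named distribution: parameters are passed via args as (loc, scale)
res_named = stats.kstest(data, "cauchy", args=(loc, scale))

def my_cauchy_cdf(x, x0, gamma):
    """Hand-written Cauchy CDF (hypothetical helper)."""
    return 0.5 + np.arctan((x - x0) / gamma) / np.pi

# Callable CDF: kstest calls my_cauchy_cdf(x, *args)
res_custom = stats.kstest(data, my_cauchy_cdf, args=(loc, scale))

print(res_named.statistic, res_named.pvalue)
print(res_custom.statistic, res_custom.pvalue)
```

Both calls evaluate the same CDF, so the two results should agree up to floating-point error. For Student's t the args tuple would start with the shape parameter, i.e. (df, loc, scale).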