Fitting Gamma distribution in Python
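I am looking for the Gamma distribution parameters of a small sample. Later, I need to use those parameters to predict future data. However, the result I get is wrong.
Here is the result I got from Excel, which is the correct answer I am looking for:
Alpha 0.458718895
Beta 96.76626573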
import scipy.stats as stats
data=[0.0621,0.046,0.0324,0.0279]
fit_alpha, fit_loc, fit_beta=stats.gamma.fit(data,floc=0)
print(fit_alpha, fit_loc, fit_beta)
ll=[1,2,3,4,5,6,7,8,9,10]
plop=stats.gamma.pdf(ll,fit_alpha, fit_loc, fit_beta)
print(plop)
Expected result:
6.29% 4.28% 3.40% 2.88% 2.53% 2.27% 2.06% 1.90% 1.76% 1.65%
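You are using fit the wrong way. You are trying to fit the PDF, i.e. your numbers are values of the density at x = 1, 2, 3, 4, whereas scipy.stats fits the best underlying distribution to random data, i.e. it expects the numbers to be draws from the distribution. Have a look here: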
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
from scipy.optimize import leastsq


def my_res(params, yData):
    # residuals between the gamma pdf at x = 1, 2, ..., len(yData) and the given pdf values
    a, b = params
    xList = range(1, len(yData) + 1)
    th = np.fromiter((stats.gamma.pdf(x, a, loc=0, scale=b) for x in xList), float)
    diff = th - np.array(yData)
    return diff


data = [0.0621, 0.046, 0.0324, 0.0279]

### this does not work as data is supposed to be the random variate data and not the pdf
fit_alpha, fit_loc, fit_beta = stats.gamma.fit(data, floc=0)
print('data fitted the wrong way:')
print(fit_alpha, fit_loc, fit_beta)

#### but making a least square fit with the pdf works
sol, err = leastsq(my_res, [.4, 1], args=(data,))
print('...and the right way:')
print(sol)
datath = [stats.gamma.pdf(x, sol[0], loc=0, scale=sol[1]) for x in range(1, 5)]

### the result gives the expected answer
ll = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
plop = stats.gamma.pdf(ll, sol[0], loc=0, scale=sol[1])
print('expected values:')
print(plop)

### if we generate random numbers with gamma distribution
### the fit does what it should
testData = stats.gamma.rvs(sol[0], loc=0, scale=sol[1], size=5000)
print('using stats.gamma.fit the correct way:')
print(stats.gamma.fit(testData, floc=0))

# compare the given pdf values (x) with the fitted pdf values (^)
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(data, ls='', marker='x')
ax.plot(datath, ls='', marker='^')
plt.show()
This gives:
>> data fitted the wrong way:
>> (10.36700043818477, 0, 0.00406096249836482)
>> ...and the right way:
>> [ 0.45826569 96.8498341 ]
>> expected values:
>> [0.06298405 0.04282212 0.0340243 0.02881519 0.02527189 0.02265992 0.02063036 0.01899356 0.01763645 0.01648688]
>> using stats.gamma.fit the correct way:
>> (0.454884062189886, 0, 94.94258888249479)
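The same least-squares fit of PDF values could also be written with scipy.optimize.curve_fit, which takes the model function directly instead of a hand-written residual. This is only a minimal sketch and not part of the original answer: the helper name gamma_pdf is made up for illustration, the x positions 1 to 4 are assumed as in the code above, and the starting guess [0.5, 100.0] is an assumption chosen in the neighbourhood of the Excel values.

```python
import numpy as np
import scipy.stats as stats
from scipy.optimize import curve_fit


def gamma_pdf(x, a, b):
    # hypothetical helper: gamma pdf with shape a, scale b and loc fixed at 0
    return stats.gamma.pdf(x, a, loc=0, scale=b)


# the given numbers are treated as pdf values at x = 1, 2, 3, 4 (assumption)
x = np.arange(1, 5)
y = np.array([0.0621, 0.046, 0.0324, 0.0279])

# least-squares fit of the pdf curve to the points; p0 is an assumed starting guess
popt, pcov = curve_fit(gamma_pdf, x, y, p0=[0.5, 100.0])
print('shape, scale:', popt)

# pdf values at x = 1..10 using the fitted parameters
pred = gamma_pdf(np.arange(1, 11), *popt)
print(pred)
```

If the fit behaves like the leastsq version above, popt should land close to the Alpha and Beta values from Excel.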
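I think you are confusing a "sample" with "PDF values at some points".
If you treat your data as a sample, i.e. 4 draws from a gamma law, then the fit gives a result like this (I use the OpenTURNS platform):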
import openturns as ot
sample = ot.Sample([[x] for x in data])
gamma_fitting = ot.GammaFactory().build(sample)
print (gamma_fitting)
>>> Gamma(k = 1.49938, lambda = 79.5426, gamma = 0.02325)
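If you place your data (the 4 input numbers) on the x-axis, a plot of the result shows that your data are consistent with this fit.
In fact, what you are looking for is the Gamma distribution that satisfies:
PDF([1,2,3,4]) ~ [ 0.0621, 0.046, 0.0324, 0.0279 ] = data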
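This relation can be checked numerically. The sketch below assumes that the four values sit at x = 1, 2, 3, 4 and that the Alpha and Beta quoted from Excel in the question are the shape and scale of the gamma distribution; the variable names are illustrative only.

```python
import numpy as np
import scipy.stats as stats

# Alpha (shape) and Beta (scale) as quoted from Excel in the question
alpha, beta = 0.458718895, 96.76626573

data = np.array([0.0621, 0.046, 0.0324, 0.0279])
x = np.arange(1, len(data) + 1)  # assumed positions 1, 2, 3, 4

# gamma pdf evaluated at the assumed positions
pdf_at_x = stats.gamma.pdf(x, alpha, loc=0, scale=beta)
max_abs_diff = np.max(np.abs(pdf_at_x - data))

print(pdf_at_x)
print('largest absolute deviation from data:', max_abs_diff)
```

A small deviation here supports reading the four numbers as PDF values rather than as a sample.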