Python: NumPy Gamma Function Produces Wrong Mean Value for Scale Parameter
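I am trying to draw 1000 samples (each of size 227) from numpy.random's gamma method, so every sample value should be i.i.d. (independent and identically distributed). However, the mean of the scale parameter estimates is wrong.
My shape parameter ( alpha ) is 0.375 and my scale parameter ( lambda ) is 1.674.
According to my textbook, the estimators for these two parameters are: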
alpha = ( xbar ^ 2 ) / ( sigma_hat ^ 2 )
lambda = ( xbar ) / ( sigma_hat ^ 2 )
I think I may be using the Pandas .apply() method incorrectly, or my get_lambda_hat function may be wrong.
# In[11]:
# Import libraries:
import pandas as pd
import numpy as np
from numpy.random import gamma # gamma function
import seaborn as sns # plotting library
# plot histograms immediately:
get_ipython().run_line_magic('matplotlib', 'inline')
# In[12]:
# Define functions
def get_samples_from_gamma_dist( num_of_samples, size_of_samples, alpha, lamb ):
    '''
    Returns table with ( num_of_samples ) rows and ( size_of_samples ) columns.
    Cells in the table are i.i.d sample values from numpy's gamma function
    with shape parameter ( alpha ) and scale parameter ( lamb ).
    '''
    return pd.DataFrame(
        data = gamma(
            shape = alpha,
            scale = lamb,
            size =
            (
                num_of_samples,
                size_of_samples
            )
        )
    )
# Returns alpha_hat of a sample:
get_alpha_hat = lambda sample : ( sample.mean()**2 ) / sample.var()
# Returns lambda_hat of a sample:
get_lambda_hat = lambda sample : sample.mean() / sample.var()
# In[13]:
# Retrieve samples
# Declaring variables...
my_num_of_samples = 1000
my_size_of_samples = 227
my_alpha = 0.375
my_lambda = 1.674
# Initializing table...
data = get_samples_from_gamma_dist(
    num_of_samples= my_num_of_samples,
    size_of_samples= my_size_of_samples,
    alpha= my_alpha,
    lamb= my_lambda
)
# Getting estimated parameter values from each sample...
alpha_hats = data.apply( get_alpha_hat, axis = 1 ) # apply function across the table's columns
lambda_hats = data.apply( get_lambda_hat, axis = 1 ) # apply function across the table's columns
# In[14]:
# Plot histograms:
# Setting background of histograms to 'whitegrid'...
sns.set_style( style = 'whitegrid' )
# Plotting the sample distribution of alpha_hat...
sns.distplot( alpha_hats,
              hist = True,
              kde = True,
              bins = 50,
              axlabel = 'Estimates of Alpha',
              hist_kws=dict(edgecolor="k", linewidth=2),
              color = 'red' )
# In[15]:
# Plotting the sample distribution of lambda_hat...
sns.distplot( lambda_hats,
              hist = True,
              kde = True,
              bins = 50,
              axlabel = 'Estimates of Lambda',
              hist_kws=dict(edgecolor="k", linewidth=2),
              color = 'purple' )
# In[16]:
# Print results:
print( "Mean of alpha_hats =", alpha_hats.mean(), '\n' )
print( "Mean of lambda_hats =", lambda_hats.mean(), '\n' ) # about 0.62
print( "Standard Error of alpha_hats =", alpha_hats.std( ddof = 0 ), '\n' )
print( "Standard Error of lambda_hats =", lambda_hats.std( ddof = 0 ), '\n' )
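After plotting histograms of the alpha and lambda estimates, I noticed that the sampling distribution of alpha is centered almost perfectly at 0.375, but the sampling distribution of lambda is centered around 0.62, which is nowhere near 1.674. I have tried other values of lambda, but it never seems to be centered correctly.
I would love to know if anyone has suggestions for fixing this. I have included all the code from the .py file downloaded from my Jupyter notebook session.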
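Fixed. numpy.random parameterizes the gamma distribution's probability density function differently from my textbook: its 'scale' argument is the reciprocal of my textbook's lambda.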
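The mismatch can be seen from the first two moments. The following is a short derivation, assuming the textbook writes the gamma density with a rate λ while NumPy uses shape α and scale θ (the symbol θ is introduced here only for illustration):

```latex
\begin{aligned}
\text{scale form (NumPy):}\quad
  & \mathbb{E}[X] = \alpha\theta,\quad \operatorname{Var}[X] = \alpha\theta^{2}
  \;\Rightarrow\;
  \frac{\mathbb{E}[X]^{2}}{\operatorname{Var}[X]} = \alpha,\quad
  \frac{\mathbb{E}[X]}{\operatorname{Var}[X]} = \frac{1}{\theta} \\
\text{rate form (textbook):}\quad
  & \mathbb{E}[X] = \frac{\alpha}{\lambda},\quad \operatorname{Var}[X] = \frac{\alpha}{\lambda^{2}}
  \;\Rightarrow\;
  \frac{\mathbb{E}[X]^{2}}{\operatorname{Var}[X]} = \alpha,\quad
  \frac{\mathbb{E}[X]}{\operatorname{Var}[X]} = \lambda
\end{aligned}
```

So the alpha estimator is unaffected by the choice of parameterization, while xbar / sigma_hat^2 estimates 1 / scale. Passing 1.674 as 'scale' therefore centers the lambda estimates near 1 / 1.674, roughly 0.60, which is close to the 0.62 observed in the question.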
I got the correct mean by setting the 'scale' argument in the body of get_samples_from_gamma_dist() to 1 / lamb:
def get_samples_from_gamma_dist( num_of_samples, size_of_samples, alpha, lamb ):
    '''
    Returns table with ( num_of_samples ) rows and ( size_of_samples ) columns.
    Cells in the table are i.i.d sample values from numpy's gamma function
    with shape parameter ( alpha ) and scale parameter ( 1 / lamb ).
    '''
    return pd.DataFrame(
        data = gamma(
            shape = alpha,
            scale = 1 / lamb,
            size =
            (
                num_of_samples,
                size_of_samples
            )
        )
    )
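A quick way to check the fix is to compare the two calls side by side. This is only a minimal sketch: the helper name moment_estimates, the fixed seed, and the single large sample (used instead of 1000 samples of size 227 to keep estimator noise small) are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np


def moment_estimates(sample):
    """Return (alpha_hat, lambda_hat) from the textbook moment formulas,
    where lambda is treated as a rate parameter."""
    mean = sample.mean()
    var = sample.var()
    return mean**2 / var, mean / var


rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
alpha, lamb = 0.375, 1.674
n = 1_000_000  # one large sample, an assumption made to reduce noise

# Original call: lamb passed directly as NumPy's scale.
drawn_as_scale = rng.gamma(shape=alpha, scale=lamb, size=n)
# Fixed call: lamb treated as a rate, so scale = 1 / lamb.
drawn_as_rate = rng.gamma(shape=alpha, scale=1 / lamb, size=n)

alpha_hat_scale, lambda_hat_scale = moment_estimates(drawn_as_scale)
alpha_hat_rate, lambda_hat_rate = moment_estimates(drawn_as_rate)

print("scale = lamb     -> alpha_hat, lambda_hat:", alpha_hat_scale, lambda_hat_scale)
print("scale = 1 / lamb -> alpha_hat, lambda_hat:", alpha_hat_rate, lambda_hat_rate)
```

With the original call, lambda_hat should land near 1 / lamb; with the fixed call, it should land near lamb. alpha_hat should stay near 0.375 in both cases.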