PyMC3 Gaussian Mixture Model
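I have been following the PyMC3 Gaussian mixture model example: https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/gaussian_mixture_model.ipynb
and have it working well with an artificial dataset.
I have since tried it with a real dataset, and I am struggling to get it to give sensible results.
Which parameters should I narrow, widen, or otherwise change to get a better fit? The traces seem stable. Here is the model snippet I adapted from the example: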
model = pm.Model()
with model:
    # cluster sizes
    a = pm.constant(np.array([1., 1., 1.]))
    p = pm.Dirichlet('p', a=a, shape=k)
    # ensure all clusters have some points
    p_min_potential = pm.Potential('p_min_potential', tt.switch(tt.min(p) < .1, -np.inf, 0))

    # cluster centers
    means = pm.Normal('means', mu=[0, 1.5, 3], sd=1, shape=k)
    # break symmetry
    order_means_potential = pm.Potential('order_means_potential',
                                         tt.switch(means[1]-means[0] < 0, -np.inf, 0)
                                         + tt.switch(means[2]-means[1] < 0, -np.inf, 0))

    # measurement error
    sd = pm.Uniform('sd', lower=0, upper=2, shape=k)

    # latent cluster of each observation
    category = pm.Categorical('category', p=p, shape=ndata)

    # likelihood for each observed value
    points = pm.Normal('obs', mu=means[category], sd=sd[category], observed=data)
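To make it clearer what the two `pm.Potential` terms in the snippet do, here is a minimal plain-Python sketch of the same logic. The function name `constraint_log_potential` and the `p_min` parameter are hypothetical names for illustration only and are not part of PyMC3. The sketch only mirrors the hard constraints (every cluster weight at least 0.1, means in ascending order) and does no sampling.

```python
import math


def constraint_log_potential(p, means, p_min=0.1):
    """Plain-Python mirror of the two potential terms in the snippet.

    Returns 0.0 when the state is allowed and -inf when it is rejected.
    (Hypothetical helper for illustration; not a PyMC3 function.)
    """
    # Reject any state in which the smallest cluster weight is below p_min.
    if min(p) < p_min:
        return -math.inf
    # Reject any state in which adjacent cluster means are out of ascending order.
    for left, right in zip(means, means[1:]):
        if right - left < 0:
            return -math.inf
    return 0.0


# Allowed state: balanced weights, ordered means.
print(constraint_log_potential([0.3, 0.3, 0.4], [0.0, 1.5, 3.0]))
# Rejected state: one cluster weight falls below the 0.1 floor.
print(constraint_log_potential([0.05, 0.45, 0.5], [0.0, 1.5, 3.0]))
# Rejected state: the first two means are swapped.
print(constraint_log_potential([0.3, 0.3, 0.4], [1.5, 0.0, 3.0]))
```

Because a rejected state contributes a log-probability of negative infinity, the sampler can never accept it. That means the 0.1 floor on `p` and the ordering of `means` act as hard limits, so they may be worth checking when the real data does not match the three roughly balanced, ordered clusters the priors assume.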
It turns out there is an excellent blog post on this topic:
http://austinrochford.com/posts/2016-02-25-density-estimation-dpm.html