大都会抽样

Question

我正在阅读一本名为 Bayesian Analysis in Python 的书。这本书主要侧重于 PyMC3 包，但对它背后的理论有点含糊，所以我很困惑。

假设我有这样的数据：

data = np.array([51.06, 55.12, 53.73, 50.24, 52.05, 56.40, 48.45, 52.34, 55.65, 51.49, 51.86, 63.43, 53.00, 56.09, 51.93, 52.31, 52.33, 57.48, 57.44, 55.14, 53.93, 54.62, 56.09, 68.58, 51.36, 55.47, 50.73, 51.94, 54.95, 50.39, 52.91, 51.5, 52.68, 47.72, 49.73, 51.82, 54.99, 52.84, 53.19, 54.52, 51.46, 53.73, 51.61, 49.81, 52.42, 54.3, 53.84, 53.16])

我正在看这样的模型：

使用Metropolis Sampling，我如何拟合估计 mu 和 sigma 的模型。

这是我根据阅读的内容猜测的伪代码：

M, S = 50, 1
G = 1

# These are priors right?
mu = stats.norm(loc=M, scale=S)
sigma = stats.halfnorm(scale=G)

target = stats.norm

steps = 1000

mu_samples = [50]
sigma_samples = [1]

for i in range(steps):
    # proposed sample...
    mu_i, sigma_i = mu.rvs(), sigma.rvs()

    # Something happens here
    # How do I calculate the likelidhood??
    "..."
    # some evaluation of a likelihood ratio??

    a = "some"/"ratio"

    acceptance_bar = np.random.random()

    if a > acceptance_bar:
        mu_samples.append(mu_i)
        sigma_samples.append(sigma_i)

我错过了什么？？

Answer 1

希望以下示例对您有所帮助。在这个例子中，我假设我们知道 sigma 的值，所以我们只有 mu.

的先验值

sigma = data.std() # we are assuming we know sigma

steps = 1000
mu_old = data.mean() # initial value, just a good guest
mu_samples = []

# we evaluate the prior for the initial point
prior_old = stats.norm(M, S).pdf(mu_old)
# we evaluate the likelihood for the initial point
likelihood_old = np.prod(stats.norm(mu_old, sigma).pdf(data))
# Bayes' theorem (omitting the denominator) for the initial point
post_old = prior_old * likelihood_old

for i in range(steps):
    # proposal distribution, propose a new value from the old one
    mu_new = stats.norm.rvs(mu_old, 0.1)

    # we evaluate the prior
    prior_new = stats.norm(M, S).pdf(mu_new)

    # we evaluate the likelihood
    likelihood_new = np.prod(stats.norm(mu_new, sigma).pdf(data))

    # Bayes' theorem (omitting the denominator)
    post_new = prior_new * likelihood_new

    # the ratio of posteriors (we do not need to know the normalizing constant)
    a =  post_new / post_old

    if np.random.random() < a:
        mu_old = mu_new
        post_old = post_new

    mu_samples.append(mu_old)

备注：

请注意，我已经定义了一个 建议分布 ，在本例中它是一个以 mu_old 为中心的高斯分布，标准差为 0.1（任意值）。在实践中，MH 的效率在很大程度上取决于提议分布，因此 PyMC3（以及 MH 的其他实际实现）使用一些启发式来调整提议分布。
为简单起见，我在此示例中使用了 pdf，但实际上使用 logpdf 更方便。这样可以在不改变结果的情况下避免下溢问题。
可能性计算为乘积
你遗漏的比率是后验比率
如果您不接受新的建议值，那么您将（再次）保存旧值。

记得检查 this repository 勘误表和代码的更新版本。与书中代码的更新代码的主要区别在于，现在使用 PyMC3 运行模型的首选方法是仅使用 pm.sample() 并让 PyMC3 选择采样器和初始化点你.

大都会抽样

Metropolis Sampling

python

bayesian

sampling

mcmc

pymc3