PyMC3 and ArviZ: Visualizing highest posterior density for multiple conditions using arviz plot_hpd
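I am trying to visualize a simple linear regression with the highest posterior density (HPD) for multiple groups. However, I am having trouble applying the HPD to each condition. Whenever I run this code, I get the same posterior density for every condition. I would like to visualize the posterior density that corresponds to each condition. How can I plot the HPD for each group?
Edit: The issue has been resolved on the PyMC3 discourse.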
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import arviz as az
import pandas as pd

# data
data = pd.read_csv('www_MCMC/MCMC/data.csv')
rsp = data['Mean Response'].values
rt = data['Mean Reaction Time'].values
idx = pd.Categorical(data['Structure'], categories=['No Background', 'Only Road', 'Only Dot Ground', 'Dot Terrain + Dot Ground', 'Space', 'Full Background']).codes
groups = len(np.unique(idx))

# model
with pm.Model() as rsp_rt:
    α = pm.Normal('α', mu=0, sd=10, shape=groups)
    β = pm.Normal('β', mu=0, sd=10, shape=groups)
    ϵ = pm.HalfCauchy('ϵ', 10)
    μ = pm.Deterministic('μ', α[idx] + β[idx] * rt)
    y_pred = pm.Normal('y_pred2', mu=μ, sd=ϵ, observed=rsp)
    trace_rsp_rt = pm.sample(cores=1)

_, ax_rsp_rt = plt.subplots(2, 3, figsize=(10, 5), sharex=True, sharey=True, constrained_layout=True)
ax_rsp_rt = np.ravel(ax_rsp_rt)
for i in range(groups):
    alpha = trace_rsp_rt['α'][:, i].mean()
    beta = trace_rsp_rt['β'][:, i].mean()
    ax_rsp_rt[i].plot(rt, alpha + beta * rt, c='k', label= f'rsp = {alpha:.2f} + {beta:.2f} * rt')
    az.plot_hpd(rt, trace_rsp_rt['μ'], credible_interval=0.98, color='k', ax=ax_rsp_rt[i])
    ax_rsp_rt[i].set_title(f'$\mu_{i}$')
    ax_rsp_rt[i].set_xlabel(f'$x_{i}$')
    ax_rsp_rt[i].set_ylabel(f'$y_{i}$', labelpad=17, rotation=0)
    ax_rsp_rt[i].legend()
plt.xlim(1.2, 1.8)
plt.ylim(0.6, 1)
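I have already answered the question on the PyMC3 discourse; see there for a more detailed answer.
For completeness, I am also sharing part of the answer here:
A few small modifications to the code should fix the issue. However, I would recommend taking advantage of ArviZ and xarray, as shown in this notebook.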
...
for i in range(groups):
    # keep every posterior draw for group i instead of averaging first
    alpha = trace_rsp_rt['α'][:, i]
    beta = trace_rsp_rt['β'][:, i]
    # add a trailing axis so the draws broadcast against rt: mu has shape (draws, len(rt))
    # xarray would handle broadcasting automatically, as seen in the notebook
    mu = alpha[:, None] + beta[:, None] * rt
    ax_rsp_rt[i].plot(rt, mu.mean(axis=0), c='k', label=f'rsp = {alpha.mean():.2f} + {beta.mean():.2f} * rt')
    az.plot_hpd(rt, mu, credible_interval=0.98, color='k', ax=ax_rsp_rt[i])
    ax_rsp_rt[i].legend()
# combining pyplot and object-based commands can yield unexpected results,
# so set the limits on an Axes object (the subplots share x and y)
ax_rsp_rt[0].set_xlim(1.2, 1.8)
ax_rsp_rt[0].set_ylim(0.6, 1)
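The broadcasting step mentioned in the comments (`rt[None, :]`) may be easier to see in isolation. The following is only a minimal NumPy sketch: the draws are synthetic stand-ins for one group's columns of `trace_rsp_rt['α']` and `trace_rsp_rt['β']`, and the sizes `n_draws` and `n_obs` and the distribution parameters are made-up assumptions, not values from the question's data.

```python
import numpy as np

# Synthetic stand-ins (assumptions) for the posterior draws of one group.
rng = np.random.default_rng(0)
n_draws, n_obs = 500, 8
alpha = rng.normal(0.8, 0.05, size=n_draws)  # shape (n_draws,)
beta = rng.normal(0.1, 0.02, size=n_draws)   # shape (n_draws,)
rt = np.linspace(1.2, 1.8, n_obs)            # shape (n_obs,)

# Give the draws a trailing axis and rt a leading axis so that every draw
# is evaluated at every rt value: the result has shape (n_draws, n_obs).
mu = alpha[:, None] + beta[:, None] * rt[None, :]

# Averaging over the draw axis gives one regression-line value per rt.
mu_mean = mu.mean(axis=0)

print(mu.shape, mu_mean.shape)
```

An array laid out this way, with draws along the first axis and one column per x value, is the 2D form passed to `az.plot_hpd(rt, mu, ...)` in the answer above. Without the added axes, `alpha + beta * rt` would either fail or, if the two lengths happened to match, silently pair draw k with observation k.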