pymc3 ：具有多维集中因子的狄利克雷

Question

我正在努力实现 Dirichlet 变量的集中因子依赖于另一个变量的模型。

情况如下：

系统因组件故障而失败（共有三个组件，每个组件只有一个故障 test/observation）。

组件发生故障的概率取决于温度。

这是该情况的（评论的）简短实施：

import numpy as np
import pymc3 as pm
import theano.tensor as tt


# Temperature data : 3 cold temperatures and 3 warm temperatures
T_data = np.array([10, 12, 14, 80, 90, 95])

# Data of failures of 3 components : [0,0,1] means component 3 failed
F_data = np.array([[0, 0, 1], \
       [0, 0, 1], \
       [0, 0, 1], \
       [1, 0, 0], \
       [1, 0, 0], \
       [1, 0, 0]])

n_component = 3

# When temperature is cold : Component 1 fails
# When temperature is warm : Component 3 fails
# Component 2 never fails

# Number of observations :
n_obs = len(F_data)


# The number of failures can be modeled as a Multinomial F ~ M(n_obs, p) with parameters 
# -  n_test : number of tests (Fixed)
# -  p : probability of failure of each component (shape (n_obs, 3))

# The probability of failure of components follows a Dirichlet distribution p ~ Dir(alpha) with parameters:
# -  alpha : concentration (shape (n_obs, 3))
# The Dirichlet distributions ensures the probabilities sum to 1 

# The alpha parameters (and the the probability of failures) depend on the temperature alpha ~ a + b * T
# - a : bias term (shape (1,3))
# - b : describes temperature dependency of alpha (shape (1,3))

_

# The prior on "a" is a normal distributions with mean 1/2 and std 0.001
# a ~ N(1/2, 0.001)

# The prior on "b" is a normal distribution zith mean 0 and std 0.001
# b ~ N(0, 0.001)


# Coding it all with pymc3

with pm.Model() as model:
    a = pm.Normal('a', 1/2, 1/(0.001**2), shape = n_component)
    b = pm.Normal('b', 0, 1/(0.001**2), shape = n_component)

    # I generate 3 alphas values (corresponding to the 3 components) for each of the 6 temperatures
    # I tried different ways to compute alpha but nothing worked out
    alphas = pm.Deterministic('alphas', a + b * tt.stack([T_data, T_data, T_data], axis=1))
    #alphas = pm.Deterministic('alphas', a + b[None, :] * T_data[:, None])
    #alphas = pm.Deterministic('alphas', a + tt.outer(T_data,b))


    # I think I should get 3 probabilities (corresponding to the 3 components) for each of the 6 temperatures
    #p = pm.Dirichlet('p', alphas, shape = n_component)
    p = pm.Dirichlet('p', alphas, shape = (n_obs,n_component))

    # Multinomial is observed and take values from F_data
    F = pm.Multinomial('F', 1, p, observed = F_data)


with model:
    trace = pm.sample(5000)

我在示例函数中遇到以下错误：

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/pymc3/parallel_sampling.py", line 73, in run
    self._start_loop()
  File "/anaconda3/lib/python3.6/site-packages/pymc3/parallel_sampling.py", line 113, in _start_loop
    point, stats = self._compute_point()
  File "/anaconda3/lib/python3.6/site-packages/pymc3/parallel_sampling.py", line 139, in _compute_point
    point, stats = self._step_method.step(self._point)
  File "/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py", line 247, in step
    apoint, stats = self.astep(array)
  File "/anaconda3/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 117, in astep
    'might be misspecified.' % start.energy)
ValueError: Bad initial energy: inf. The model might be misspecified.
"""

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
ValueError: Bad initial energy: inf. The model might be misspecified.

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
<ipython-input-5-121fdd564b02> in <module>()
      1 with model:
      2     #start = pm.find_MAP()
----> 3     trace = pm.sample(5000)

/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, nuts_kwargs, step_kwargs, progressbar, model, random_seed, live_plot, discard_tuned_samples, live_plot_kwargs, compute_convergence_checks, use_mmap, **kwargs)
    438             _print_step_hierarchy(step)
    439             try:
--> 440                 trace = _mp_sample(**sample_args)
    441             except pickle.PickleError:
    442                 _log.warning("Could not pickle model, sampling singlethreaded.")

/anaconda3/lib/python3.6/site-packages/pymc3/sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, use_mmap, **kwargs)
    988         try:
    989             with sampler:
--> 990                 for draw in sampler:
    991                     trace = traces[draw.chain - chain]
    992                     if trace.supports_sampler_stats and draw.stats is not None:

/anaconda3/lib/python3.6/site-packages/pymc3/parallel_sampling.py in __iter__(self)
    303 
    304         while self._active:
--> 305             draw = ProcessAdapter.recv_draw(self._active)
    306             proc, is_last, draw, tuning, stats, warns = draw
    307             if self._progress is not None:

/anaconda3/lib/python3.6/site-packages/pymc3/parallel_sampling.py in recv_draw(processes, timeout)
    221         if msg[0] == 'error':
    222             old = msg[1]
--> 223             six.raise_from(RuntimeError('Chain %s failed.' % proc.chain), old)
    224         elif msg[0] == 'writing_done':
    225             proc._readable = True

/anaconda3/lib/python3.6/site-packages/six.py in raise_from(value, from_value)

RuntimeError: Chain 1 failed.

有什么建议吗？

Answer 1

模型指定错误。 alphas 在您当前的参数化下采用非正值，而 Dirichlet 分布要求它们为正，从而导致模型指定错误。

Dirichlet-Multinomial回归中，使用指数link函数来调节线性模型的范围和Dirichlet-Multinomial的域，即

alpha = exp(beta*X)

在 the MGLM package documentation 中有这方面的详细信息。

Dirichlet-多项式回归模型

如果我们实施这个模型，我们可以实现不错的模型收敛和采样。

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt
from sklearn.preprocessing import scale

T_data = np.array([10,12,14,80,90,95])

# standardize the data for better sampling
T_data_z = scale(T_data)

# transform to theano tensor, so it works with tt.outer
T_data_z = theano.shared(T_data_z)

F_data = np.array([
    [0,0,1],
    [0,0,1],
    [0,0,1],
    [1,0,0],
    [1,0,0],
    [1,0,0],
])

# N = num_obs, K = num_components
N, K = F_data.shape

with pm.Model() as dmr_model:
    a = pm.Normal('a', mu=0, sd=1, shape=K)
    b = pm.Normal('b', mu=0, sd=1, shape=K)

    alpha = pm.Deterministic('alpha', pm.math.exp(a + tt.outer(T_data_z, b)))

    p = pm.Dirichlet('p', a=alpha, shape=(N, K))

    F = pm.Multinomial('F', 1, p, observed=F_data)

    trace = pm.sample(5000, tune=10000, target_accept=0.9)

模型结果

此模型中的抽样并不完美。例如，即使提高了目标接受率并进行了额外的调整，仍然存在一些差异。

There were 501 divergences after tuning. Increase target_accept or reparameterize.

There were 477 divergences after tuning. Increase target_accept or reparameterize.

The acceptance probability does not match the target. It is 0.5858954056820339, but should be close to 0.8. Try to increase the number of tuning steps.

The number of effective samples is smaller than 10% for some parameters.

跟踪图

我们可以看到 a 和 b 的轨迹看起来不错，平均位置对数据有意义。

配对图

虽然相关性对于 NUTS 来说不是什么问题，但不相关的后验采样是理想的。在大多数情况下，我们看到相关性较低，a 组件中有一些轻微的结构。

后验图

最后，我们可以查看 p 的后验图并确认它们对数据有意义。

替代模型

Dirichlet-Multinomial 的优点是处理过度分散。可能值得尝试更简单的多项逻辑回归/Softmax 回归，因为它运行明显更快并且不会出现 DMR 模型中出现的任何采样问题。

最后，您可以运行两者并执行模型比较以查看 Dirichlet-Multinomial 是否真的增加了解释值。

型号

with pm.Model() as softmax_model:
    a = pm.Normal('a', mu=0, sd=1, shape=K)
    b = pm.Normal('b', mu=0, sd=1, shape=K)

    p = pm.Deterministic('p', tt.nnet.softmax(a + tt.outer(T_data_z, b)))

    F = pm.Multinomial('F', 1, p, observed = F_data)

    trace_sm = pm.sample(5000, tune=10000)

pymc3 ：具有多维集中因子的狄利克雷

pymc3 : Dirichlet with multidimensional concentration factor

python

dirichlet

bayesian

multinomial

pymc3

Dirichlet-多项式回归模型

模型结果

跟踪图

配对图

后验图

替代模型

型号

后验图