使用 `LKJCorr` 先验在 PyMC3 中修改 BPMF:使用 `NUTS` 的 PositiveDefiniteError

Modified BPMF in PyMC3 using `LKJCorr` priors: PositiveDefiniteError using `NUTS`

我之前实现了原始 Bayesian Probabilistic Matrix Factorization (BPMF) model in pymc3. for reference, data source, and problem setup. Per the answer to that question from @twiecki, I've implemented a variation of the model using LKJCorr priors for the correlation matrices and uniform priors for the standard deviations. In the original model, the covariance matrices are drawn from Wishart distributions, but due to current limitations of pymc3, the Wishart distribution cannot be sampled from properly. This answer 到一个松散相关的问题,为 LKJCorr 先验的选择提供了简洁的解释。新型号如下。

import pymc3 as pm
import numpy as np
import theano.tensor as t


n, m = train.shape
dim = 10  # dimensionality
beta_0 = 1  # scaling factor for lambdas; unclear on its use
alpha = 2  # fixed precision for likelihood function
std = .05  # how much noise to use for model initialization

# We will use separate priors for sigma and correlation matrix.
# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
n_elem = dim * (dim - 1) / 2
tri_index = np.zeros([dim, dim], dtype=int)
tri_index[np.triu_indices(dim, k=1)] = np.arange(n_elem)
tri_index[np.triu_indices(dim, k=1)[::-1]] = np.arange(n_elem)

logging.info('building the BPMF model')
with pm.Model() as bpmf:
    # Specify user feature matrix
    sigma_u = pm.Uniform('sigma_u', shape=dim)
    corr_triangle_u = pm.LKJCorr(
        'corr_u', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)

    corr_matrix_u = corr_triangle_u[tri_index]
    corr_matrix_u = t.fill_diagonal(corr_matrix_u, 1)
    cov_matrix_u = t.diag(sigma_u).dot(corr_matrix_u.dot(t.diag(sigma_u)))
    lambda_u = t.nlinalg.matrix_inverse(cov_matrix_u)

    mu_u = pm.Normal(
        'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
         testval=np.random.randn(dim) * std)
    U = pm.MvNormal(
        'U', mu=mu_u, tau=lambda_u,
        shape=(n, dim), testval=np.random.randn(n, dim) * std)

    # Specify item feature matrix
    sigma_v = pm.Uniform('sigma_v', shape=dim)
    corr_triangle_v = pm.LKJCorr(
        'corr_v', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)

    corr_matrix_v = corr_triangle_v[tri_index]
    corr_matrix_v = t.fill_diagonal(corr_matrix_v, 1)
    cov_matrix_v = t.diag(sigma_v).dot(corr_matrix_v.dot(t.diag(sigma_v)))
    lambda_v = t.nlinalg.matrix_inverse(cov_matrix_v)

    mu_v = pm.Normal(
        'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
         testval=np.random.randn(dim) * std)
    V = pm.MvNormal(
        'V', mu=mu_v, tau=lambda_v,
        testval=np.random.randn(m, dim) * std)

    # Specify rating likelihood function
    R = pm.Normal(
        'R', mu=t.dot(U, V.T), tau=alpha * np.ones((n, m)),
        observed=train)

# `start` is the start dictionary obtained from running find_MAP for PMF.
# See the previous post for PMF code.
for key in bpmf.test_point:
    if key not in start:
        start[key] = bpmf.test_point[key]

with bpmf:
    step = pm.NUTS(scaling=start)

此重新实现的目标是生成一个可以使用 NUTS 采样器进行估计的模型。不幸的是,我在最后一行仍然遇到同样的错误:

PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [   0    1    2    3    ...   1030 1031 1032 1033 1034   ]

我已经在 this gist 中提供了 PMF、BPMF 和此修改后的 BPMF 的所有代码,以便轻松复制错误。您需要做的就是下载数据(也在要点中引用)。

看起来您正在将完整的精度矩阵传递给正态分布:

mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
     testval=np.random.randn(dim) * std)

我假设您只想传递对角线值:

mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * t.diag(lambda_u), shape=dim,
     testval=np.random.randn(dim) * std)

这会更改为 mu_u 并且 mu_v 会为您解决吗?