GPFlow Multiclass classification with vector inputs causes value error on shape mismatch

I am trying to follow the multiclass classification example in GPflow (using v2.1.3), as described here:

https://gpflow.readthedocs.io/en/master/notebooks/advanced/multiclass_classification.html

Unlike the example, my X vectors are 10-dimensional and there are 5 classes to predict. However, there seems to be a dimension error when using inducing variables. I changed the kernel and used dummy data for reproducibility; I just want to get this code to run. I have listed the dimensions of the variables below the code. Any loss calculation results in an error such as:

 ValueError: Dimensions must be equal, but are 10 and 5 for '{{node truediv}} = RealDiv[T=DT_DOUBLE](strided_slice_2, truediv/softplus/forward/IdentityN)' with input shapes: [200,10], [5].

It seems as if it wants Y to serve as the inducing variable, but the example on the GPflow site doesn't need that, or it is confusing the length of the X input with the number of classes to predict.

I tried expanding the dimensions of Y, but that didn't help.

Reproducible code:

import gpflow
from gpflow.utilities import ops, print_summary, set_trainable
from gpflow.config import set_default_float, default_float, set_default_summary_fmt
from gpflow.ci_utils import ci_niter
import random
import numpy as np
import tensorflow as tf

np.random.seed(0)
tf.random.set_seed(123)

num_classes = 5
num_of_data_points = 1000
num_of_functions = num_classes
num_of_independent_vars = 10

data_gp_train = np.random.rand(num_of_data_points, num_of_independent_vars)
data_gp_train_target_hot = np.eye(num_classes)[np.array(random.choices(list(range(num_classes)), k=num_of_data_points))].astype(bool)
data_gp_train_target = np.apply_along_axis(np.argmax, 1, data_gp_train_target_hot)
data_gp_train_target = np.expand_dims(data_gp_train_target, axis=1)


data_gp = ( data_gp_train, data_gp_train_target )

lengthscales = [0.1]*num_classes
variances = [1.0]*num_classes
kernel = gpflow.kernels.Matern32(variance=variances, lengthscales=lengthscales) 

# Robustmax Multiclass Likelihood
invlink = gpflow.likelihoods.RobustMax(num_of_functions)  # Robustmax inverse link function
likelihood = gpflow.likelihoods.MultiClass(num_of_functions, invlink=invlink)  # Multiclass likelihood

inducing_inputs = data_gp_train[::5].copy()  # inducing inputs (20% of obs are inducing)
# inducing_inputs = data_gp_train[:200,:].copy()  # inducing inputs (20% of obs are inducing)
   
m = gpflow.models.SVGP(
    kernel=kernel,
    likelihood=likelihood,
    inducing_variable=inducing_inputs,
    num_latent_gps=num_of_functions,
    whiten=True,
    q_diag=True,
)

set_trainable(m.inducing_variable, False)
print_summary(m)

opt = gpflow.optimizers.Scipy()
opt_logs = opt.minimize(
    m.training_loss_closure(data_gp), m.trainable_variables, options=dict(maxiter=ci_niter(1000))
)
print_summary(m, fmt="notebook")

Dimensions:

data_gp[0].shape
Out[132]: (1000, 10)

data_gp[1].shape
Out[133]: (1000, 5)

inducing_inputs.shape
Out[134]: (200, 10)

Error:

 ValueError: Dimensions must be equal, but are 10 and 5 for '{{node truediv}} = RealDiv[T=DT_DOUBLE](strided_slice_2, truediv/softplus/forward/IdentityN)' with input shapes: [200,10], [5].

When running your example I get a slightly different error, but the problem lies in how you define the lengthscales and variances. You write:

lengthscales = [0.1]*num_classes
variances = [1.0]*num_classes
kernel = gpflow.kernels.Matern32(variance=variances, lengthscales=lengthscales)

But the standard kernels expect a scalar variance, and lengthscales that are either scalar or match the number of features. So if you replace that code with:

lengthscales = [0.1]*num_of_independent_vars
kernel = gpflow.kernels.Matern32(variance=1.0, lengthscales=lengthscales)

then everything runs fine.
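For reference, the shape error arises because a stationary kernel scales its inputs roughly as X / lengthscales internally, so the lengthscales vector must broadcast against the trailing (feature) dimension of X, not the number of classes. A minimal NumPy sketch of the broadcasting involved (this illustrates the mechanism, it is not GPflow's actual code):

```python
import numpy as np

X = np.random.rand(200, 10)  # [num_inducing, num_features]

# Lengthscales matching the feature dimension broadcast fine:
ok = X / np.full(10, 0.1)
print(ok.shape)  # (200, 10)

# Lengthscales of length num_classes do not, which is the reported error:
try:
    X / np.full(5, 0.1)
except ValueError as e:
    print("broadcast error:", e)
```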

This gives you one shared kernel across all outputs (class probabilities), with independent lengthscales per input dimension ("ARD").
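Concretely, ARD means each input dimension gets its own lengthscale in the kernel's distance computation, so dimensions with large lengthscales contribute less to the distance. A small NumPy sketch of the scaled squared distance that stationary kernels like Matern32 are built on (the function name is illustrative, not a GPflow internal):

```python
import numpy as np

def scaled_sq_dist(X, X2, lengthscales):
    """Squared Euclidean distance after per-dimension scaling (ARD)."""
    Xs = X / lengthscales   # [N, D] / [D] -> each column scaled separately
    X2s = X2 / lengthscales
    return np.sum((Xs[:, None, :] - X2s[None, :, :]) ** 2, axis=-1)

X = np.random.rand(4, 10)
# A large lengthscale on the last dimension down-weights its contribution.
ls_ard = np.array([0.1] * 9 + [100.0])
d = scaled_sq_dist(X, X, ls_ard)
print(d.shape)  # (4, 4)
```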

If you want to use a different kernel for each output (but with, for example, isotropic lengthscales), you can achieve that with the SeparateIndependent multi-output kernel; see the multioutput notebook example.