How to model structured parameters in TensorFlow Probability distributions?
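I would like to model multivariate distributions with structured parameters, for example a multivariate normal distribution whose covariance matrix is the sum of a low-rank part and a diagonal part. What is the recommended way to achieve this? (TensorFlow 2.8)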
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

DIM = 4
mean = tf.Variable(np.zeros(DIM), dtype=tf.float32, name='mean')
low_rank = tf.Variable(np.zeros((DIM, 2)), dtype=tf.float32, name='cov')
diagonal = tf.Variable(np.zeros(DIM), dtype=tf.float32, name='noise')
# The covariance is computed eagerly here, before the distribution is built.
target_distribution = tfd.MultivariateNormalTriL(
    loc=mean,
    scale_tril=tf.linalg.cholesky(
        tf.linalg.matmul(low_rank, low_rank, transpose_b=True)
        + tf.linalg.diag(tf.math.softplus(diagonal))
    )
)
print(target_distribution.trainable_variables)
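only returns
(<tf.Variable 'mean:0' shape=(4,) dtype=float32, numpy=array([0., 0., 0., 0.], dtype=float32)>,)
That is, only variables that are passed directly to the distribution end up among the tracked variables, not those that enter through an expression.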
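What is the syntax for making low_rank and diagonal trainable variables that I can fit to data?
I know that tfd.MultivariateNormalDiagPlusLowRank solves this specific example, but I am still interested in the recommended approach to modeling structured parameters.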
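When you run any TF op on a tf.Variable (in eager mode), the variable's value is read into a tensor and a new value is computed, losing any prior association with the variable. So in your example, both the cholesky and the matmul happen before the Distribution is constructed, and it never sees those variables.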
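To make the point above concrete, here is a minimal sketch, assuming eager mode (the TF 2.x default). The variable name and values are made up for illustration; the only claim is that the result of an op on a variable is a plain tensor that no longer follows the variable.

```python
import tensorflow as tf

diagonal = tf.Variable([0.0, 0.0], name='noise')

# Eager op: reads the variable's current value and returns a plain tensor.
scale = tf.math.softplus(diagonal)
is_variable = isinstance(scale, tf.Variable)
before = scale.numpy().copy()

# Updating the variable afterwards does not touch the already-computed tensor.
diagonal.assign([1.0, 1.0])
after = scale.numpy()

print(is_variable)    # False: `scale` is a tensor, not a variable
print(before, after)  # identical values
```

Anything built from `scale` can therefore only track `scale`'s fixed value, which is why the distribution in the question reports `mean` alone.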
In TFP, we have created a few utilities to address this kind of problem, notably tfp.util.DeferredTensor, tfp.util.TransformedVariable, and tfp.experimental.util.DeferredModule. Each of these aims to allow lazy evaluation or construction of some object. TransformedVariable is nice because it also handles updating the underlying variable in pre-transform space. It is limited in that it can only have a single underlying Variable, and your example suggests you will want several. Check out the examples in the DeferredModule documentation; it might get you what you want. You may also want to parameterize some composition of [tf.linalg.LinearOperators](https://www.tensorflow.org/api_docs/python/tf/linalg/LinearOperator) with some variables, or something similar.
Here is the example above rewritten using DeferredModule: https://colab.research.google.com/drive/1DRX_Jv58abfsWE6h1BIz6YiQRAzCmn8r