Cholesky factor differentiation in TensorFlow

Question

I am interested in getting the gradient of tf.cholesky with respect to its input. As of now, tf.cholesky does not have a registered gradient:

LookupError: No gradient defined for operation 'Cholesky' (op type: Cholesky)

The code used to generate this error is:

import tensorflow as tf
A = tf.diag(tf.ones([3]))
chol = tf.cholesky(A)
cholgrad = tf.gradients(chol, A)

While I could compute the gradient myself and register it, the only existing methods I have seen for computing the Cholesky gradient involve for loops and need the shape of the input matrix. However, to the best of my knowledge, symbolic loops aren't currently available in TensorFlow.

One possible workaround for getting the shape of the input matrix A would be to use:

[int(elem) for elem in list(A.get_shape())]

But this approach does not work if the dimensions of A depend on a TensorFlow placeholder object with shape TensorShape([Dimension(None)]).

If anyone knows how to compute and register a gradient for tf.cholesky, I would very much appreciate hearing about it.

Answer 1

We discussed this a bit in the answers and comments to this question: TensorFlow cholesky decomposition. It might (?) be possible to port the Theano implementation of CholeskyGrad, provided its semantics are actually what you want. Theano's is based upon Smith's "Differentiation of the Cholesky Algorithm".

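If you implement it as a C++ operation that the Python just calls into, you have unrestricted access to all the looping constructs you could need, and anything Eigen provides. If you want to do it in pure TensorFlow, you can use the control flow ops, such as tf.control_flow_ops.While, to loop.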
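As a rough illustration of the looping structure involved, the sketch below differentiates the textbook column-by-column Cholesky recurrence in forward mode using plain Python loops. It is not the Theano CholeskyGrad code and not a transcription of Smith's paper. The function names cholesky and cholesky_jvp are made up for this example, and A is assumed to be a small symmetric positive-definite matrix stored as a list of lists.

```python
import math


def cholesky(A):
    """Return lower-triangular L with L L^T = A (A symmetric positive-definite)."""
    n = len(A)  # the loops need the matrix size up front
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L


def cholesky_jvp(A, dA):
    """Return (L, dL), where dL is the derivative of L along the symmetric direction dA."""
    n = len(A)
    L = cholesky(A)
    dL = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Differentiate L[j][j] = sqrt(A[j][j] - sum_k L[j][k]^2).
        s = dA[j][j] - 2.0 * sum(L[j][k] * dL[j][k] for k in range(j))
        dL[j][j] = s / (2.0 * L[j][j])
        for i in range(j + 1, n):
            # Differentiate L[i][j] * L[j][j] = A[i][j] - sum_k L[i][k] * L[j][k].
            s = dA[i][j] - sum(
                dL[i][k] * L[j][k] + L[i][k] * dL[j][k] for k in range(j)
            )
            dL[i][j] = (s - L[i][j] * dL[j][j]) / L[j][j]
    return L, dL
```

Both functions read n = len(A) before looping, which is the shape dependence mentioned in the question. In a C++ op these would be ordinary loops; in pure TensorFlow they would have to be expressed with the control flow ops.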

Once you know the actual formula you want to apply, the answer linked here shows how to implement and register a gradient for an op in TensorFlow.
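For concreteness, here is one candidate formula written out in plain Python: a closed-form reverse-mode expression that maps a gradient with respect to L to a symmetrized gradient with respect to A. This is a sketch under stated assumptions, not Smith's loop-based algorithm and not Theano's CholeskyGrad. It assumes A is treated as symmetric (other conventions differentiate only the lower triangle), and the names cholesky_vjp, phi, matmul, transpose and solve_upper are hypothetical helpers for this example.

```python
import math


def cholesky(A):
    """Return lower-triangular L with L L^T = A (A symmetric positive-definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L


def transpose(X):
    return [list(row) for row in zip(*X)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def phi(X):
    """Keep the strict lower triangle, halve the diagonal, zero the rest."""
    n = len(X)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            out[i][j] = X[i][j]
        out[i][i] = 0.5 * X[i][i]
    return out


def solve_upper(U, B):
    """Solve U X = B by back substitution, with U upper triangular."""
    n, m = len(U), len(B[0])
    X = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for c in range(m):
            s = B[i][c] - sum(U[i][k] * X[k][c] for k in range(i + 1, n))
            X[i][c] = s / U[i][i]
    return X


def cholesky_vjp(L, L_bar):
    """Given L and the gradient L_bar w.r.t. L, return a symmetric gradient w.r.t. A."""
    n = len(L)
    Lt = transpose(L)
    P = phi(matmul(Lt, L_bar))                    # P = Phi(L^T L_bar)
    Y = solve_upper(Lt, P)                        # Y = L^{-T} P
    S = transpose(solve_upper(Lt, transpose(Y)))  # S = Y L^{-1}
    # Symmetrize, since A is assumed symmetric.
    return [[0.5 * (S[i][j] + S[j][i]) for j in range(n)] for i in range(n)]
```

Every step here is a matrix product, a triangular solve, or an elementwise mask, so in principle the same computation does not need per-element symbolic loops. Whether its semantics match what you want from the gradient is worth checking numerically before registering it.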

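You could also create an issue on GitHub to request this feature, though, of course, you'd probably get it faster if you implement it yourself and then send in a pull request. :)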
