Use and modify variables in TensorFlow bijectors

The reference paper for TensorFlow Distributions (now Probability) mentions that TensorFlow Variables can be used to construct Bijector and TransformedDistribution objects, i.e.:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

tf.enable_eager_execution()

shift = tf.Variable(1., dtype=tf.float32)
myBij = tfp.bijectors.Affine(shift=shift)

# Normal distribution centered in zero, then shifted to 1 using the bijection
myDistr = tfd.TransformedDistribution(
            distribution=tfd.Normal(loc=0., scale=1.),
            bijector=myBij,
            name="test")

# 2 samples of a normal centered at 1:
y = myDistr.sample(2)
# 2 samples of a normal centered at 0, obtained using inverse transform of myBij:
x = myBij.inverse(y)

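I would now like to modify the shift variable (for example, I might compute the gradient of some likelihood function with respect to shift and update its value), so I do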

shift.assign(2.)
gx = myBij.forward(x)

I expect gx = y + 1, but I see gx = y... Indeed, myBij.shift still evaluates to 1.

If I try to modify the bijector directly, i.e.:

myBij.shift.assign(2.)

I get

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

Computing gradients does not work as expected either:

with tf.GradientTape() as tape:
    gx = myBij.forward(x)
grad = tape.gradient(gx, shift)

This gives None, along with the following exception when the script ends:

Exception ignored in: <bound method GradientTape.__del__ of <tensorflow.python.eager.backprop.GradientTape object at 0x7f529c4702e8>>
Traceback (most recent call last):
File "~/.local/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 765, in __del__
AttributeError: 'NoneType' object has no attribute 'context'

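What am I missing here?

Edit: I got it working with a graph/session, so it seems there is an issue with eager execution...

Note: I am using tensorflow version 1.12.0 and tensorflow_probability version 0.5.0.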

If you are using eager mode, you need to recompute everything from the variable forward. It is best to capture this logic in a function:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

tf.enable_eager_execution()

shift = tf.Variable(1., dtype=tf.float32)
def f():
  myBij = tfp.bijectors.Affine(shift=shift)

  # Normal distribution centered in zero, then shifted to 1 using the bijection
  myDistr = tfd.TransformedDistribution(
            distribution=tfd.Normal(loc=0., scale=1.),
            bijector=myBij,
            name="test")

  # 2 samples of a normal centered at 1:
  y = myDistr.sample(2)
  # 2 samples of a normal centered at 0, obtained using inverse
  # transform of myBij:
  x = myBij.inverse(y)
  return x, y
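# First evaluation: the bijector is built with shift == 1.
x, y = f()
shift.assign(2.)
# Calling f() again rebuilds the bijector, so it sees shift == 2.
# Note that f() also draws fresh samples and returns (x, y) in that order.
x2, y2 = f()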
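
To compare against the original samples, as the question does with gx = y + 1, x has to stay fixed while only the transformation is recomputed. Here is a minimal sketch of that idea, assuming TensorFlow 2.x (where eager execution is on by default; on 1.x, call tf.enable_eager_execution() first, as in the question). The helper forward(x) is hypothetical: it computes x + shift, which is what an Affine bijector with only a shift does, and stands in for the tfp call purely for illustration.

```python
import tensorflow as tf

shift = tf.Variable(1., dtype=tf.float32)

def forward(x):
    # Hypothetical stand-in for myBij.forward: reads the variable's
    # current value on every call instead of a value captured earlier.
    return x + shift

# Fixed inputs, chosen arbitrarily for the illustration.
x = tf.constant([0.5, -1.5], dtype=tf.float32)

y = forward(x)       # computed while shift == 1
shift.assign(2.)
gx = forward(x)      # recomputed while shift == 2

# Elementwise difference between the two evaluations.
diff = (gx - y).numpy()
```

Because forward reads shift each time it is called, every entry of diff comes out as 1 here, which is the behaviour the question expected.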

Regarding gradients, you need to wrap the call to f() in a GradientTape.
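
A minimal sketch of the gradient case, assuming TensorFlow 2.x and a hypothetical forward(x) helper that computes x + shift as a stand-in for building the bijector and calling its forward method (it is not the tfp API). The point it illustrates is that the computation reading shift runs inside the tape's context, so the tape can record the dependence on the variable.

```python
import tensorflow as tf

shift = tf.Variable(1., dtype=tf.float32)
x = tf.constant([0.5, -1.5], dtype=tf.float32)

def forward(x):
    # Hypothetical stand-in for constructing the bijector from `shift`
    # and applying its forward transform.
    return x + shift

with tf.GradientTape() as tape:
    # Trainable variables are watched by the tape automatically,
    # as long as they are read inside this block.
    gx = forward(x)

# For a non-scalar target, tape.gradient differentiates the sum of its entries.
grad = tape.gradient(gx, shift)
```

With the function from the answer, the same pattern applies: place the call to f() inside the with block, then ask the tape for the gradient with respect to shift.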