Reassigning Variables in Tensorflow and scope

Question

I am trying to understand why one of these implementations works and the other does not. I am trying to represent some geometry in tensorflow.

First, a helper file, d_math.py:

#!/usr/bin/env python3

import numpy as np
import tensorflow as tf

dtype = tf.float64

def skew_symmetric(vector):
    #Creates a tensorflow matrix which is a skew-symmetric version of the input vector    
    return tf.stack([(0., -vector[2], vector[1]), (vector[2], 0., -vector[0]), (-vector[1], vector[0], 0.)], axis=0)

This is implementation 1:

#!/usr/bin/env python3
import numpy as np
import tensorflow as tf
import d_math as d
import math
import time


class Joint():
    def __init__(self, axis, pos): #TODO: right now only revolute:
        axis_ = tf.Variable(axis, dtype=d.dtype)
        axis_ /= tf.linalg.norm(axis)
        theta_ = tf.Variable(0.0, dtype=d.dtype) #Always at the 0 angle config
        self.theta_ = theta_
        self.R_ = tf.cos(theta_) * tf.eye(3, dtype=d.dtype) + d.skew_symmetric(axis_) + (1. - tf.cos(theta_)) * tf.einsum('i,j->ij', axis_, axis_)
        


joint = Joint(np.array([1.0, 1.0, 1.0]), 0.0)
init = tf.global_variables_initializer()    

with tf.Session() as session:
    session.run(init)    
    print(joint.R_)
    print(joint.R_.eval())
    joint.theta_ = joint.theta_.assign(math.pi/4.)
    session.run(joint.theta_)
    print(joint.R_.eval())

The version above updates theta, and I then get two evaluations of the rotation matrix: one for theta = 0 and one for theta = pi/4.

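I then tried to refactor my code slightly, adding a global session variable that is created in a separate file, and hiding as much as I can behind the API for now: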

Version 2:

#!/usr/bin/env python3
import numpy as np
import tensorflow as tf
import d_math as d
import math
import time
import session as s


class Joint():
    def __init__(self, axis, pos): #TODO: right now only revolute:
        axis_ = tf.Variable(axis, dtype=d.dtype)
        axis_ = axis_ / tf.linalg.norm(axis)
        theta_ = tf.Variable(0.0, dtype=d.dtype) #Always at the 0 angle config
        self.theta_ = theta_
        self.R_ = tf.cos(theta_) * tf.eye(3, dtype=d.dtype) + d.skew_symmetric(axis_) + (1. - tf.cos(theta_)) * tf.einsum('i,j->ij', axis_, axis_)
        
    def set_theta(self, theta):
        self.theta_.assign(theta)
        s.session.run(self.theta_)
        

joint = Joint(np.array([1.0, 1.0, 1.0]), 0.0)
init = tf.global_variables_initializer()    

with s.session as session:
    session.run(init)  
    print(joint.R_)
    print(joint.R_.eval())
    #joint.theta_ = joint.theta_.assign(math.pi/4.)
    joint.set_theta(math.pi/4.)
    print(joint.R_.eval())

session.py can be seen here:

#!/usr/bin/env python3
import tensorflow as tf

session = tf.Session()

This gives the R matrix for theta = 0 for both evaluations.

Can someone explain to me why implementation 2 does not work?

Answer 1

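tf.assign returns a reference to the updated variable. According to the documentation: Returns: A Tensor that will hold the new value of 'ref' after the assignment has completed. In graph mode the assignment itself only takes place when that returned tensor is run in a session.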

In the first example, you are actually using the updated reference:

joint.theta_ = joint.theta_.assign(math.pi/4.)
session.run(joint.theta_)
print(joint.R_.eval())

In the second example, you are not using the updated reference:

def set_theta(self, theta):
    not_used = self.theta_.assign(theta)
    s.session.run(self.theta_)

My best guess is that it should work if you use the updated reference:

def set_theta(self, theta):
    self.theta_ = self.theta_.assign(theta)
    s.session.run(self.theta_)
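
The difference between running the variable and running the tensor returned by assign can be checked with a minimal sketch like the one below. It assumes a TensorFlow build that ships tensorflow.compat.v1 with graph mode enabled; the names theta, cos_theta and assign_op are illustrative, and cos_theta only stands in for joint.R_.

```python
import math

import tensorflow.compat.v1 as tf

# Assumption: graph (non-eager) execution, as in the question's TF 1.x code.
tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    theta = tf.Variable(0.0, dtype=tf.float64)
    cos_theta = tf.cos(theta)                 # stand-in for joint.R_
    assign_op = theta.assign(math.pi / 4.0)   # only builds the op, runs nothing
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    before = sess.run(cos_theta)

    # Running the variable itself only reads it; the assignment is not executed.
    sess.run(theta)
    unchanged = sess.run(cos_theta)

    # Running the tensor returned by assign() performs the update.
    sess.run(assign_op)
    after = sess.run(cos_theta)

print(before, unchanged, after)
```

In this sketch the assignment is run in its own session.run call, so the later evaluation of cos_theta reads the updated variable.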

Also, it is better not to overwrite the original tensor reference, so I would create a new attribute for the updated variable:

def set_theta(self, theta):
    self.theta_updated_ = self.theta_.assign(theta)
    s.session.run(self.theta_updated_)

# ...
print(self.theta_updated_.eval())  # <<< This should give you updated value

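Important: running print(joint.R_.eval()) may still not give you the updated value, because the operation self.R_ is not forced to depend on the updated reference self.theta_updated_. You may have to use tf.control_dependencies to make the self.R_ operation execute only after the update has completed. For example: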

with tf.control_dependencies([self.theta_updated_]):
    self.R_ = tf.cos(theta_) * # ...
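
The elided snippet above can be expanded into a small runnable sketch, under the same assumption of graph mode through tensorflow.compat.v1. The names theta, theta_updated and cos_after_update are hypothetical, and a plain cosine replaces the full rotation matrix. The variable is read explicitly with read_value() inside the control_dependencies block so that the read itself is created under the dependency.

```python
import math

import tensorflow.compat.v1 as tf

# Assumption: graph (non-eager) execution, as in the question's TF 1.x code.
tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    theta = tf.Variable(0.0, dtype=tf.float64)
    theta_updated = theta.assign(math.pi / 4.0)

    # Ops created inside this block only run after theta_updated has run.
    with tf.control_dependencies([theta_updated]):
        cos_after_update = tf.cos(theta.read_value())

    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    # A single run call: the assignment is triggered as a dependency.
    value = sess.run(cos_after_update)

print(value)
```

Here one session.run call both triggers the assignment and evaluates the dependent op, which is the ordering guarantee the control dependency is meant to provide.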

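Final note: assigning a value to a variable does not automatically tell other operations that they need to wait until the assignment has completed. I found this out the hard way. Here are some snippets I wrote to track how variables behave when tf.assign is used. I recommend carefully reading the snippet named: Optimizing original variables that have been updated using tf.assign. The snippets are self-contained.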
