Internal variables in BasicRNNCell
I have the sample code below to test BasicRNNCell. I want to obtain its internal matrices so that I can compute the values of output_res and newstate_res with my own code and verify that I can reproduce them.

In the TensorFlow source it says output = new_state = act(W * input + U * state + B). Does anyone know how I can get hold of W and U? (I tried to access cell._kernel, but it is not available.)
$ cat ./main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:
import tensorflow as tf
import numpy as np

batch_size = 4
vector_size = 3
inputs = tf.placeholder(
	tf.float32
	, [batch_size, vector_size]
)
num_units = 2
state = tf.zeros([batch_size, num_units], tf.float32)
cell = tf.contrib.rnn.BasicRNNCell(num_units=num_units)
output, newstate = cell(inputs=inputs, state=state)

X = np.zeros([batch_size, vector_size])
#X = np.ones([batch_size, vector_size])
with tf.Session() as sess:
	sess.run(tf.global_variables_initializer())
	output_res, newstate_res = sess.run([output, newstate], feed_dict={inputs: X})
	print(output_res)
	print(newstate_res)
$ ./main.py
[[ 0. 0.]
[ 0. 0.]
[ 0. 0.]
[ 0. 0.]]
[[ 0. 0.]
[ 0. 0.]
[ 0. 0.]
[ 0. 0.]]
Short answer: you are right that cell._kernel is what you are after. Here is some code that fetches the kernel (and the bias) via the variables property, which most TensorFlow RNN cells expose:
import tensorflow as tf
import numpy as np
batch_size = 4
vector_size = 3
inputs = tf.placeholder(tf.float32, [batch_size, vector_size])
num_units = 2
state = tf.zeros([batch_size, num_units], tf.float32)
cell = tf.contrib.rnn.BasicRNNCell(num_units=num_units)
output, newstate = cell(inputs=inputs, state=state)
print("Output of cell.variables is a list of Tensors:")
print(cell.variables)
kernel, bias = cell.variables
X = np.zeros([batch_size, vector_size])
with tf.Session() as sess:
	sess.run(tf.global_variables_initializer())
	output_, newstate_, k_, b_ = sess.run(
		[output, newstate, kernel, bias], feed_dict={inputs: X})
	print("Output:")
	print(output_)
	print("New State == Output:")
	print(newstate_)
	print("\nKernel:")
	print(k_)
	print("\nBias:")
	print(b_)
Output:
Output of cell.variables is a list of Tensors:
[<tf.Variable 'basic_rnn_cell/kernel:0' shape=(5, 2) dtype=float32_ref>,
<tf.Variable 'basic_rnn_cell/bias:0' shape=(2,) dtype=float32_ref>]
Output:
[[ 0. 0.]
[ 0. 0.]
[ 0. 0.]
[ 0. 0.]]
New State == Output:
[[ 0. 0.]
[ 0. 0.]
[ 0. 0.]
[ 0. 0.]]
Kernel:
[[ 0.41417515 -0.64997244]
[-0.40868729 -0.90995187]
[ 0.62134564 -0.88962835]
[-0.35878009 -0.25680023]
[ 0.35606658 -0.83596271]]
Bias:
[ 0. 0.]
Long answer: you also asked how to get W and U. Below is the implementation of call, copied verbatim, followed by a discussion of where W and U are:
def call(self, inputs, state):
	"""Most basic RNN: output = new_state = act(W * input + U * state + B)."""
	gate_inputs = math_ops.matmul(
		array_ops.concat([inputs, state], 1), self._kernel)
	gate_inputs = nn_ops.bias_add(gate_inputs, self._bias)
	output = self._activation(gate_inputs)
	return output, output
It does not look like there is a W and a U, but they are both in there. Essentially, the first vector_size rows of the kernel are W, and the next num_units rows are U. It may help to look at the element-wise math in LaTeX:
I am using m as a generic batch index, v for vector_size, n for num_units, and b for batch_size. Also, [ ; ] denotes concatenation. Since TensorFlow processes data in batches, the implementations typically right-multiply by the matrices.
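The original answer rendered the math as an image; a plausible reconstruction from the notation above (batch index m, input size v, state size n) is:

```latex
% The kernel K stacks W (v x n) on top of U (n x n), so K has shape (v+n) x n.
\mathrm{output}_{m,:} = \mathrm{act}\!\big(\, [\, x_{m,:} \,;\, h_{m,:} \,]\, K + B \,\big),
\qquad
K = \begin{bmatrix} W \\ U \end{bmatrix}

% Element-wise, for each unit j:
\mathrm{output}_{m,j}
= \mathrm{act}\!\Big( \sum_{i=1}^{v} x_{m,i}\, W_{i,j}
  + \sum_{k=1}^{n} h_{m,k}\, U_{k,j} + B_j \Big),
\quad m = 1, \dots, b
```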
Because this is a very basic RNN, output == new_state. The "history" for the next iteration is simply the output of the current iteration.
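To sanity-check the row-splitting claim, here is a small NumPy-only sketch. The random kernel and zero bias are stand-ins for the k_ and b_ values fetched from the session above, and tanh is assumed because it is BasicRNNCell's default activation:

```python
import numpy as np

batch_size, vector_size, num_units = 4, 3, 2

rng = np.random.default_rng(0)
# Stand-ins for values fetched from the session:
# kernel has shape (vector_size + num_units, num_units), bias has shape (num_units,)
k_ = rng.standard_normal((vector_size + num_units, num_units))
b_ = np.zeros(num_units)
X = rng.standard_normal((batch_size, vector_size))
state = rng.standard_normal((batch_size, num_units))

# What the cell computes: act([inputs ; state] @ kernel + bias)
cell_out = np.tanh(np.concatenate([X, state], axis=1) @ k_ + b_)

# Split the kernel: the first vector_size rows are W, the remaining rows are U
W = k_[:vector_size, :]   # shape (vector_size, num_units)
U = k_[vector_size:, :]   # shape (num_units, num_units)
manual_out = np.tanh(X @ W + state @ U + b_)

print(np.allclose(cell_out, manual_out))  # → True
```

The two results match exactly because [X ; state] @ K = X @ W + state @ U by the block structure of matrix multiplication.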