Visualizing attention activation in Tensorflow
Is there a way to visualize the attention weights of a TensorFlow seq2seq model on some input, like in the figure from the link above (from Bahdanau et al., 2014)? I have found TensorFlow's github issue on this, but I could not find out how to fetch the attention mask during a session.
I would also like to visualize the attention weights of the TensorFlow seq2seq ops for my text-summarization task, and I think a temporary solution is to use session.run() to evaluate the attention mask tensor, as mentioned above. Interestingly, the original seq2seq.py ops are considered a legacy version and are not easy to find on github, so I just used the seq2seq.py file from the 0.12.0 wheel distribution and modified it. To draw the heatmap I used the Matplotlib package, which is very convenient.
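The key point is that once the mask is exposed as a graph tensor, session.run() can fetch it like any other output. Here is a minimal, self-contained sketch of that pattern (a toy softmax stands in for the real attention tensor, so all names here are illustrative, not from the actual model):

import numpy as np
import tensorflow as tf

# Toy graph: a softmax over 5 "encoder positions", standing in for the
# attention mask tensor `a` returned by the modified attention() below.
scores = tf.placeholder(tf.float32, [1, 5])
attention_mask = tf.nn.softmax(scores)

with tf.Session() as sess:
    attn = sess.run(attention_mask,
                    {scores: np.random.randn(1, 5).astype(np.float32)})
    print(attn)  # shape (1, 5); each row sums to 1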
I modified the code as follows:
https://github.com/rockingdingo/deepnlp/tree/master/deepnlp/textsum#attention-visualization
# Find the attention mask tensor in the function attention_decoder() -> attention().
# Add the attention mask tensor to the 'return' statement of every function that
# calls attention_decoder(), all the way up to model_with_buckets(), which is the
# final function I use for bucket training.
def attention(query):
    """Put attention masks on hidden using hidden_features and query."""
    ds = []  # Results of attention reads will be stored here.
    # some code
    for a in xrange(num_heads):
        with variable_scope.variable_scope("Attention_%d" % a):
            # some code
            s = math_ops.reduce_sum(v[a] * math_ops.tanh(hidden_features[a] + y),
                                    [2, 3])
            # This is the attention mask tensor we want to extract
            a = nn_ops.softmax(s)
            # some code
    # Add 'a' to the return statement (note: with num_heads > 1, only the
    # last head's mask survives the loop, because 'a' is reassigned each pass)
    return ds, a

# Modify the model.step() function so it also returns the mask tensors
self.outputs, self.losses, self.attn_masks = seq2seq_attn.model_with_buckets(…)

# Use session.run() to evaluate the attention masks for the chosen bucket
attn_out = session.run(self.attn_masks[bucket_id], input_feed)
attn_matrix = ...

# Use the plot_attention function in eval.py to visualize the 2D ndarray during prediction.
eval.plot_attention(attn_matrix[0:ty_cut, 0:tx_cut], X_label=X_label, Y_label=Y_label)
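The plot_attention helper from eval.py is not reproduced above; as a rough sketch, a Matplotlib heatmap function matching the call above could look like the following (the signature simply mirrors that call, and the convention of decoder steps as rows and encoder positions as columns is my assumption):

import matplotlib.pyplot as plt
import numpy as np

def plot_attention(attn_matrix, X_label=None, Y_label=None):
    """Draw a heatmap of a 2D attention matrix.

    attn_matrix -- ndarray of shape (output_steps, input_steps)
    X_label     -- input (encoder) tokens, one per column
    Y_label     -- output (decoder) tokens, one per row
    """
    fig, ax = plt.subplots()
    im = ax.imshow(attn_matrix, aspect='auto', interpolation='nearest')
    fig.colorbar(im, ax=ax)  # color scale for the attention weights
    if X_label is not None:
        ax.set_xticks(np.arange(len(X_label)))
        ax.set_xticklabels(X_label, rotation=90)
    if Y_label is not None:
        ax.set_yticks(np.arange(len(Y_label)))
        ax.set_yticklabels(Y_label)
    ax.set_xlabel("input sequence")
    ax.set_ylabel("output sequence")
    plt.tight_layout()
    plt.show()

# If attn_out is a list with one [batch_size, attn_length] array per decoder
# step (my assumption about its shape), the map for the first example in the
# batch could be assembled as:
#   attn_matrix = np.stack([step[0] for step in attn_out])

Using interpolation='nearest' keeps each cell crisp, which makes it easier to read off which source token each output step attends to.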
And probably in the future TensorFlow will have a better way to extract and visualize the attention weight maps. Any thoughts?