TFLearn regression, shape incompatibility in loss calculation

Question

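I am working with protein sequences. My goal is to build a convolutional network that predicts three angles for each amino acid in a protein. I'm having trouble debugging a TFLearn DNN model that requires a reshape operation.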

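The input data describes (currently) 25 proteins of varying lengths. To use Tensors I need uniform dimensions, so I pad the empty input cells with zeros. Each amino acid is represented by a 4-dimensional code. The details of that are probably unimportant, except to help you understand the shapes of the Tensors.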
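The padding step can be sketched in plain Python. This is only an illustration under assumed names: pad_proteins and the example code values are hypothetical, not taken from the ProteinDatabase class used in the question.

```python
def pad_proteins(proteins, code_width=4):
    """Zero-pad each protein (a list of per-residue codes) to the longest length."""
    max_len = max(len(p) for p in proteins)
    padded = []
    for p in proteins:
        missing = max_len - len(p)
        # Copy the real residues, then append all-zero codes for the empty cells.
        padded.append([list(code) for code in p]
                      + [[0.0] * code_width for _ in range(missing)])
    return padded

# Two made-up proteins of lengths 2 and 3; the code values are placeholders.
proteins = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0]],
]
padded = pad_proteins(proteins)
shape = (len(padded), len(padded[0]), len(padded[0][0]))
print(shape)
```

With the real data the same idea yields a uniform [proteins, max length, 4] array.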

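The output of the DNN is six numbers, representing the sines and cosines of three angles. To create ordered pairs, the DNN graph reshapes a [..., 6] Tensor to [..., 3, 2]. My target data is encoded the same way. I calculate the loss using cosine distance.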
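As a rough illustration of this encoding, the sketch below uses only the standard library. The helper names encode_angles and mean_cosine_distance are hypothetical. The distance assumes every pair has unit length, in which case 1 - dot(p, a) equals 1 - cos(angle difference).

```python
import math

def encode_angles(angles):
    """Encode each angle (in radians) as a (sin, cos) pair."""
    return [(math.sin(a), math.cos(a)) for a in angles]

def mean_cosine_distance(predicted, actual):
    """Mean of 1 - dot(p, a) over the pairs; assumes unit-length pairs."""
    dists = [1.0 - (p[0] * a[0] + p[1] * a[1])
             for p, a in zip(predicted, actual)]
    return sum(dists) / len(dists)

target = encode_angles([0.0, math.pi / 2, math.pi])
# Identical predictions give a distance of (about) 0.
same = mean_cosine_distance(target, target)
# Predictions that are off by half a turn give a distance of (about) 2.
opposite = mean_cosine_distance(
    target, encode_angles([math.pi, -math.pi / 2, 0.0]))
```

Encoding angles this way avoids the discontinuity at the wrap-around point, since nearby angles always map to nearby (sin, cos) pairs.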

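I built a non-convolutional DNN that showed good initial learning behavior, and it is very similar to the code I will post here. But that model treated three adjacent amino acids in isolation. I want to treat each protein as a unit, with sliding windows 3 amino acids wide at first, and eventually larger.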

Now that I am converting to a convolutional model, I can't seem to get the shapes to match. Here are the working portions of my code:

import tensorflow as tf
import tflearn as tfl

from protein import ProteinDatabase   # don't worry about its details

def backbone_angle_distance(predict, actual):
    with tf.name_scope("BackboneAngleDistance"):
        actual = tfl.reshape(actual, [-1,3,2])
        # Supply the -1 argument for axis that TFLearn can't pass
        loss = tf.losses.cosine_distance(predict, actual, -1, 
               reduction=tf.losses.Reduction.MEAN)
        return loss

# Training data
database = ProteinDatabase("./data")
inp, tgt = database.training_arrays()

# DNN model, convolution only in topmost layer for now
net = tfl.input_data(shape=[None, None, 4]) 
net = tfl.conv_1d(net, 24, 3)
net = tfl.conv_1d(net, 12, 1)
net = tfl.conv_1d(net, 6, 1)
net = tfl.reshape(net, [-1,3,2]) 
net = tf.nn.l2_normalize(net, dim=2)
net = tfl.regression(net, optimizer="sgd", learning_rate=0.1, \
                     loss=backbone_angle_distance)
model = tfl.DNN(net)

# Generate a prediction.  Compare shapes for compatibility.
out = model.predict(inp)
print("\ninp : {}, shape = {}".format(type(inp), inp.shape))
print("out : {}, shape = {}".format(type(out), out.shape))
print("tgt : {}, shape = {}".format(type(tgt), tgt.shape))
print("tgt shape, if flattened by one dimension = {}\n".\
      format(tgt.reshape([-1,3,2]).shape))

The output at this point is:

inp : <class 'numpy.ndarray'>, shape = (25, 543, 4)
out : <class 'numpy.ndarray'>, shape = (13575, 3, 2)
tgt : <class 'numpy.ndarray'>, shape = (25, 543, 3, 2)
tgt shape, if flattened by one dimension = (13575, 3, 2)

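So if I reshape the 4D Tensor tgt, merging its two outermost dimensions into one, out and tgt should match. Since TFLearn's code makes the batches, I try to intercept and reshape the Tensor actual in the first line of my custom loss function, backbone_angle_distance().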
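The shape arithmetic can be checked without TensorFlow. This is a minimal sketch in plain Python; flatten_leading is a hypothetical helper (not part of TFLearn) that mimics what reshape([-1, 3, 2]) does to the shape of a 4D array.

```python
def flatten_leading(shape):
    """Merge the first two axes of a shape, as reshape([-1, 3, 2]) does to a 4D array."""
    return (shape[0] * shape[1],) + tuple(shape[2:])

full_tgt = flatten_leading((25, 543, 3, 2))   # all 25 proteins
batch_tgt = flatten_leading((5, 543, 3, 2))   # one batch of 5 proteins

print(full_tgt)   # (13575, 3, 2)
print(batch_tgt)  # (2715, 3, 2)
```

Note that after flattening, a batch of 5 proteins corresponds to 2715 target rows, so the number of target rows no longer equals the number of input samples in the batch.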

If I add a few lines to attempt model fitting, as follows:

e, b = 1, 5
model.fit(inp, tgt, n_epoch=e, batch_size=b, validation_set=0.2, show_metric=True)

I get the following extra output and error:

---------------------------------
Run id: EEG6JW
Log directory: /tmp/tflearn_logs/
---------------------------------
Training samples: 20
Validation samples: 5
--
--
Traceback (most recent call last):
  File "exp54.py", line 252, in <module>
    model.fit(inp, tgt, n_epoch=e, batch_size=b, validation_set=0.2, show_metric=True)
  File "/usr/local/lib/python3.5/dist-packages/tflearn/models/dnn.py", line 216, in fit
    callbacks=callbacks)
  File "/usr/local/lib/python3.5/dist-packages/tflearn/helpers/trainer.py", line 339, in fit
    show_metric)
  File "/usr/local/lib/python3.5/dist-packages/tflearn/helpers/trainer.py", line 818, in _train
    feed_batch)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 975, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (5, 543, 3, 2) for Tensor 'TargetsData/Y:0', which has shape '(?, 3, 2)'

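Where in my code am I specifying a shape of (?, 3, 2) for TargetsData/Y:0? I know that shape won't work. According to the traceback, I never actually seem to reach the reshape operation in backbone_angle_distance().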

Any advice is appreciated, thanks!

Answer 1

You need to reshape tgt like this: tgt = tgt.reshape([-1,3,2])

The target dimensions should be the same as the network output dimensions, which are defined by this line: net = tfl.reshape(net, [-1,3,2])

Answer 2

Well, it looks like I'm answering my own question.

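I tried various permutations of what Geert suggested, but I could not get anything to work. Reshaping the training data to [-1,3,2] was appropriate when I built the non-convolutional network that came before the one I am discussing here. Eventually I concluded that TFLearn was not going to let me flatten, inside the loss function, the 4D Tensor I needed for the CNN. I needed to add one dimension, as I had before. But now, instead of retaining one dimension (which is what -1 does), I have to retain two.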

Here's my solution.

1) Remove the reshape from the loss function:

def backbone_angle_distance(predict, actual):
    with tf.name_scope("BackboneAngleDistance"):
        # Supply the -1 argument for axis that TFLearn can't pass
        loss = tf.losses.cosine_distance(predict, actual, -1, 
               reduction=tf.losses.Reduction.MEAN)
        return loss

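2) Introduce a variable, shp, that stores the run-time shape of the input Tensor, so that its first two dimensions can be carried into the 4D output: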

window = 3  # convolution width: 3 amino acids, as in the question
net = tfl.input_data(shape=[None, None, 4])
shp = tf.shape(net)  # <--- (new) run-time shape: [batch, length, 4]
net = tfl.conv_1d(net, 24, window)
net = tfl.conv_1d(net, 12, 1)
net = tfl.conv_1d(net, 6, 1)
net = tfl.reshape(net, [shp[0], shp[1], 3, 2])  # <--- (new)
# The (sin, cos) pairs now sit on axis 3, so normalize there rather than on axis 2
net = tf.nn.l2_normalize(net, dim=3)
net = tfl.regression(net, optimizer="sgd", learning_rate=0.1,
                     loss=backbone_angle_distance)
model = tfl.DNN(net)
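To make the intended shapes concrete, here is a small sketch of the reshape-and-normalize step using nested lists instead of Tensors. It assumes the intent is for each (sin, cos) pair to have unit length. The helper names are hypothetical, and leaving all-zero pairs at zero is a choice made for this sketch.

```python
import math

def to_unit_pairs(six):
    """Split six outputs into three (x, y) pairs and scale each pair to unit length."""
    pairs = []
    for i in range(0, 6, 2):
        x, y = six[i], six[i + 1]
        norm = math.hypot(x, y)
        if norm == 0.0:
            pairs.append((0.0, 0.0))  # all-zero (padded) positions stay zero
        else:
            pairs.append((x / norm, y / norm))
    return pairs

def reshape_and_normalize(batch):
    """Map [batch][length][6] nested lists to [batch][length][3][2]."""
    return [[to_unit_pairs(residue) for residue in protein] for protein in batch]

# A made-up batch of 2 proteins, 4 residues each, 6 raw outputs per residue.
batch = [[[3.0, 4.0, 0.0, 2.0, 0.0, 0.0] for _ in range(4)] for _ in range(2)]
result = reshape_and_normalize(batch)
shape = (len(result), len(result[0]), len(result[0][0]), len(result[0][0][0]))
print(shape)
```

The batch and length axes pass through unchanged, which is what allows the targets to be fed in their original 4D layout.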

The shape-related errors I was getting before are now gone. However, if anyone is still following this, I have some further questions.

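a) Did I do this "right"? This algorithm will probably never be trained on a distributed system, because the data set I have is too small to bother. But, as I understand it, anything used by a TensorFlow graph that is not itself a TensorFlow object has the potential to break whatever parallelization optimizations could be performed. Is shp a proper TensorFlow object? What about the elements I obtain from it by slicing?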

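b) If I were working in NumPy, this would look like a job for Python's ellipsis operator. I even used the ellipsis, without thinking about it, when I wrote my initial description of the Tensor shapes at the top of this discussion. Does TensorFlow understand the ellipsis? It could be useful.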
