Tensorflow with Buckets Error

Question

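I am trying to train a sequence-to-sequence model using TensorFlow. I saw in the tutorials that buckets help speed up training. So far I can train using just one bucket, and also using just one GPU and multiple buckets with more or less out-of-the-box code, but when I try to use multiple buckets with multiple GPUs, I get this error:

Invalid argument: You must feed a value for placeholder tensor 'gpu_scope_0/encoder50_gpu0' with dtype int32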

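From the error, I can tell that I am not declaring the input_feed correctly, so it expects the input to be the size of the largest bucket every time. I am confused about why this is the case, though, because the example I am adapting does the same thing when it initializes the placeholders for the input_feed. As far as I can tell, the tutorial also creates placeholders up to the largest bucket size, but this error does not happen when I use the tutorial's code.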

The following is what I think is the relevant initialization code:

    self.encoder_inputs = [[] for _ in xrange(self.num_gpus)]
    self.decoder_inputs = [[] for _ in xrange(self.num_gpus)]
    self.target_weights = [[] for _ in xrange(self.num_gpus)]
    self.scope_prefix = "gpu_scope"
    for j in xrange(self.num_gpus):
        with tf.device("/gpu:%d" % (self.gpu_offset + j)):
            with tf.name_scope('%s_%d' % (self.scope_prefix, j)) as scope:
                for i in xrange(buckets[-1][0]):  # Last bucket is the biggest one.
                    self.encoder_inputs[j].append(tf.placeholder(tf.int32, shape=[None],
                                                                 name="encoder{0}_gpu{1}".format(i,j)))
                for i in xrange(buckets[-1][1] + 1):
                    self.decoder_inputs[j].append(tf.placeholder(tf.int32, shape=[None],
                                                                 name="decoder{0}_gpu{1}".format(i,j)))
                    self.target_weights[j].append(tf.placeholder(tf.float32, shape=[None],
                                                                 name="weight{0}_gpu{1}".format(i,j)))

    # Our targets are decoder inputs shifted by one.
    self.losses = []
    self.outputs = []

    # The following loss computation creates the neural network. The specified
    # device hosts the trainable tf parameters.
    bucket = buckets[0]
    i = 0
    with tf.device(param_device):
        output, loss = tf.nn.seq2seq.model_with_buckets(self.encoder_inputs[i], self.decoder_inputs[i],
                                                        [self.decoder_inputs[i][k + 1] for k in
                                                         xrange(len(self.decoder_inputs[i]) - 1)],
                                                        self.target_weights[0], buckets,
                                                        lambda x, y: seq2seq_f(x, y, True),
                                                        softmax_loss_function=self.softmax_loss_function)

    bucket = buckets[0]
    self.encoder_states = []
    with tf.device('/gpu:%d' % self.gpu_offset):
        with variable_scope.variable_scope(variable_scope.get_variable_scope(),
                                           reuse=True):
            self.encoder_outputs, self.encoder_states = get_encoder_outputs(self,
                                                                            self.encoder_inputs[0])

    if not forward_only:
        self.grads = []
    print ("past line 297")
    done_once = False
    for i in xrange(self.num_gpus):
        with tf.device("/gpu:%d" % (self.gpu_offset + i)):
            with tf.name_scope("%s_%d" % (self.scope_prefix, i)) as scope:
                with variable_scope.variable_scope(variable_scope.get_variable_scope(), reuse=True):
                    #for j, bucket in enumerate(buckets):
                    output, loss = tf.nn.seq2seq.model_with_buckets(self.encoder_inputs[i],
                                                                    self.decoder_inputs[i],
                                                                    [self.decoder_inputs[i][k + 1] for k in
                                                                     xrange(len(self.decoder_inputs[i]) - 1)],
                                                                    self.target_weights[i], buckets,
                                                                    lambda x, y: seq2seq_f(x, y, True),
                                                                    softmax_loss_function=self.softmax_loss_function)

                    self.losses.append(loss)
                    self.outputs.append(output)


    # Training outputs and losses.
    if forward_only:
        self.outputs, self.losses = tf.nn.seq2seq.model_with_buckets(
            self.encoder_inputs, self.decoder_inputs,
            [self.decoder_inputs[0][k + 1] for k in xrange(buckets[0][1])],
            self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
            softmax_loss_function=self.softmax_loss_function)
        # If we use output projection, we need to project outputs for decoding.
        if self.output_projection is not None:
            for b in xrange(len(buckets)):
                self.outputs[b] = [
                    tf.matmul(output, self.output_projection[0]) + self.output_projection[1]
                    for output in self.outputs[b]
                    ]
    else:
        self.bucket_grads = []
        self.gradient_norms = []
        params = tf.trainable_variables()
        opt = tf.train.GradientDescentOptimizer(self.learning_rate)
        self.updates = []
        with tf.device(aggregation_device):
            for g in xrange(self.num_gpus):
                for b in xrange(len(buckets)):
                    gradients = tf.gradients(self.losses[g][b], params)
                    clipped_grads, norm = tf.clip_by_global_norm(gradients, max_gradient_norm)
                    self.gradient_norms.append(norm)
                    self.updates.append(
                        opt.apply_gradients(zip(clipped_grads, params), global_step=self.global_step))

The following is the relevant code for feeding in the data:

    input_feed = {}
    for i in xrange(self.num_gpus):
        for l in xrange(encoder_size):
            input_feed[self.encoder_inputs[i][l].name] = encoder_inputs[i][l]
        for l in xrange(decoder_size):
            input_feed[self.decoder_inputs[i][l].name] = decoder_inputs[i][l]
            input_feed[self.target_weights[i][l].name] = target_weights[i][l]

        # Since our targets are decoder inputs shifted by one, we need one more.
        last_target = self.decoder_inputs[i][decoder_size].name
        input_feed[last_target] = np.zeros([self.batch_size], dtype=np.int32)

        last_weight = self.target_weights[i][decoder_size].name
        input_feed[last_weight] = np.zeros([self.batch_size], dtype=np.float32)
    # Output feed: depends on whether we do a backward step or not.

    if not forward_only:
        output_feed = [self.updates[bucket_id], self.gradient_norms[bucket_id], self.losses[bucket_id]]
    else:
        output_feed = [self.losses[bucket_id]]  # Loss for this batch.
        for l in xrange(decoder_size):  # Output logits.
            output_feed.append(self.outputs[0][l])
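
One thing worth checking in the fetch list above: in the construction code, self.updates and self.gradient_norms are flat lists appended GPU by GPU and, within each GPU, bucket by bucket, while self.losses holds one list of per-bucket losses per GPU. A minimal sketch in plain Python (no TensorFlow; the helper names flat_index and fetch_indices are hypothetical) of the index arithmetic that follows from that append order:

```python
def flat_index(gpu, bucket_id, num_buckets):
    """Position of (gpu, bucket_id) in a list built by
    `for g in gpus: for b in buckets: lst.append(...)`."""
    return gpu * num_buckets + bucket_id


def fetch_indices(bucket_id, num_gpus, num_buckets):
    """Flat positions of the entries for one bucket across all GPUs."""
    return [flat_index(g, bucket_id, num_buckets) for g in range(num_gpus)]


# Rebuild the append order of the nested loop to check the formula.
num_gpus, num_buckets = 2, 4
order = [(g, b) for g in range(num_gpus) for b in range(num_buckets)]
assert all(order[flat_index(g, b, num_buckets)] == (g, b) for g, b in order)

print(fetch_indices(1, num_gpus, num_buckets))  # [1, 5]
```

Under this layout, self.updates[bucket_id] only ever addresses entries near the start of the flat list, and self.losses[bucket_id] selects one GPU's whole list of bucket losses rather than a single bucket's loss. Fetching losses for buckets larger than the one being fed would need placeholders that are not in input_feed. Whether this is what triggered the error here is an assumption; the post does not confirm it.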

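Right now I am considering padding every input to the bucket size, but I expect that this would lose some of the advantages of bucketing.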
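
To make that trade-off concrete, here is a rough sketch in plain Python comparing padding a sequence to its own bucket with padding it to the largest bucket, which is how I read the idea above. The bucket sizes, PAD_ID = 0, and the helpers pick_bucket and pad_to are illustrative assumptions, not taken from the code in the question.

```python
PAD_ID = 0  # assumed id of the padding symbol


def pick_bucket(source_len, target_len, buckets):
    """Return the index of the smallest bucket that fits the pair, or None."""
    for bucket_id, (source_size, target_size) in enumerate(buckets):
        if source_len <= source_size and target_len <= target_size:
            return bucket_id
    return None


def pad_to(tokens, size):
    """Right-pad a token list with PAD_ID up to `size` entries."""
    return list(tokens) + [PAD_ID] * (size - len(tokens))


buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]  # illustrative sizes
source = [4, 8, 15]

bucket_id = pick_bucket(len(source), 6, buckets)
own = pad_to(source, buckets[bucket_id][0])   # padded to its own bucket
largest = pad_to(source, buckets[-1][0])      # padded to the largest bucket

print(bucket_id, len(own), len(largest))  # 0 5 40
```

In this example the short sequence costs 5 encoder steps in its own bucket but 40 when padded to the largest one, which is the work that bucketing is meant to avoid.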

Answer 1

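It turns out the problem was not with the feeding of the placeholders. Later in my code I was referencing placeholders that had not been initialized. As far as I can tell, the error stopped once I fixed that later problem.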
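
For anyone hitting the same message: a fetch needs every placeholder it transitively depends on to be present in the feed, so the error can come from what is fetched rather than from what is fed. Below is a toy sketch in plain Python of that check. It uses a hand-written dependency dictionary instead of a real TensorFlow graph, and the names required_placeholders, missing_feeds, and the node labels are all hypothetical.

```python
def required_placeholders(fetch, inputs_of, placeholders):
    """Return the placeholders that `fetch` transitively depends on.

    inputs_of maps a node name to the names of its direct inputs.
    """
    seen, needed, stack = set(), set(), [fetch]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in placeholders:
            needed.add(node)
        stack.extend(inputs_of.get(node, ()))
    return needed


def missing_feeds(fetches, feed_names, inputs_of, placeholders):
    """Placeholders needed by the fetches but absent from the feed."""
    needed = set()
    for fetch in fetches:
        needed |= required_placeholders(fetch, inputs_of, placeholders)
    return sorted(needed - set(feed_names))


# Toy graph: the loss of a small bucket and the loss of a larger bucket.
inputs_of = {
    "loss_b0": ["enc0", "enc1"],
    "loss_b1": ["enc0", "enc1", "enc2", "enc3"],
}
placeholders = {"enc0", "enc1", "enc2", "enc3"}
fed = ["enc0", "enc1"]  # only the small bucket's inputs are fed

print(missing_feeds(["loss_b0"], fed, inputs_of, placeholders))
# []
print(missing_feeds(["loss_b0", "loss_b1"], fed, inputs_of, placeholders))
# ['enc2', 'enc3']
```

The second call mirrors the situation in the error: the feed is correct for the small bucket, but one of the fetches reaches into a larger bucket's placeholders.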

multi-gpu

tensorflow