Very low GPU usage
I'm trying to use a single dynamic_rnn to process very long sequences for a classification task.
Here are some of the parameters:
rnn_size=500, seq_max_length=2500, batch_size=50, embedding_size=64, softmax_size=1600.
The code is as follows:
import tensorflow as tf
from tensorflow.python.ops import rnn, rnn_cell

# Embed the input token ids: [batch_size, seq_max_length, embedding_size].
x_vec = tf.nn.embedding_lookup(embedding_matrix_variable, self.x)

lstm_fw_cell = rnn_cell.LSTMCell(num_units=hidden_unit, input_size=word_dim)
lstm_fw_cell = rnn_cell.DropoutWrapper(lstm_fw_cell,
                                       output_keep_prob=self.dropout_keep_prob,
                                       input_keep_prob=self.dropout_keep_prob)

# Run the LSTM over the padded batch; sequence_length stops the
# computation at each example's real length.
outputs, _ = rnn.dynamic_rnn(lstm_fw_cell, x_vec, dtype=tf.float32,
                             sequence_length=self.real_length,
                             swap_memory=False)

# [batch, time, hidden] -> a list of seq_max_length [batch, hidden] tensors.
outputs = tf.transpose(outputs, [1, 0, 2])
outputs = tf.unpack(outputs)

# Pick out, for every example, the output at its last real time step.
output = outputs[0]
one = tf.ones([1, hidden_unit], tf.float32)
with tf.variable_scope("output"):
    tf.get_variable_scope().reuse_variables()
    for i in range(1, len(outputs)):
        ind = self.real_length < (i + 1)   # True once the sequence has already ended
        ind = tf.to_float(ind)
        ind = tf.expand_dims(ind, -1)
        mat = tf.matmul(ind, one)          # broadcast the mask across hidden units
        output = tf.add(tf.mul(output, mat), tf.mul(outputs[i], 1.0 - mat))

# Softmax classifier with cross-entropy loss and L2 weight decay.
y_prediction = tf.matmul(output, w_h2y) + b_h2y
y_prediction = tf.nn.softmax(y_prediction)
weight_decay = L2 * (tf.nn.l2_loss(w_h2y) + tf.nn.l2_loss(b_h2y))
self.cost = tf.reduce_mean(-tf.reduce_sum(self.y * tf.log(y_prediction + 1e-10))) + weight_decay
self.optimizer = tf.train.AdamOptimizer(alpha).minimize(self.cost)
GPU utilization on a TITAN is only about 5%, while CPU utilization is around 150%. I'm not sure what the problem is.
As Yaroslav pointed out, this is a hard question to answer, because it requires profiling your code (or someone lucky enough to recognize the problem on sight). This comment on the GitHub issues is a good starting point for profiling, as is the new TensorFlow Performance page.
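As a minimal sketch of what that profiling looks like in old-style TensorFlow (the tf.unpack / tf.mul era the question's code targets), you can trace one training step with a RunMetadata object and dump a Chrome timeline; the sess, feed, and model names below are placeholders, not part of the original post:

import tensorflow as tf
from tensorflow.python.client import timeline

# log_device_placement prints which device each op was assigned to,
# a quick first check that the heavy ops actually landed on the GPU.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
sess.run(tf.initialize_all_variables())

# Trace a single training step in full detail.
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
sess.run(model.optimizer, feed_dict=feed,
         options=run_options, run_metadata=run_metadata)

# Write a Chrome trace with per-op timings and device placement.
tl = timeline.Timeline(run_metadata.step_stats)
with open('timeline.json', 'w') as f:
    f.write(tl.generate_chrome_trace_format())

Opening timeline.json in chrome://tracing shows one row per device, so a GPU that sits mostly idle while the CPU stays busy is visible at a glance.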