Read big train/validation/test datasets in tensorflow
What is the correct way to load multiple big datasets into TensorFlow?
I have three big datasets (files), one each for training, validation, and testing. I can load the training set successfully through tf.train.string_input_producer and feed it into a tf.train.shuffle_batch object; I can then iteratively fetch batches of data to optimize my model.
But when I tried to load my validation set the same way, I got stuck: the program keeps reporting an "OutOfRange Error", even though I did not set num_epochs in string_input_producer.
Can anyone shed some light on this? Beyond that, I am also wondering what the correct way is to do training/validation in TensorFlow. Actually, I have not been able to find any example (and I searched a lot) that both trains and tests on a big dataset, which seems very strange to me...
The code snippet is below.
def extract_validationset(filename, batch_size):
    with tf.device("/cpu:0"):
        queue = tf.train.string_input_producer([filename])
        reader = tf.TextLineReader()
        _, line = reader.read(queue)
        line = tf.decode_csv(...)
        label = line[0]
        feature = tf.pack(list(line[1:]))
        l, f = tf.train.batch([label, feature], batch_size=batch_size, num_threads=8)
        return l, f

def extract_trainset(train, batch_size):
    with tf.device("/cpu:0"):
        train_files = tf.train.string_input_producer([train])
        reader = tf.TextLineReader()
        _, train_line = reader.read(train_files)
        train_line = tf.decode_csv(...)
        l, f = tf.train.shuffle_batch(...,
                                      batch_size=batch_size, capacity=50000,
                                      min_after_dequeue=10000, num_threads=8)
        return l, f
....

label_batch, feature_batch = extract_trainset("train", batch_size)
label_eval, feature_eval = extract_validationset("test", batch_size)

with tf.Session() as sess:
    tf.initialize_all_variables().run()
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Loop through training steps.
    for step in xrange(int(num_epochs * train_size) // batch_size):
        feature, label = sess.run([feature_batch, label_batch])
        feed_dict = {train_data_node: feature, train_labels_node: label}
        _, l, predictions = sess.run([optimizer, loss, evaluation], feed_dict=feed_dict)

        # after EVAL_FREQUENCY steps, do evaluation on whole test set
        if step % EVAL_FREQUENCY == 0:
            for eval_step in xrange(steps_per_epoch):
                f, l = sess.run([feature_eval, label_eval])
                true_count += sess.run(evaluation, feed_dict={train_data_node: f, train_labels_node: l})
            print('Precision @ 1: %0.04f' % (true_count / num_examples))
ERROR:
tensorflow.python.framework.errors.OutOfRangeError: FIFOQueue '_5_batch/fifo_queue' is closed and has insufficient elements (requested 334, current size 0)
[[Node: batch = QueueDequeueMany[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
Caused by op u'batch', defined at:
This may be a bit late, but I ran into the same problem. In my case, I was foolishly calling sess.run after I had already closed up shop with coord.request_stop() and coord.join(threads).
Maybe your "train" code has something like coord.request_stop() running, which closes the queues before you try to load the validation data.
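A minimal self-contained sketch of the lifecycle this answer implies, assuming the same queue-based pipeline as in the question (the tiny toy.csv file and the batch size of 2 are made up purely for illustration): every sess.run that pulls from the input queues has to happen between start_queue_runners and coord.request_stop()/coord.join(threads).

import tensorflow as tf

# Toy input so the sketch runs on its own; a real model reads its own files.
with open("toy.csv", "w") as out:
    out.write("0,1.0,2.0\n1,3.0,4.0\n")

queue = tf.train.string_input_producer(["toy.csv"])   # num_epochs defaults to None
reader = tf.TextLineReader()
_, line = reader.read(queue)
label, x1, x2 = tf.decode_csv(line, record_defaults=[[0], [0.0], [0.0]])
label_batch, feature_batch = tf.train.batch([label, tf.pack([x1, x2])], batch_size=2)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        # All dequeues -- training and validation alike -- belong in here,
        # while the queue runners are still alive.
        for step in range(3):
            l, f = sess.run([label_batch, feature_batch])
    finally:
        coord.request_stop()   # queues get closed from this point on
        coord.join(threads)
# Running sess.run([label_batch, feature_batch]) after the stop/join above is
# the pattern this answer describes as producing the OutOfRangeError.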
I tried setting num_epochs=None, and it worked.
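For what it's worth, a short sketch of what the num_epochs argument changes in tf.train.string_input_producer (the filename here is only a placeholder):

# Cycles through the file list indefinitely; the filename queue is never
# closed because of an epoch limit, so validation batches can be dequeued
# as often as needed.
val_files = tf.train.string_input_producer(["validation.csv"], num_epochs=None)

# Stops after one pass over the file. Once exhausted, the downstream batch
# queue is closed and further dequeues raise OutOfRangeError. A finite
# num_epochs also creates a local counter variable, so
# tf.initialize_local_variables() has to be run as well.
val_files_one_pass = tf.train.string_input_producer(["validation.csv"], num_epochs=1)

This is consistent with the answer: with num_epochs=None the producer never runs out, so the only remaining way the validation queue gets closed is a coordinator shutdown like the one described above.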