TensorFlow batch training on a one-dimensional NumPy array without converting it into multi-dimensional NumPy arrays
I'm a bit confused about going from a NumPy array to a tensor.
My code:
import os
import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
import numpy as np
import random
n = 2500
y = np.zeros((n), dtype = np.int32)
for i in range(n):
    y[i] = random.randint(0,1)
print "Before Batch Training:"
print "len(y):" , len(y)
print "y: " , y
print "y[9]: " , y[9]
batch_size = 10
num_preprocess_threads = 1
min_queue_examples = 256
y_batch = tf.train.batch([y], batch_size=batch_size, num_threads=num_preprocess_threads, capacity=min_queue_examples + 3 * batch_size, allow_smaller_final_batch=True)
print "After Batch Training:"
print "y_batch:" , y_batch
print "y_batch[9]: " , y_batch[9]
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    y_proccessed = sess.run(y_batch)
    print "After Session Run:"
    print "y_proccessed:" , y_proccessed
    print "y_proccessed[9]: " , y_proccessed[9]
    print "y_proccessed[0][9]: " , y_proccessed[0][9]
    print "y_proccessed[1][9]: " , y_proccessed[1][9]
    print "y_proccessed[2][9]: " , y_proccessed[2][9]
    print "y_proccessed[3][9]: " , y_proccessed[3][9]
    print "y_proccessed[4][9]: " , y_proccessed[4][9]
    print "y_proccessed[5][9]: " , y_proccessed[5][9]
    coord.request_stop()
    coord.join(threads)
    # No explicit sess.close() is needed: the with block closes the session on exit.
Output after running it:
Before Batch Training:
len(y): 2500
y: [0 0 1 ..., 1 1 1]
y[9]: 1
After Batch Training:
y_batch: Tensor("batch:0", shape=(?, 2500), dtype=int32)
y_batch[9]: Tensor("strided_slice:0", shape=(2500,), dtype=int32)
After Session Run:
y_proccessed: [[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]
...,
[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]]
y_proccessed[9]: [0 0 1 ..., 1 1 1]
y_proccessed[0][9]: 1
y_proccessed[1][9]: 1
y_proccessed[2][9]: 1
y_proccessed[3][9]: 1
y_proccessed[4][9]: 1
y_proccessed[5][9]: 1
My confusion is this: shouldn't y_proccessed[9] give "1", just like y[9], instead of [0 0 1 ..., 1 1 1]?
Also, if you look at y_proccessed, it produces
[[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]
...,
[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]
[0 0 1 ..., 1 1 1]]
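It produces the same array repeated redundantly for the first 10 rows. Shouldn't it move on to the next 10 consecutive values for each new batch?
Thanks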
I managed to fix it:
import os
import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
import numpy as np
import random
n = 2500
y = np.zeros((n), dtype = np.int32)
for i in range(n):
    y[i] = random.randint(0,1)
print "Before Batch Training:"
print "len(y):" , len(y)
print "y: " , y
print "y[9]: " , y[9]
batch_size = 10
num_preprocess_threads = 1
min_queue_examples = 256
# The fix: pass enqueue_many=True to tf.train.batch so that each element of y is treated as a separate example
y_batch = tf.train.batch([y], batch_size=batch_size, num_threads=num_preprocess_threads, capacity=min_queue_examples + 3 * batch_size, enqueue_many=True, allow_smaller_final_batch=True)
print "After Batch Training:"
print "y_batch:" , y_batch
print "y_batch[9]: " , y_batch[9]
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    y_proccessed = sess.run(y_batch)
    print "After Session Run:"
    print "y_proccessed:" , y_proccessed
    print "y_proccessed[9]: " , y_proccessed[9]
    print "y_proccessed[0][9]: " , y_proccessed[0][9]
    print "y_proccessed[1][9]: " , y_proccessed[1][9]
    print "y_proccessed[2][9]: " , y_proccessed[2][9]
    print "y_proccessed[3][9]: " , y_proccessed[3][9]
    print "y_proccessed[4][9]: " , y_proccessed[4][9]
    print "y_proccessed[5][9]: " , y_proccessed[5][9]
    coord.request_stop()
    coord.join(threads)
    # No explicit sess.close() is needed: the with block closes the session on exit.
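For anyone wondering why enqueue_many makes the difference, here is how I understand it, as a plain-Python imitation rather than TensorFlow's actual implementation. Without enqueue_many, tf.train.batch treats the whole array as a single example, so a batch is batch_size copies of all 2500 values. With enqueue_many=True, the first dimension indexes the examples, so a batch is batch_size individual values. The two helper functions below are hypothetical names I made up for illustration; they are not part of TensorFlow.

```python
import random

def batch_single_example(example, batch_size):
    """Imitates enqueue_many=False: `example` is ONE example, enqueued repeatedly."""
    return [list(example) for _ in range(batch_size)]

def batch_many_examples(examples, batch_size):
    """Imitates enqueue_many=True: every element of `examples` is its own example."""
    return list(examples[:batch_size])

random.seed(0)
n = 2500
y = [random.randint(0, 1) for _ in range(n)]

single = batch_single_example(y, 10)
many = batch_many_examples(y, 10)

# Each row of `single` is the full array, which matches shape=(?, 2500) in the question.
print("%d rows of %d values" % (len(single), len(single[0])))
print(single[9] == y)        # row 9 is the whole array again

# `many` holds 10 scalars, so index 9 lines up with y[9].
print(len(many))
print(many[9] == y[9])
```

So in the first version y_proccessed[9] is the tenth copy of the full array, and in the fixed version the first batch's element 9 corresponds to y[9] (assuming a single enqueue thread, as in the code above, so the order is preserved).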
I'm leaving some additional resources for reference:
http://ischlag.github.io/2016/11/07/tensorflow-input-pipeline-for-large-datasets/
https://github.com/dennybritz/tf-rnn/blob/master/sequence_example.ipynb
By the way, if anyone has a better solution, please post it.
Thanks
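One possible alternative, if the queue runners are not strictly needed: slice the array yourself and hand each slice to the session (for example through a placeholder and feed_dict). This is only a minimal sketch under that assumption. The name iterate_batches is hypothetical, and the stand-in data below is a plain list so the snippet runs without TensorFlow or NumPy; a one-dimensional NumPy array should work the same way, since only len() and slicing are used.

```python
def iterate_batches(data, batch_size):
    """Yield consecutive slices of `data`; the last slice may be shorter."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

# Stand-in data: 25 labelled positions instead of 2500 random bits.
y = list(range(25))
batches = list(iterate_batches(y, 10))

print(len(batches))                  # number of batches
print([len(b) for b in batches])     # sizes, including the smaller final batch
print(batches[0][9] == y[9])         # element 9 of the first batch is y[9]
```

The smaller final batch here plays the same role as allow_smaller_final_batch=True in the tf.train.batch call.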