How to implement a next_batch() function for custom data in Python
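I am currently working on the cats vs. dogs classification task on Kaggle by implementing a deep convolutional network. The following lines of code are used for data preprocessing: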
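import os
from random import shuffle

import cv2
import numpy as np
from tqdm import tqdm

# TRAIN_DIR and IMG_SIZE are assumed to be defined elsewhere

def label_img(img):
    # Assumes file names such as 'cat.0.jpg': the class is the third field from the end
    word_label = img.split('.')[-3]
    if word_label == 'cat': return [1,0]
    elif word_label == 'dog': return [0,1]

def create_train_data():
    training_data = []
    for img in tqdm(os.listdir(TRAIN_DIR)):
        label = label_img(img)
        path = os.path.join(TRAIN_DIR,img)
        # cv2.resize takes the target size as a single (width, height) tuple
        img = cv2.resize(cv2.imread(path,cv2.IMREAD_GRAYSCALE),(IMG_SIZE,IMG_SIZE))
        training_data.append([np.array(img),np.array(label)])
    shuffle(training_data)
    return training_data

train_data = create_train_data()
X_train = np.array([i[0] for i in train_data]).reshape(-1, IMG_SIZE,IMG_SIZE,1)
Y_train = np.asarray([i[1] for i in train_data])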
I want to implement a function that replicates the following call from the TensorFlow Deep MNIST tutorial:
batch = mnist.train.next_batch(100)
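This code is a good example of how to write a function that generates batches.
To explain briefly, you just need to create two arrays for x_train and y_train, for example: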
batch_inputs = np.ndarray(shape=(batch_size), dtype=np.int32)
batch_labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
and fill in the training data like this:
batch_inputs[i] = ...
batch_labels[i, 0] = ...
Finally, pass the batch to the session:
_, loss_val = session.run([optimizer, loss], feed_dict={train_inputs: batch_inputs, train_labels:batch_labels})
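The steps above can be sketched as one small helper that preallocates the two batch arrays and fills them row by row. This is only an illustration: `fill_batch`, `X_demo`, and `Y_demo` are hypothetical names, the dummy arrays merely stand in for X_train and Y_train, and wrapping around at the end of the data is an assumption, not something the answer specifies.

```python
import numpy as np

def fill_batch(X, Y, start, batch_size):
    """Copy batch_size consecutive samples, beginning at `start`, into new arrays."""
    # Preallocate arrays with the per-sample shape and dtype of the data
    batch_inputs = np.empty((batch_size,) + X.shape[1:], dtype=X.dtype)
    batch_labels = np.empty((batch_size,) + Y.shape[1:], dtype=Y.dtype)
    for i in range(batch_size):
        j = (start + i) % len(X)  # assumption: wrap around at the end of the data
        batch_inputs[i] = X[j]
        batch_labels[i] = Y[j]
    return batch_inputs, batch_labels

# Dummy stand-ins: 10 "images" of shape 2x2x1 and alternating one-hot labels
X_demo = np.arange(10 * 4).reshape(10, 2, 2, 1).astype(np.float32)
Y_demo = np.eye(2, dtype=np.int32)[np.arange(10) % 2]

batch_inputs, batch_labels = fill_batch(X_demo, Y_demo, start=8, batch_size=4)
```

The returned pair can then be passed through feed_dict exactly as in the session.run call above. For plain NumPy arrays, slicing (X[start:start + batch_size]) does the same job without the Python loop whenever no wrap-around is needed.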
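Besides generating batches, you may also want to randomly reshuffle the data. The loop below does this at the start of each epoch: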
EPOCH = 100
BATCH_SIZE = 128
TRAIN_DATASIZE,_,_,_ = X_train.shape
PERIOD = TRAIN_DATASIZE // BATCH_SIZE  # Number of full batches per epoch (integer division, so range() accepts it)
for e in range(EPOCH):
    idxs = np.random.permutation(TRAIN_DATASIZE)  # shuffled ordering
    X_random = X_train[idxs]
    Y_random = Y_train[idxs]
    for i in range(PERIOD):
        batch_X = X_random[i * BATCH_SIZE:(i+1) * BATCH_SIZE]
        batch_Y = Y_random[i * BATCH_SIZE:(i+1) * BATCH_SIZE]
        sess.run(train,feed_dict = {X: batch_X, Y:batch_Y})
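To get something that is called like mnist.train.next_batch(100), the shuffle-and-slice logic above could be wrapped in a small class. The following is a minimal sketch, not the tutorial's actual implementation: the `DataSet` class, its attributes, and the `seed` parameter are hypothetical, and dropping the incomplete final batch of each epoch is an assumption that mirrors the integer-division loop above.

```python
import numpy as np

class DataSet:
    """Hypothetical helper that serves shuffled mini-batches from in-memory arrays."""

    def __init__(self, images, labels, seed=None):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self._images = images
        self._labels = labels
        self._num_examples = len(images)
        self._rng = np.random.default_rng(seed)
        self._order = self._rng.permutation(self._num_examples)
        self._pos = 0
        self.epochs_completed = 0

    def next_batch(self, batch_size):
        """Return the next (images, labels) pair, reshuffling when an epoch ends."""
        if batch_size > self._num_examples:
            raise ValueError("batch_size exceeds the number of examples")
        if self._pos + batch_size > self._num_examples:
            # Too few samples left: drop the remainder and start a new epoch
            self.epochs_completed += 1
            self._order = self._rng.permutation(self._num_examples)
            self._pos = 0
        idx = self._order[self._pos:self._pos + batch_size]
        self._pos += batch_size
        # Indexing both arrays with the same indices keeps images and labels aligned
        return self._images[idx], self._labels[idx]

# Dummy stand-ins for X_train / Y_train: sample k holds the value k, label is k % 2
X_demo = np.arange(10, dtype=np.float32).reshape(10, 1, 1, 1)
Y_demo = np.eye(2, dtype=np.int32)[np.arange(10) % 2]

train_set = DataSet(X_demo, Y_demo, seed=0)
batches = [train_set.next_batch(4) for _ in range(3)]
```

With the real data, this would be used as train_set = DataSet(X_train, Y_train), followed by batch_X, batch_Y = train_set.next_batch(BATCH_SIZE) inside the training loop, feeding the pair to sess.run as before.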