How to create training and testing data and feed into keras model?

Question

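This may be a very simple question, but as a newcomer to keras and machine learning I have not been able to figure it out. This is a binary classification problem. My code is written in keras with a Tensorflow backend (source: Kaggle).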

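I have a directory containing two folders named "cat" and "dog". Each folder holds many images of 224 x 224 pixels. The images total more than 32 GB. Labels are based on the folder name: if the folder name contains "cat" the label is "0", otherwise it is "1".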

Code snippet (source: Kaggle):

import os

import cv2
import numpy as np
from sklearn.utils import shuffle
from tensorflow.keras import models as Models

def get_images(directory):
    Images = []
    Labels = []
    for labels in os.listdir(directory): #Main directory where each class label is present as a folder name.
        if labels == 'cat': #Folder containing 'cat' images gets the '0' class label.
            label = 0
        elif labels == 'dog':
            label = 1
        else:
            continue #Skip anything that is not one of the two class folders.

        for image_file in os.listdir(os.path.join(directory, labels)): #Extracting the file name of the image from the class label folder
            image = cv2.imread(os.path.join(directory, labels, image_file)) #Reading the image (OpenCV)
            if image is None: #cv2.imread returns None for files it cannot decode.
                continue
            image = cv2.resize(image,(224,224)) #Resize the image; some images are different sizes.
            Images.append(image)
            Labels.append(label)

    return shuffle(Images,Labels,random_state=817328462) #Shuffle the dataset you just prepared.

def get_classlabel(class_code):    
    labels = {0:'cat', 1:'dog'}
    return labels[class_code]

Images, Labels = get_images('./path_of_data_set') #Extract the training images from the folders.
Images = np.array(Images)
Labels = np.array(Labels)

def sequence():
    model = Models.Sequential()
    ...
    return model

model = sequence()
model.summary()

# Train the model, holding out 10% of the data for validation
model.fit(Images, Labels, batch_size=32, epochs=100, validation_split=0.10, verbose=1)

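If the number of .png images is small, my code runs perfectly. The problem appears when I use the 32 GB of image data: I run into memory problems. I have checked many posts on this and found many solutions, but I could not implement them in this code.

Can you tell me how to feed the data into the model so that it does not run out of memory?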

Answer 1

Check here. It has the details. You may need to add a few more lines. https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator
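
As a rough illustration of what those few lines might look like, here is a minimal sketch that lets ImageDataGenerator read the "cat" and "dog" folders from disk one batch at a time instead of loading all 32 GB into a NumPy array. The helper name make_generators and its parameters are made up for this example. It assumes Pillow is installed (flow_from_directory uses it to load images) and that the directory layout is the one described in the question. Note that images loaded this way are RGB, whereas cv2.imread returns BGR.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def make_generators(data_dir, batch_size=32, val_fraction=0.10):
    """Return (train, validation) iterators that load images from disk in batches.

    Hypothetical helper for illustration; data_dir is expected to contain
    the two sub-folders 'cat' and 'dog'.
    """
    # validation_split reserves a fraction of the files in each class folder.
    datagen = ImageDataGenerator(validation_split=val_fraction)

    common = dict(
        directory=data_dir,
        target_size=(224, 224),   # images are resized on the fly
        classes=["cat", "dog"],   # fixes the mapping cat -> 0, dog -> 1
        class_mode="binary",      # one 0/1 label per image
        batch_size=batch_size,
        seed=817328462,
    )
    train_gen = datagen.flow_from_directory(subset="training", shuffle=True, **common)
    val_gen = datagen.flow_from_directory(subset="validation", shuffle=False, **common)
    return train_gen, val_gen
```

With this in place, the training call from the question would become something like model.fit(train_gen, validation_data=val_gen, epochs=100, verbose=1), where train_gen and val_gen come from make_generators('./path_of_data_set'). Only one batch of images is held in memory at a time, and the batch size is set on the generator rather than passed to fit.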

keras

tensorflow

python-3.6

numpy-ndarray