Keras ImageDataGenerator: how to use data augmentation with image paths

Question

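I am working on a CNN model and would like to use some data augmentation, but two problems arise:

My labels are images (my model is a kind of autoencoder, but the expected output images differ from my input images), so I cannot use ImageDataGenerator.flow_from_directory(). I was considering ImageDataGenerator.flow(train_list, y = labels_list), but that leads to my second problem:
Both my input and label datasets are huge, so I would rather work with image paths (which flow() does not handle) than load my whole dataset into a single array and blow up my RAM.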

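How can I handle these two problems properly? From what I have found, there may be two solutions:

Create my own generator: I have heard about the __getitem__ function of the Keras Sequence class, but can it be combined with the ImageDataGenerator class?
Use tf.data or TFRecords, but they seem quite hard to use, and the data augmentation would still have to be implemented.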

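Is there an easier way to solve this simple problem? A simple trick would be to force ImageDataGenerator.flow() to work with a NumPy array of image paths rather than a NumPy array of images, but I am afraid that modifying the Keras/TensorFlow files would have unexpected consequences (since some functions are called from other classes, a local change can quickly turn into a global change across all my notebooks).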

Answer 1

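OK, thanks to this article, I finally found a way to solve these problems. My mistake was that I kept using ImageDataGenerator despite its lack of flexibility. The solution is simply to use another data augmentation tool.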

The author's method can be summarized as follows:

First, create a custom batch generator as a subclass of the Keras Sequence class (which means implementing a __getitem__ function that loads the images from their respective paths).
Second, use the albumentations library for data augmentation. It offers more transformation functions than Imgaug or ImageDataGenerator, while being faster. Moreover, this website lets you test some of its augmentation methods, even with your own images! See this one for the exhaustive list.
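The first step can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the article's code: the class name PairedImageSequence and the load_fn / augment_fn callables are hypothetical names introduced here, and the fallback to object only exists so the snippet still runs when TensorFlow is not installed.

```python
import math

import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:  # fallback so the sketch runs without TensorFlow
    Sequence = object


class PairedImageSequence(Sequence):
    """Yields (input_batch, target_batch), loading images lazily from paths."""

    def __init__(self, input_paths, target_paths, batch_size, load_fn, augment_fn=None):
        super().__init__()
        if len(input_paths) != len(target_paths):
            raise ValueError("input_paths and target_paths must have the same length")
        self.input_paths = list(input_paths)
        self.target_paths = list(target_paths)
        self.batch_size = batch_size
        self.load_fn = load_fn        # hypothetical: path -> H x W x C array
        self.augment_fn = augment_fn  # hypothetical: (image, target) -> (image, target)

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.input_paths) / self.batch_size)

    def __getitem__(self, idx):
        start = idx * self.batch_size
        stop = start + self.batch_size
        inputs, targets = [], []
        pairs = zip(self.input_paths[start:stop], self.target_paths[start:stop])
        for input_path, target_path in pairs:
            image = self.load_fn(input_path)
            target = self.load_fn(target_path)
            if self.augment_fn is not None:
                image, target = self.augment_fn(image, target)
            inputs.append(image)
            targets.append(target)
        # Only the images of the current batch are ever held in memory.
        return np.stack(inputs), np.stack(targets)
```

With real files, load_fn could wrap cv2.imread() followed by the BGR to RGB conversion, and the resulting object can be passed to model.fit() in place of arrays. All images in a batch must share the same shape for np.stack() to work.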

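The downside of this library is that, since it is relatively new, little documentation can be found online, and I spent several hours trying to solve a problem I ran into.

Indeed, when I tried to visualize some augmentation functions, the result was an entirely black image (odd fact: this only happened when I used RandomGamma or RandomBrightnessContrast. With transformation functions such as HorizontalFlip or ShiftScaleRotate, it worked normally).

After an entire half-day trying to find out what was wrong, I finally came up with this solution, which may help you if you try this library: the image loading has to be done with OpenCV (I had been using the load_img and img_to_array functions from tf.keras.preprocessing.image for loading and processing). If anyone can explain why that did not work, I would be glad to hear it.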
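One plausible explanation, offered as an assumption rather than something the source confirms: img_to_array returns a float32 array with values in [0, 255], whereas cv2.imread() returns uint8, and albumentations documents float32 images as lying in [0, 1]. If the pixel-level transforms clip float input to that range, nearly every pixel collapses to 1, and after ToFloat(max_value=255) and the rescale for display the image is black. The NumPy-only sketch below simulates that chain (the clip is the assumed behaviour; simulate_float_pipeline and to_uint8 are hypothetical helpers) and shows a cast that would avoid it.

```python
import numpy as np


def simulate_float_pipeline(image):
    """Mimic the suspected chain on a float image with values in [0, 255]."""
    clipped = np.clip(image, 0.0, 1.0)      # assumption: transform clips floats to [0, 1]
    scaled = clipped / 255.0                # division done by ToFloat(max_value=255)
    return (scaled * 255).astype(np.uint8)  # rescale used before plotting


def to_uint8(image):
    """Cast a float image holding [0, 255] values to uint8 before augmenting."""
    return np.clip(np.rint(image), 0, 255).astype(np.uint8)


float_image = np.array([[0.0, 128.0, 255.0]], dtype=np.float32)
dark = simulate_float_pipeline(float_image)  # every value ends up at 0 or 1
fixed = to_uint8(float_image)                # values preserved, dtype is uint8
```

If this is indeed the cause, casting the output of img_to_array with something like to_uint8() before augmenting should behave the same as loading with OpenCV.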

Anyway, here is my final code to display an augmented image:

!pip install -U git+https://github.com/albu/albumentations > /dev/null && echo "All libraries are successfully installed!"
from albumentations import Compose, HorizontalFlip, RandomBrightnessContrast, ToFloat, RGBShift
import cv2
import matplotlib.pyplot as plt
import numpy as np
from google.colab.patches import cv2_imshow # I work on Google Colab, so I cannot use cv2.imshow()


augmentation = Compose([HorizontalFlip(p = 0.5),
                        RandomBrightnessContrast(p = 1),
                        ToFloat(max_value = 255) # Normalize the pixels values into the [0,1] interval
                        # Feel free to add more !
                        ])

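img = cv2.imread('Your_path_here.jpg')
if img is None: # cv2.imread() returns None instead of raising when the file cannot be read.
    raise FileNotFoundError('Could not read the image, check the path.')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # cv2.imread() loads images in BGR order, so convert to RGB before applying any transformation function.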
img = augmentation(image = img)['image'] # Apply the augmentation functions to the image.
plt.figure(figsize=(7, 7))
plt.imshow((img*255).astype(np.uint8)) # Put the pixels values back to [0,255]. Replace by plt.imshow(img) if the ToFloat function is not used.
plt.show()


'''
To display with cv2_imshow() instead, replace the last three lines with:

img = cv2.normalize(img, None, 255, 0, cv2.NORM_MINMAX, cv2.CV_8UC1) # If ToFloat is used inside Compose(), the pixel values must be put back into [0,255] before displaying them with cv2_imshow(). I could not try cv2.imshow(), but according to the documentation this line would be unnecessary with that function.
cv2_imshow(img)

I don't recommend it though, because cv2_imshow() displays images in BGR format, so some augmentation methods such as RGBShift will not look right.
'''
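Because the labels in the question are images too, geometric transforms have to be applied identically to an input and its target, while the code above augments a single image. Here is a minimal NumPy-only sketch of that idea, using a plain horizontal flip in place of a library transform. paired_random_flip is a hypothetical helper, not part of albumentations or Imgaug.

```python
import numpy as np


def paired_random_flip(image, target, rng, p=0.5):
    """Horizontally flip image and target together with probability p."""
    if rng.random() < p:
        # Reverse the width axis (axis 1 of an H x W x C array) on both arrays,
        # so the input and its label stay aligned.
        image = image[:, ::-1]
        target = target[:, ::-1]
    return image, target


rng = np.random.default_rng(0)
image = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
target = image + 100
aug_image, aug_target = paired_random_flip(image, target, rng, p=1.0)
```

The key point is that a single random draw decides the transform for both arrays. Such a function could be called inside __getitem__ before the batch is assembled. albumentations' Compose also takes an additional_targets argument intended for this use case, which is worth checking in its documentation.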

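Edit:

I ran into several issues with the albumentations library (which I described in this question on GitHub, but I still have no answer). I would therefore rather recommend using Imgaug for data augmentation: it works just fine and is almost as easy to use as albumentations, even though it offers slightly fewer transformation functions.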
