Is it possible to use vector methods to shift images stored in a numpy ndarray for data augmentation?

Question

Background: this is one of the exercises in Aurélien Géron's textbook Hands-On Machine Learning.

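The question is: write a function that can shift an MNIST image in any direction (left, right, up, down) by one pixel. Then, for each image in the training set, create four shifted copies (one per direction) and add them to the training set.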

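My thought process:

I have a numpy array of size (59500, 784) in X_train (each row is a (28, 28) image). For each row of X_train:
1. Reshape the row to (28, 28)
2. For each direction (up, down, left, right):
  1. Shift the image and reshape it back to a flat row of 784 values
  2. Write the row to an initially empty array
Finally, append the new array to X_train.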

My code:

import numpy as np
from scipy.ndimage import shift  # the scipy.ndimage.interpolation import path is deprecated

def shift_and_append(X, n):
    x_arr = np.zeros((1, 784))
    for i in range(n):
        for j in range(-1,2):
            for k in range(-1,2):
                if j!=k and j!=-k:
                    x_arr = np.append(x_arr, shift(X[i,:].reshape(28,28), [j, k]).reshape(1, 784), axis=0)
    return np.append(X, x_arr[1:,:], axis=0)

X_train_new = shift_and_append(X_train, X_train.shape[0])
y_train_new = np.append(y_train, np.repeat(y_train, 4), axis=0)

It takes a long time to run. I feel like this is brute-forcing it. Is there an efficient, vectorized way to achieve this?

Answer 1

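Three nested for loops with an if condition, reshaping and appending on every iteration, is clearly not a good idea; numpy.roll does the job neatly in a vectorized way: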

import numpy as np
import matplotlib.pyplot as plt 
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train.shape
# (60000, 28, 28)

# plot an original image
plt.gray() 
plt.matshow(x_train[0]) 
plt.show()

Let's first demonstrate the operations:

# one pixel down:
x_down = np.roll(x_train[0], 1, axis=0)
plt.gray() 
plt.matshow(x_down) 
plt.show()

# one pixel up:
x_up = np.roll(x_train[0], -1, axis=0)
plt.gray() 
plt.matshow(x_up) 
plt.show()

# one pixel left:
x_left = np.roll(x_train[0], -1, axis=1)
plt.gray() 
plt.matshow(x_left) 
plt.show()

# one pixel right:
x_right = np.roll(x_train[0], 1, axis=1)
plt.gray() 
plt.matshow(x_right) 
plt.show()
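One caveat about the calls above: np.roll is circular, so pixels pushed off one edge reappear on the opposite edge. MNIST digits usually sit inside a blank border, so a one-pixel roll typically looks like a true shift, but that is an assumption about the data. If it matters, the wrapped rows or columns can be blanked afterwards. A minimal sketch, where shift_zero_fill is a hypothetical helper name and not part of NumPy:

```python
import numpy as np

def shift_zero_fill(images, dy, dx):
    """Shift by dy rows and dx columns, filling vacated pixels with 0.

    Works on a single (h, w) image or a batch of shape (n, h, w),
    because it always addresses the last two axes.
    """
    out = np.roll(images, (dy, dx), axis=(-2, -1))  # np.roll returns a new array
    # Blank the rows that np.roll wrapped around from the opposite edge
    if dy > 0:
        out[..., :dy, :] = 0
    elif dy < 0:
        out[..., dy:, :] = 0
    # Same for the wrapped columns
    if dx > 0:
        out[..., :, :dx] = 0
    elif dx < 0:
        out[..., :, dx:] = 0
    return out
```

With this helper, shift_zero_fill(x_train[0], 1, 0) would be the zero-filled counterpart of x_down, and shift_zero_fill(x_train[0], 0, -1) the counterpart of x_left.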

Having confirmed this, we can generate the "right" versions of all the training images simply with:

x_all_right = [np.roll(x, 1, axis=1) for x in x_train]

The other 3 directions work the same way.

Let's confirm that the first image in x_all_right is indeed what we want:

plt.gray() 
plt.matshow(x_all_right[0]) 
plt.show()

You can even avoid the list comprehension in favor of pure NumPy code:

x_all_right = np.roll(x_train, 1, axis=2)

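This is more efficient, although slightly less intuitive: take the corresponding single-image command and increase axis by 1, because axis 0 of x_train indexes the images.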
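Putting the pieces together for the exercise in the question, the four rolled copies and the matching labels can be built without any Python-level loop over images. The sketch below assumes the flattened layout from the question (one row of 784 values per image); augment_with_shifts is a hypothetical helper name, and it uses np.roll, so the wrap-around behaviour of np.roll applies.

```python
import numpy as np

def augment_with_shifts(X, y, side=28):
    """Return X and y extended with four one-pixel rolled copies of every image.

    X: array of shape (n, side * side); y: labels of shape (n,).
    """
    images = X.reshape(-1, side, side)  # axis 0 = image, 1 = rows, 2 = columns
    rolled = [
        np.roll(images, 1, axis=1),   # one pixel down
        np.roll(images, -1, axis=1),  # one pixel up
        np.roll(images, -1, axis=2),  # one pixel left
        np.roll(images, 1, axis=2),   # one pixel right
    ]
    # Stack the original block and the four shifted blocks, then flatten again
    X_aug = np.concatenate([images] + rolled, axis=0).reshape(-1, side * side)
    # The blocks follow one another, so the labels repeat block-wise
    y_aug = np.tile(y, 5)
    return X_aug, y_aug
```

Usage would look like X_train_new, y_train_new = augment_with_shifts(X_train, y_train). Note that the ordering differs from the code in the question, which interleaves the four copies per image: here whole blocks are concatenated, which is why the labels use np.tile rather than np.repeat.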


python

numpy

scipy

mnist