Wrong predictions with my own MNIST-like images

I am trying to recognize handwritten digits with a simple architecture. Evaluating on the test set gives an accuracy of 0.9723.

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow import keras
from tensorflow.keras.layers import Dense, Flatten
from sklearn.model_selection import train_test_split


# data split
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# normalizing
x_train = x_train / 255
x_test = x_test / 255

y_train_cat = keras.utils.to_categorical(y_train, 10)
y_test_cat = keras.utils.to_categorical(y_test, 10)

# creating model
model = keras.Sequential([
    Flatten(input_shape=(28, 28)),  # MNIST arrays are (28, 28), with no channel axis
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

x_train_split, x_val_split, y_train_split, y_val_split = train_test_split(x_train, y_train_cat, test_size=0.2)

model.fit(
    x_train_split,
    y_train_split,
    batch_size=32,
    epochs=6,
    validation_data=(x_val_split, y_val_split))

# saving model
model.save('mnist_model.h5')

# test
model.evaluate(x_test, y_test_cat)

But when I try to recognize my own digits (0 to 9), some of them are not recognized correctly: [image: numbers and prediction above]

This is the code I used:

from tensorflow.keras.models import load_model
from tensorflow.keras.datasets import mnist
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

model = load_model('mnist_model.h5')

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_test = x_test / 255
y_test_cat = keras.utils.to_categorical(y_test, 10)

model.evaluate(x_test, y_test_cat)

# project_imgs/0.png ... project_imgs/9.png
filenames = [f'project_imgs/{digit}.png' for digit in range(10)]

data = []
data_eds = []

for file in filenames:
    picture = Image.open(file).convert('L')
    pic_r = picture.resize((28, 28))
    pic = np.array(pic_r)
    pic = 255 - pic
    pic = pic / 255
    pic_eds = np.expand_dims(pic, axis=0)

    data.append(pic)
    data_eds.append(pic_eds)

plt.figure(figsize=(10, 5))
for i in range(10):
    ax = plt.subplot(2, 5, i+1)
    ax.set_title(f'Looks like {np.argmax(model.predict(data_eds[i]))}')

    plt.xticks([])
    plt.yticks([])

    plt.imshow(data[i], cmap=plt.cm.binary)
plt.show()

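I don't understand why this happens. Could it be because of the pictures? I have noticed that MNIST images are blacker, not as gray as mine. Or is it because of the size of the digit inside the 28x28 square?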

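It may be a difference between the datasets. MNIST digits usually have more solid, thicker strokes than your own digits. That is the only thing I can think of, because your code looks fine.

The solution is either to make your digits look more like MNIST digits, or to build a large enough dataset from your own digits and train the model on it.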
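One low-effort way to make gray digits look more like MNIST is to stretch their contrast after inversion and scaling to [0, 1]. The sketch below is only an illustration of that idea: the function name `stretch_contrast` and the `background_cutoff` value are assumptions made for this example, not tuned settings or part of the original code.

```python
import numpy as np


def stretch_contrast(img, background_cutoff=0.2):
    """Rescale a light-on-dark digit in [0, 1] so its brightest ink is 1.0.

    `background_cutoff` is an illustrative threshold, not a tuned value:
    pixels that end up below it are treated as paper noise and set to 0.
    """
    img = np.asarray(img, dtype=np.float32)
    peak = img.max()
    if peak == 0:
        return img  # blank image, nothing to stretch
    out = img / peak                     # brightest stroke pixel becomes 1.0
    out[out < background_cutoff] = 0.0   # suppress faint gray background
    return out
```

In the loading loop this would be applied right after `pic = pic / 255`, for example `pic = stretch_contrast(pic)`. It does not thicken strokes, so very thin pen lines may still need a thicker brush or a larger source image.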

You used the following line when processing the image after loading it:

pic = 255 - pic

This preprocessing step is not present in the training step. Perhaps that is why most of the images are misclassified.
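Whether the inversion is needed depends on the polarity of the input. In the arrays returned by `mnist.load_data()` the background is 0 and the strokes approach 255, so dark ink on white paper has to be inverted, while an image that is already light-on-dark must not be. A minimal sketch of such a check follows; the helper names `border_mean` and `to_mnist_polarity` are made up for this example, and the heuristic assumes that the outer pixel frame of the image is background.

```python
import numpy as np


def border_mean(img):
    """Mean intensity of the outermost pixel frame of a 2-D image."""
    img = np.asarray(img, dtype=np.float32)
    border = np.concatenate(
        [img[0, :], img[-1, :], img[1:-1, 0], img[1:-1, -1]]
    )
    return float(border.mean())


def to_mnist_polarity(img, max_value=255):
    """Return the image as a light digit on a dark background.

    Inverts only when the border is bright, i.e. dark ink on light paper.
    """
    img = np.asarray(img, dtype=np.float32)
    if border_mean(img) > max_value / 2:
        return max_value - img
    return img
```

With this, `pic = to_mnist_polarity(np.array(pic_r))` could stand in for the unconditional `pic = 255 - pic`.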

OK, the key was processing the images. I wrote code that recognizes 9 of the 10 images, but it still fails to recognize the digit "9".

for file in filenames:
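    # White 28x28 canvas; the shrunken digit is pasted onto it below
    img = Image.new('RGBA', size=(28, 28), color='white')
    number = Image.open(file).convert('RGBA')
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    number_res = number.resize((20, 20), resample=Image.LANCZOS)\
        .rotate(6, expand=1, fillcolor='white')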
    img.paste(number_res, (4, 4))
    img = img.convert('L')
    img = np.array(img)
    img = 255 - img
    img = img / 255
    img_eds = np.expand_dims(img, axis=0)

    data.append(img)
    data_eds.append(img_eds)

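Then I worked on it in Photoshop and it worked. As far as I understand, the "9" was not recognized because the horizontal distance between the end of the tail and the loop is quite large, so the digit could not be placed in the center. Final result: [image]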
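The centering problem with the "9" can also be handled in code. The original MNIST digits were size-normalized to fit a 20x20 box and then positioned in the 28x28 image by their center of mass, so a long tail moves the bounding box but not the placement. The sketch below approximates that procedure. The function name `center_digit` and its parameters are assumptions made for this example, and it expects an already inverted image (background 0, ink up to 1).

```python
import numpy as np
from PIL import Image


def center_digit(img, box=20, canvas=28):
    """Fit a light-on-dark digit into a `box` square and center its mass.

    `img` is a 2-D array with values in [0, 1] and a background of 0.
    """
    img = np.clip(np.asarray(img, dtype=np.float32), 0.0, 1.0)
    blank = np.zeros((canvas, canvas), dtype=np.float32)

    # Crop to the bounding box of all non-zero pixels.
    rows = np.flatnonzero(img.max(axis=1) > 0)
    cols = np.flatnonzero(img.max(axis=0) > 0)
    if rows.size == 0:
        return blank  # nothing drawn
    crop = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Scale the longer side to `box` pixels, keeping the aspect ratio.
    scale = box / max(crop.shape)
    new_h = max(1, int(round(crop.shape[0] * scale)))
    new_w = max(1, int(round(crop.shape[1] * scale)))
    pil = Image.fromarray(np.uint8(np.round(crop * 255)))
    pil = pil.resize((new_w, new_h), Image.LANCZOS)
    small = np.asarray(pil, dtype=np.float32) / 255

    total = small.sum()
    if total == 0:
        return blank  # ink vanished during resampling

    # Intensity-weighted mean position (center of mass) of the digit.
    cy = (small.sum(axis=1) * np.arange(new_h)).sum() / total
    cx = (small.sum(axis=0) * np.arange(new_w)).sum() / total

    # Offset that puts the center of mass on the canvas center,
    # clamped so the digit stays inside the canvas.
    top = int(round((canvas - 1) / 2 - cy))
    left = int(round((canvas - 1) / 2 - cx))
    top = min(max(top, 0), canvas - new_h)
    left = min(max(left, 0), canvas - new_w)

    out = blank.copy()
    out[top:top + new_h, left:left + new_w] = small
    return out
```

Unlike the fixed `resize((20, 20))` and `paste(..., (4, 4))` steps, this keeps the aspect ratio of the digit and chooses the paste offset per image. It would be called on the inverted, scaled array, for example `img = center_digit(img)` before `np.expand_dims`.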