Keras BatchNormalization and sample weights
I am trying the training and evaluation example from the TensorFlow website.
Specifically, this part:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
y_train = y_train.astype('float32')
y_test = y_test.astype('float32')

def get_uncompiled_model():
    inputs = keras.Input(shape=(784,), name='digits')
    x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Dense(64, activation='relu', name='dense_2')(x)
    outputs = layers.Dense(10, activation='softmax', name='predictions')(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

def get_compiled_model():
    model = get_uncompiled_model()
    model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
                  loss='sparse_categorical_crossentropy',
                  metrics=['sparse_categorical_accuracy'])
    return model

sample_weight = np.ones(shape=(len(y_train),))
sample_weight[y_train == 5] = 2.

# Create a Dataset that includes sample weights
# (3rd element in the return tuple).
train_dataset = tf.data.Dataset.from_tensor_slices(
    (x_train, y_train, sample_weight))

# Shuffle and slice the dataset.
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

model = get_compiled_model()
model.fit(train_dataset, epochs=3)
It seems that if I add the batch normalization layer (this line: x = layers.BatchNormalization()(x)), I get the following error:
InvalidArgumentError: The second input must be a scalar, but it has shape [64]
[[{{node batch_normalization_2/cond/ReadVariableOp/Switch}}]]
Any ideas?
The same code works for me.
The only lines I changed were:
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3)
to model.compile(optimizer=keras.optimizers.RMSprop(lr=1e-3)
(this is version-specific)
and
model.fit(train_dataset, epochs=3)
to model.fit(train_dataset, epochs=3, steps_per_epoch=30)
Reason: when you use an iterator as the model's input, you should specify the steps_per_epoch argument.
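Note that steps_per_epoch=30 with a batch size of 64 covers only part of the 60000 training samples in each epoch. As a rough sketch (the helper name steps_for_one_pass is made up for illustration and is not part of Keras), the number of steps needed for one full pass over the data could be computed like this:

```python
import math

def steps_for_one_pass(num_samples, batch_size):
    """Batches needed to see every sample once (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

num_train = 60000  # MNIST training set size used in the question
batch_size = 64    # matches .batch(64) in the question

# Steps required to cover the whole training set in one epoch.
full_epoch_steps = steps_for_one_pass(num_train, batch_size)
print(full_epoch_steps)

# Samples actually consumed per epoch when steps_per_epoch=30.
samples_seen = 30 * batch_size
print(samples_seen)
```

The resulting value could then be passed as steps_per_epoch if a full pass per epoch is what you want.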
If you only want to use sample weights, you don't need tf.data.Dataset; just run:
model.fit(x=x_train, y=y_train, sample_weight=sample_weight, batch_size=64, epochs=3)
It worked for me (once I changed learning_rate to lr, as @ASHu2 mentioned).
Accuracy reached 97% after 3 epochs:
...
57408/60000 [===========================>..] - ETA: 0s - loss: 0.1010 - sparse_categorical_accuracy: 0.9709
58816/60000 [============================>.] - ETA: 0s - loss: 0.1011 - sparse_categorical_accuracy: 0.9708
60000/60000 [==============================] - 2s 37us/sample - loss: 0.1007 - sparse_categorical_accuracy: 0.9709
I used TF 1.14.0 on Windows.
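To make the role of sample_weight more concrete, here is a small NumPy-only sketch of the idea: each sample's loss is scaled by its weight before averaging. The helper names make_sample_weight and weighted_mean_loss and the toy loss values are invented for illustration, and the exact reduction Keras applies internally may differ from this plain mean.

```python
import numpy as np

def make_sample_weight(labels, boosted_class, boost=2.0):
    """Weight 1.0 for every sample, `boost` for samples of `boosted_class`."""
    weights = np.ones(shape=(len(labels),), dtype="float32")
    weights[labels == boosted_class] = boost
    return weights

def weighted_mean_loss(per_sample_loss, sample_weight):
    """Scale each sample's loss by its weight, then average over the batch."""
    return float(np.mean(per_sample_loss * sample_weight))

# Toy batch: two samples of class 5, two of other classes.
labels = np.array([5, 3, 5, 1], dtype="float32")
per_sample_loss = np.array([0.5, 1.0, 0.25, 2.0], dtype="float32")

w = make_sample_weight(labels, boosted_class=5)
print(w)                                       # [2. 1. 2. 1.]
print(weighted_mean_loss(per_sample_loss, w))  # 1.125
```

Errors on the boosted class count twice as much toward the loss, which is the effect the question's sample_weight[y_train == 5] = 2. line is after.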
The problem was solved after I upgraded TensorFlow from version 1.14.1 to 2.0.0-rc1.