TF accuracy score and confusion matrix disagree. Is TensorFlow shuffling data on each access of BatchDataset?
The accuracy reported by model.evaluate() differs significantly from the accuracy calculated from the sklearn or TF confusion matrix.
from sklearn.metrics import confusion_matrix
...
training_data, validation_data, testing_data = load_img_datasets()
# These ^ are tensorflow.python.data.ops.dataset_ops.BatchDataset
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = create_model(INPUT_SHAPE, NUM_CATEGORIES)
    optimizer = tf.keras.optimizers.Adam()
    metrics = ['accuracy']
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizer,
                  metrics=metrics)

history = model.fit(training_data, epochs=epochs,
                    validation_data=validation_data)
testing_data.shuffle(len(testing_data), reshuffle_each_iteration=False)
# I think this ^ is preventing additional shuffles on access
loss, accuracy = model.evaluate(testing_data)
print(f"Accuracy: {(accuracy * 100):.2f}%")
# Prints
# Accuracy: 78.7%
y_hat = model.predict(testing_data)
y_test = np.concatenate([y for x, y in testing_data], axis=0)
c_matrix = confusion_matrix(np.argmax(y_test, axis=-1),
                            np.argmax(y_hat, axis=-1))
print(c_matrix)
# Prints result that does not agree:
# Confusion matrix:
#[[ 72 111 54 15 69]
# [ 82 100 44 16 78]
# [ 64 114 52 21 69]
# [ 71 106 54 21 68]
# [ 79 101 51 25 64]]
# Accuracy calculated from CM = 19.3%
At first I thought TensorFlow was reshuffling testing_data on each access, so I added testing_data.shuffle(len(testing_data), reshuffle_each_iteration=False), but the results are still inconsistent.
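One way to check whether the order really does change between accesses is to read the labels out of the dataset twice and compare them. This is only a diagnostic sketch (it assumes numpy is imported as np and that iterating testing_data yields (image, label) batches, as in the code above):

labels_pass_1 = np.concatenate([np.argmax(y, axis=-1) for _, y in testing_data])
labels_pass_2 = np.concatenate([np.argmax(y, axis=-1) for _, y in testing_data])
# A match rate well below 1.0 means the dataset is reshuffled on each pass,
# so model.predict() and the label extraction would see different orders.
print(np.mean(labels_pass_1 == labels_pass_2))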
I also tried the TF confusion matrix:
y_hat = model.predict(testing_data)
y_test = np.concatenate([y for x, y in testing_data], axis=0)
true_class = tf.argmax(y_test, 1)
predicted_class = tf.argmax(y_hat, 1)
cm = tf.math.confusion_matrix(true_class, predicted_class, NUM_CATEGORIES)
print(cm)
...with a similar result.
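For reference, the "accuracy calculated from CM" figure above is just the trace of the confusion matrix (the correctly classified samples on the diagonal) divided by the total number of samples; a minimal sketch, assuming cm is the eager tensor returned by tf.math.confusion_matrix:

cm_np = cm.numpy()
# Diagonal entries are correct predictions; the full sum is the sample count.
cm_accuracy = np.trace(cm_np) / cm_np.sum()
print(f"Accuracy from CM: {cm_accuracy * 100:.1f}%")  # trace 309 / total 1601 ≈ 19.3% for the matrix printed earlier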
Obviously the predicted labels have to be compared against the correct labels. What am I doing wrong?
I can't find a source for it, but it seems TensorFlow is still shuffling the test data behind the scenes. You can try iterating over the dataset to get the predicted and true classes in the same pass:
predicted_classes = np.array([])
true_classes = np.array([])

for x, y in testing_data:
    predicted_classes = np.concatenate([predicted_classes,
                                        np.argmax(model(x), axis=-1)])
    true_classes = np.concatenate([true_classes,
                                   np.argmax(y.numpy(), axis=-1)])
model(x) is used for faster execution. From the source:
Computation is done in batches. This method is designed for performance in large scale inputs. For small amount of inputs that fit in one batch, directly using __call__ is recommended for faster execution, e.g., model(x).
If that doesn't work, you can try model.predict(x).
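Once predicted_classes and true_classes come from the same pass over the dataset, the accuracy and the confusion matrix computed from them should agree with each other; a short sketch reusing the names from the loop above (NUM_CATEGORIES is taken from the question's code):

predicted_classes = predicted_classes.astype(int)
true_classes = true_classes.astype(int)

# Accuracy and confusion matrix from the same, consistently ordered arrays.
accuracy = np.mean(predicted_classes == true_classes)
print(f"Accuracy: {accuracy * 100:.2f}%")
cm = tf.math.confusion_matrix(true_classes, predicted_classes, NUM_CATEGORIES)
print(cm)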