Shape of tensorflow dataset data in keras (tensorflow 2.0) is wrong after conversion from np array
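When setting up a simple tensorflow 2.0 test following this guide, the input to the keras input layer is wrong, but only after converting the data to a dataset (which claims to have the right shape).
Running the colab notebook from the docs works, of course, but I can't figure out what might be wrong with my setup. Any hints appreciated!
Setting up some fake data in jupyter lab: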
data = np.random.random((1000, 32,))
labels = np.random.random((1000, 10,))
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset.shuffle(buffer_size=1024).batch(32)
>> <BatchDataset shapes: ((None, 32), (None, 10)), types: (tf.float64, tf.float64)>
Building a simplified model with the functional keras API (Sequential makes no difference):
inputs = keras.Input(shape=(32,))
hidden = keras.layers.Dense(64, activation='relu')(inputs)
hidden = keras.layers.Dense(64, activation='relu')(hidden)
output = keras.layers.Dense(10, activation='softmax')(hidden)
model = keras.Model(inputs=inputs, outputs=output)
model.compile(loss='mse',
              optimizer=keras.optimizers.Adam(0.001),
              metrics=['mae'])
Fitting the model with the numpy arrays works as expected:
model.fit(data, labels, epochs=10, batch_size=32)
>> Epoch 1/10
>> 1000/1000 [==============================] - 0s 124us/sample - loss: 0.2472 - mae: 0.4143
[...]
>> Epoch 10/10
>> 1000/1000 [==============================] - 0s 32us/sample - loss: 0.2451 - mae: 0.4132
Fitting with the dataset does not work (while the docs/colab example does):
model.fit(dataset, epochs=10, steps_per_epoch=10)
This raises a ValueError about an input shape that is apparently (1,):
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-9b010c729342> in <module>
----> 1 model.fit(dataset, epochs=10, steps_per_epoch=10)
~/venvs/tf/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
789 workers=0,
790 shuffle=shuffle,
--> 791 initial_epoch=initial_epoch)
792
793 # Case 3: Symbolic tensors or Numpy array-like.
~/venvs/tf/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1513 shuffle=shuffle,
1514 initial_epoch=initial_epoch,
-> 1515 steps_name='steps_per_epoch')
1516
1517 def evaluate_generator(self,
~/venvs/tf/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_generator.py in model_iteration(model, data, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch, mode, batch_size, steps_name, **kwargs)
255
256 is_deferred = not model._is_compiled
--> 257 batch_outs = batch_function(*batch_data)
258 if not isinstance(batch_outs, list):
259 batch_outs = [batch_outs]
~/venvs/tf/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in train_on_batch(self, x, y, sample_weight, class_weight, reset_metrics)
1236 x, y, sample_weights = self._standardize_user_data(
1237 x, y, sample_weight=sample_weight, class_weight=class_weight,
-> 1238 extract_tensors_from_dataset=True)
1239
1240 if self.run_eagerly:
~/venvs/tf/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, check_steps, steps_name, steps, validation_split, shuffle, extract_tensors_from_dataset)
2594 feed_input_shapes,
2595 check_batch_axis=False, # Don't enforce the batch size.
-> 2596 exception_prefix='input')
2597
2598 if y is not None:
~/venvs/tf/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
347 ': expected ' + names[i] + ' to have shape ' +
348 str(shape) + ' but got array with shape ' +
--> 349 str(data_shape))
350 return data
351
ValueError: Error when checking input: expected input_1 to have shape (32,) but got array with shape (1,)
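This is the simplest condensed version I could make after hitting the error with a more complex model, first with the new tensorflow-datasets package. It is so simple that I now have no idea why it does not work (and does not do (almost) the same thing as the numpy array version).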
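You are feeding the model the dataset object that you created on the line
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
This is a dataset of 1000 unbatched pairs, each made of a feature vector of shape (32,) and a label vector of shape (10,).
The next line
dataset.shuffle(buffer_size=1024).batch(32)
produces a new dataset that yields batches with shapes ((32, 32), (32, 10)), reported as ((None, 32), (None, 10)) in your output because the final batch can be smaller. However, you never assign it to the dataset variable. tf.data.Dataset is designed for method chaining: each method returns a new dataset and does not modify the original in place. As a result, model.fit still receives the unbatched examples, and the shape check fails.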
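The difference between the original and the chained dataset can be checked directly. The following is a minimal sketch, assuming a recent TensorFlow 2.x release where `Dataset.element_spec` is available; the names `unbatched` and `batched` are illustrative and do not appear in the question.

```python
import numpy as np
import tensorflow as tf

# Fake data with the same shapes as in the question.
data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

unbatched = tf.data.Dataset.from_tensor_slices((data, labels))

# shuffle() and batch() return new Dataset objects; `unbatched` itself is unchanged.
batched = unbatched.shuffle(buffer_size=1024).batch(32)

# Static shapes of one element of each dataset.
unbatched_shapes = tuple(spec.shape.as_list() for spec in unbatched.element_spec)
batched_shapes = tuple(spec.shape.as_list() for spec in batched.element_spec)
print(unbatched_shapes)  # per-example shapes, no batch axis
print(batched_shapes)    # leading None is the batch axis

# Pull one concrete batch to see the actual shapes.
features, targets = next(iter(batched))
print(features.shape, targets.shape)
```

Printing the shapes of the object that is actually passed to `model.fit` is a quick way to notice that the unbatched dataset is still in use.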
So you can fix it by overwriting the dataset variable:
dataset = dataset.shuffle(buffer_size=1024).batch(32)
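Put together, the corrected setup might look like the sketch below. It reuses the model from the question and makes two changes that are this sketch's own assumptions, not part of the original code: the arrays are cast to float32 to match the layers' default dtype, and `steps_per_epoch` is omitted so that each epoch simply iterates over the whole finite dataset.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Fake data as in the question; the float32 cast is an assumption of this sketch.
data = np.random.random((1000, 32)).astype("float32")
labels = np.random.random((1000, 10)).astype("float32")

dataset = tf.data.Dataset.from_tensor_slices((data, labels))
# Reassign: the chained call returns a new, batched dataset.
dataset = dataset.shuffle(buffer_size=1024).batch(32)

# Same functional model as in the question.
inputs = keras.Input(shape=(32,))
hidden = keras.layers.Dense(64, activation="relu")(inputs)
hidden = keras.layers.Dense(64, activation="relu")(hidden)
output = keras.layers.Dense(10, activation="softmax")(hidden)
model = keras.Model(inputs=inputs, outputs=output)
model.compile(loss="mse",
              optimizer=keras.optimizers.Adam(0.001),
              metrics=["mae"])

# Two short epochs over the batched dataset; verbose=0 keeps the output quiet.
history = model.fit(dataset, epochs=2, verbose=0)
print(history.history["loss"])
```

If `steps_per_epoch` is needed, the dataset may also need a `.repeat()` call so that it can supply `steps_per_epoch * epochs` batches.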