Training a custom tf.keras.Model with model.fit() fails with InvalidArgumentError

I'm trying to get the hang of the Keras API in TensorFlow 2.0 by implementing a simple model that finds the 4 coefficients of a 3rd-degree polynomial. This code works fine:

import tensorflow as tf
import numpy as np

# Simple model that fits a 3rd degree polynomial

def f(x, w):
    return w[0] * x**3 + w[1] * x**2 + w[2] * x + w[3]

coeffs = [-4, 2, -1, 5]
x_train = np.linspace(0, 5, 100).astype(np.float32)
y_train = f(x_train, coeffs).astype(np.float32)

# Layer that computes polynomial w0*x^3 + w1*x^2 + w2*x + w3
class MyLayer(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def build(self, input_shape):
        self.w = self.add_weight(shape=(4,), initializer="glorot_normal")
        super().build(input_shape)

    def call(self, X):
        return self.w[0] * X**3 + self.w[1] * X**2 + self.w[2] * X + self.w[3]

x = tf.keras.layers.Input(shape=(1,), dtype=tf.float32)
yhat = MyLayer()(x)
model = tf.keras.Model(inputs=x, outputs=yhat)
opt = tf.keras.optimizers.Adam(0.2)
model.compile(loss='mse', optimizer=opt)
model.fit(x_train, y_train, epochs=10, batch_size=100, steps_per_epoch=500)

However, if I try to subclass tf.keras.Model with a custom class, it fails:

class MyModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.mylayer = MyLayer()

    def call(self, X):
        return self.mylayer(X)

model = MyModel()
opt = tf.keras.optimizers.Adam(0.2)
model.compile(loss='mse', optimizer=opt)
model.fit(x_train, y_train, epochs=10, batch_size=100, steps_per_epoch=500)

It raises this error:

InvalidArgumentError:  data[0].shape = [2] does not start with indices[0].shape = [1]
     [[node Adam/gradients_4/loss_4/output_1_loss/Mean_grad/DynamicStitch (defined at <ipython-input-32-86f7f8f08b63>:36) ]] [Op:__inference_keras_scratch_graph_2486]

Function call stack:
keras_scratch_graph

The problem seems to be the shapes of x_train and y_train. Both are 1-D arrays of shape (100,), but the model expects (n_samples, 1). Try this:

model.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1), epochs=10, 
          batch_size=100, steps_per_epoch=500)
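
To see what the reshape changes, here is a minimal NumPy-only sketch that rebuilds the training arrays from the question and prints their shapes before and after. The names x_col, y_col and x_alt are mine, introduced only for illustration; no TensorFlow is needed to run it.

```python
import numpy as np

# Same polynomial and training data as in the question.
def f(x, w):
    return w[0] * x**3 + w[1] * x**2 + w[2] * x + w[3]

coeffs = [-4, 2, -1, 5]
x_train = np.linspace(0, 5, 100).astype(np.float32)
y_train = f(x_train, coeffs).astype(np.float32)

# Both arrays are 1-D: one axis of length 100, no feature axis.
print(x_train.shape, y_train.shape)  # (100,) (100,)

# reshape(-1, 1) adds a trailing feature axis; -1 lets NumPy infer n_samples.
x_col = x_train.reshape(-1, 1)
y_col = y_train.reshape(-1, 1)
print(x_col.shape, y_col.shape)  # (100, 1) (100, 1)

# Equivalent spelling using np.newaxis (hypothetical name x_alt).
x_alt = x_train[:, np.newaxis]
```

The values are unchanged; only the shape differs, so each sample becomes a row with a single feature. These (n_samples, 1) arrays are what the model.fit() call above receives.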