Why does tape.gradient return all None in my Sequential model?
I need to compute the gradient of this model:
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(40, activation='relu', input_dim=12))
model.add(Dense(60, activation='relu'))
model.add(Dense(units=3, activation='softmax'))
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss="mse", optimizer=opt)

model_q = Sequential()
model_q.add(Dense(40, activation='relu', input_dim=15))
model_q.add(Dense(60, activation='relu'))
model_q.add(Dense(units=1, activation='linear'))
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model_q.compile(loss="mse", optimizer=opt)

x = np.random.random(12)
x2 = model.predict(x.reshape(-1, 12))
with tf.GradientTape() as tape:
    value = model_q(tf.convert_to_tensor(np.append(x, x2).reshape(-1, 15)))
    loss = -tf.reduce_mean(value)
grad = tape.gradient(loss, model.trainable_variables)
opt.apply_gradients(zip(grad, model.trainable_variables))
but grad comes back as all None, so opt cannot apply the gradients to the model. Why does this happen? I know it is a strange loss, but it is what I want to compute.
Your model is not being recorded by the tape: model.predict runs outside the GradientTape context and returns a NumPy array, so the tape has no differentiable path from loss back to model.trainable_variables. If you want the gradients, you have to put the computation inside the tape's context:
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(40, activation='relu', input_dim=12))
model.add(Dense(60, activation='relu'))
model.add(Dense(units=3, activation='softmax'))

model_q = Sequential()
model_q.add(Dense(40, activation='relu', input_dim=15))
model_q.add(Dense(60, activation='relu'))
model_q.add(Dense(units=1, activation='linear'))

opt = tf.keras.optimizers.Adam(learning_rate=0.001)

x = np.random.random((1, 12)).astype(np.float32)  # float32 so it concatenates with the model's output
with tf.GradientTape() as tape:
    x2 = model(x)                             # forward pass of model is now recorded by the tape
    value = model_q(tf.concat([x, x2], -1))
    loss = -tf.reduce_mean(value)
grad = tape.gradient(loss, model.trainable_variables)
opt.apply_gradients(zip(grad, model.trainable_variables))
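
With the forward pass inside the tape, tape.gradient returns real tensors. Below is a minimal sketch of a sanity check and a reusable training step; the train_step function and the assert are my own additions, not part of the original code, and it assumes model, model_q, and opt are defined as above:

# Sanity check: every gradient should now be a tensor, not None.
assert all(g is not None for g in grad)
print(grad[0].shape)  # (12, 40), the kernel of the first Dense layer

# Optional (my own sketch): package the update as a compiled training step.
@tf.function
def train_step(x):
    with tf.GradientTape() as tape:
        x2 = model(x)
        value = model_q(tf.concat([x, x2], -1))
        loss = -tf.reduce_mean(value)
    grad = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grad, model.trainable_variables))
    return loss

for _ in range(10):
    loss = train_step(np.random.random((1, 12)).astype(np.float32))

Note that grad and apply_gradients can live outside the with block; only the operations you want differentiated (here, both models' forward passes and the loss) must run inside it.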