Custom loss with weight arrays of batch size in TensorFlow/Keras
I am creating a custom loss function: an MAE(y_true, y_pred) weighted by two arrays a and b, where all four arrays are of the same size (10000 samples/timesteps).
def custom_loss(y_true, y_pred, a, b):
    mae = K.abs(y_true - y_pred)
    loss = mae * a * b
    return loss
Question: how do I feed a and b into the function? Both should be split and shuffled in the same way as y_true and y_pred.
So far I have been using an LSTM trained on data X of shape (samples x timesteps x variables). Here I tried tf's add_loss function, which led to an error about mismatching data shapes when passing a and b as additional input layers:
# LSTM
input_layer = Input(shape=input_shape)
x = LSTM(20, activation='relu', return_sequences=True)(input_layer)
out = LSTM(1, activation='linear', return_sequences=False)(x)
layer_a = Input(shape=(10000,))
layer_b = Input(shape=(10000,))
model = Model(inputs=[input_layer, layer_a, layer_b], outputs=out)
model.add_loss(custom_loss(input_layer, out, layer_a, layer_b))
model.compile(loss=None, optimizer=Adam(0.01))
# X = data of shape (20 variables x 10000 timesteps); y, a, b = data of shape (10000 timesteps,)
model.fit(x=[X, a, b], y=y, batch_size=1, shuffle=True)
How do I do this correctly?
As you already found, you have to use add_loss. Just remember to pass the loss all the variables it needs (the true values, the predictions, and the extra tensors) in the correct format.
import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import backend as K

n_sample = 100
timesteps = 30
features = 5
X = np.random.uniform(0,1, (n_sample,timesteps,features))
y = np.random.uniform(0,1, n_sample)
a = np.random.uniform(0,1, n_sample)
b = np.random.uniform(0,1, n_sample)
def custom_loss(y_true, y_pred, a, b):
    mae = K.abs(y_true - y_pred)
    loss = mae * a * b
    return loss
input_layer = Input(shape=(timesteps, features))
x = LSTM(20, activation='relu', return_sequences=True)(input_layer)
out = LSTM(1, activation='linear')(x)

# extra inputs: the ground truth and the two weight arrays, one scalar per sample
layer_a = Input(shape=(1,))
layer_b = Input(shape=(1,))
target = Input(shape=(1,))

model = Model(inputs=[target, input_layer, layer_a, layer_b], outputs=out)
model.add_loss(custom_loss(target, out, layer_a, layer_b))
model.compile(loss=None, optimizer=Adam(0.01))
model.fit(x=[y, X, a, b], y=None, shuffle=True, epochs=3)
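Since compile receives loss=None and the loss is attached with add_loss, the ground truth y is fed in through the target Input (as part of x) rather than through the y argument of fit; this is what lets Keras split and shuffle y, a and b together with X.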
Use the model in inference mode (remove y as an input, and a and b as well if they are not needed):
final_model = Model(model.inputs[1], model.output)
final_model.predict(X)
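Here model.inputs[1] picks out input_layer (index 0 is the target input), so the stripped-down model needs only X at prediction time.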
If you only need a and b to compute the loss, I would instead write a wrapper around your custom loss function and smuggle the extra arrays in through the labels, stacked column-wise with y. Something like:
n_sample = 100
timesteps = 30
features = 5
X = np.random.uniform(0,1, (n_sample,timesteps,features))
y = np.random.uniform(0,1, n_sample)
a = np.random.uniform(0,1, n_sample)
b = np.random.uniform(0,1, n_sample)
def custom_loss_wrapper(y_true, y_pred):
    # y_true carries [y, a, b] stacked column-wise, shape (batch, 3)
    def custom_loss(y_true, y_pred, a, b):
        mae = K.abs(y_true - y_pred)
        loss = mae * a * b
        return loss
    return custom_loss(y_true[:, 0:1], y_pred, y_true[:, 1:2], y_true[:, 2:3])
input_layer = Input(shape=(timesteps, features))
x = LSTM(20, activation='relu', return_sequences=True)(input_layer)
out = LSTM(1, activation='linear')(x)

model = Model(inputs=input_layer, outputs=out)
model.compile(loss=custom_loss_wrapper, optimizer=Adam(0.01))
# stack y, a, b so they are split and shuffled together with X
model.fit(x=X, y=np.stack([y, a, b], axis=-1), shuffle=True, epochs=3)
This simplifies the network architecture and gets rid of the layer_a and layer_b inputs, which are unnecessary at inference time.
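As a side note: since the weights only multiply the per-sample MAE, the same effect can also be had without any custom loss at all, via Keras's built-in sample_weight argument, which fit splits and shuffles together with X and y. A minimal sketch, assuming a and b as defined above:

# Assumption: the weighting collapses to one scalar weight per sample, w = a * b.
w = a * b
model.compile(loss='mae', optimizer=Adam(0.01))
model.fit(X, y, sample_weight=w, shuffle=True, epochs=3)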