Binary Cross-Entropy - Works on Keras But Not on Lasagne?
I am using the same convolutional neural network structure on Keras and Lasagne. Right now I have even switched to a simpler network to see whether it changes anything, but it doesn't.
On Keras it works fine: it outputs values between 0 and 1 with good accuracy. On Lasagne, the values mostly come out wrong; the output looks practically the same regardless of the input.
In short: it outputs and trains well on Keras, but not in my Lasagne version.
The structure on Lasagne:
import theano
import theano.tensor as T
import lasagne

def structure(w=5, h=5):
    try:
        input_var = T.tensor4('inputs')
        target_var = T.bmatrix('targets')
        network = lasagne.layers.InputLayer(shape=(None, 1, h, w), input_var=input_var)
        network = lasagne.layers.Conv2DLayer(
            network, num_filters=64, filter_size=(3, 3), stride=1, pad=0,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())
        network = lasagne.layers.Conv2DLayer(
            network, num_filters=64, filter_size=(3, 3), stride=1, pad=0,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())
        network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2), stride=None, pad=(0, 0), ignore_border=True)
        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=256,
            nonlinearity=lasagne.nonlinearities.rectify, W=lasagne.init.GlorotUniform())
        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=1,
            nonlinearity=lasagne.nonlinearities.sigmoid)
        print "...Output", lasagne.layers.get_output_shape(network)
        return network, input_var, target_var
    except Exception as inst:
        print ("Failure to Build NN !", inst.message, (type(inst)), (inst.args), (inst))
        return None
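As a quick sanity check (my addition, not part of the original question), the network can be built with the real 17x17 patch size used by the data, and the print inside structure() confirms the output shape:

# Build with the actual patch size; structure() prints the output shape.
network, input_var, target_var = structure(w=17, h=17)
# Expected: "...Output (None, 1)" - one sigmoid unit per sample.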
On Keras:
def getModel(w, h):
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation, Flatten
    from keras.layers import Convolution2D, MaxPooling2D
    from keras.optimizers import SGD

    model = Sequential()
    model.add(Convolution2D(64, 3, 3, border_mode='valid', input_shape=(1, h, w)))
    model.add(Activation('relu'))
    model.add(Convolution2D(64, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Convolution2D(128, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(Convolution2D(128, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    #
    model.add(Flatten())
    #
    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))
    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))
    #
    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='binary_crossentropy', optimizer='sgd')
    return model
And training on Keras:
model.fit(x, y, batch_size=512, nb_epoch=500, verbose=2, validation_split=0.2, shuffle=True, show_accuracy=True)
And training and predicting on Lasagne.
To train:
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.binary_crossentropy(prediction, target_var)
loss = loss.mean()
params = lasagne.layers.get_all_params(network, trainable=True)
# updates = lasagne.updates.sgd(loss, params, learning_rate=learning_rate)
updates = lasagne.updates.nesterov_momentum(loss_or_grads=loss, params=params, learning_rate=learning_rate, momentum=momentum_rho)
#
# deterministic=True turns dropout off for evaluation.
test_prediction = lasagne.layers.get_output(network, deterministic=True)
test_loss = lasagne.objectives.binary_crossentropy(test_prediction, target_var)
test_loss = test_loss.mean()
# Accuracy
test_acc = lasagne.objectives.binary_accuracy(test_prediction, target_var)
test_acc = test_acc.mean()
train_fn = theano.function([input_var, target_var], loss, updates=updates)
val_fn = theano.function([input_var, target_var], [test_loss, test_acc])
I am using these iterators. I hope they are not the cause of the problem... or maybe they are?
def iterate_minibatches_getOutput(self, inputs, batchsize):
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt]

def iterate_minibatches(self, inputs, targets, batchsize, shuffle=False):
    assert len(inputs) == len(targets)
    if shuffle:
        indices = np.arange(len(inputs))
        np.random.shuffle(indices)
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]
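For context, the epoch log shown further below was produced by a training loop along these lines. This is a minimal sketch of my own, not the asker's exact code; it assumes num_epochs, batch_size, x_train/y_train, and x_val/y_val exist, and drops the self. prefix on the iterator:

import time

for epoch in range(num_epochs):
    start_time = time.time()

    # Full pass over the training data (dropout active).
    train_err = 0.0
    train_batches = 0
    for inputs, targets in iterate_minibatches(x_train, y_train, batch_size, shuffle=True):
        train_err += train_fn(inputs, targets)
        train_batches += 1

    # Full pass over the validation data (deterministic, dropout off).
    val_err = 0.0
    val_acc = 0.0
    val_batches = 0
    for inputs, targets in iterate_minibatches(x_val, y_val, batch_size, shuffle=False):
        err, acc = val_fn(inputs, targets)
        val_err += err
        val_acc += acc
        val_batches += 1

    print "Epoch %d of %d took %.4fs" % (epoch + 1, num_epochs, time.time() - start_time)
    print "Training loss: %.8f" % (train_err / train_batches)
    print "Validation loss: %.8f" % (val_err / val_batches)
    print "Validation accuracy: %.8f %%" % (val_acc / val_batches * 100)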
To predict:
test_prediction = lasagne.layers.get_output(self.network, deterministic=True)
predict_fn = theano.function([self.input_var], test_prediction)

index = 0
for batch in self.iterate_minibatches_getOutput(inputs=submission_feature_x, batchsize=self.batch_size):
    inputs = batch
    y = predict_fn(inputs)
    start = index * self.batch_size
    end = (index + 1) * self.batch_size
    predictions[start:end] = y
    index += 1

print "debug -->", predictions[0:10]
print "debug max ---->", np.max(predictions)
print "debug min ----->", np.min(predictions)
This prints:
debug --> [[ 0.3252553 ]
[ 0.3252553 ]
[ 0.3252553 ]
[ 0.3252553 ]
[ 0.3252553 ]
[ 0.3252553 ]
[ 0.3252553 ]
[ 0.3252553 ]
[ 0.3252553 ]
[ 0.32534513]]
debug max ----> 1.0
debug min -----> 0.0
The results are completely wrong.
What puzzles me is that the very same setup outputs fine on Keras.
Also, the validation accuracy never changes:
Epoch 2 of 30 took 9.5846s
Training loss: 0.22714619
Validation loss: 0.17278196
Validation accuracy: 95.85454545 %
Epoch 3 of 30 took 9.6437s
Training loss: 0.22646923
Validation loss: 0.17249792
Validation accuracy: 95.85454545 %
Epoch 4 of 30 took 9.6464s
Training loss: 0.22563262
Validation loss: 0.17235395
Validation accuracy: 95.85454545 %
Epoch 5 of 30 took 10.5069s
Training loss: 0.22464556
Validation loss: 0.17226825
Validation accuracy: 95.85454545 %
...
Please help!
What am I doing wrong?
These are the shapes being used:
x_train.shape (102746, 1, 17, 17)
y_train.shape (102746, 1)
x_val.shape (11416, 1, 17, 17)
y_val.shape (11416, 1)
The problem is:
target_var = T.bmatrix('targets')
It should be:
target_var = T.fmatrix('targets')
Also, the learning rate was too low.
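T.bmatrix declares an int8 matrix, so the symbolic targets no longer match the float32 sigmoid output and the float label arrays; T.fmatrix declares a float32 matrix that does. A minimal sketch of the corrected setup (the astype casts are my addition, assuming the labels are stored as 0/1):

target_var = T.fmatrix('targets')       # float32 targets, matching the sigmoid output
y_train = y_train.astype(np.float32)    # hypothetical casts: make sure the label
y_val = y_val.astype(np.float32)        # arrays really are float32 before training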
And in the Keras script there is another mistake:
sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer='sgd')
It should be:
sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)
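For reference: when compile() receives the string 'sgd', Keras instantiates a fresh SGD optimizer with its default settings (lr=0.01 and no momentum in Keras 1.x), so the learning rate, decay, and Nesterov momentum configured on the sgd object are silently ignored. Passing the object itself applies them.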