Error training an MLP using Chainer
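I am trying to train and test a simple multilayer perceptron, as in the first Chainer tutorial, but with my own dataset instead of MNIST. This is the code I am using (mostly from the tutorial):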
import chainer.functions as F
import chainer.links as L
from chainer import Chain, iterators, optimizers, training
from chainer.datasets import tuple_dataset
from chainer.training import extensions

class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_units)
            self.l2 = L.Linear(None, n_units)
            self.l3 = L.Linear(None, n_out)

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

X, X_test, y, y_test, xHeaders, yHeaders = load_train_test_data('xHeuristicData.csv', 'yHeuristicData.csv')
print 'dataset shape X:', X.shape, ' y:', y.shape

model = MLP(100, 1)
optimizer = optimizers.SGD()
optimizer.setup(model)

train = tuple_dataset.TupleDataset(X, y)
test = tuple_dataset.TupleDataset(X_test, y_test)
train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)
test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)

updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (10, 'epoch'), out='result')
trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
trainer.run()

print 'Predicted value for a test example'
print model(X_test[0])
Instead of training and printing the predicted value, I get the following error at "trainer.run()":
dataset shape X: (1003, 116)  y: (1003,)
Exception in main training loop: __call__() takes exactly 2 arguments (3 given)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/trainer.py", line 299, in run
    update()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 223, in update
    self.update_core()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 234, in update_core
    optimizer.update(loss_func, *in_arrays)
  File "/usr/local/lib/python2.7/dist-packages/chainer/optimizer.py", line 534, in update
    loss = lossfun(*args, **kwds)
Will finalize trainer extensions and updater before reraising the exception.
Traceback (most recent call last):
  File "trainHeuristicChainer.py", line 76, in <module>
    trainer.run()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/trainer.py", line 313, in run
    six.reraise(*sys.exc_info())
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/trainer.py", line 299, in run
    update()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 223, in update
    self.update_core()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 234, in update_core
    optimizer.update(loss_func, *in_arrays)
  File "/usr/local/lib/python2.7/dist-packages/chainer/optimizer.py", line 534, in update
    loss = lossfun(*args, **kwds)
TypeError: __call__() takes exactly 2 arguments (3 given)
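I don't know how to deal with this error. I have successfully trained similar networks with other frameworks, but I am interested in Chainer because it is compatible with PyPy.
A tgz with the files is available here: https://mega.nz/#!wwsBiSwY!g72pC5ZgekeMiVr-UODJOqQfQZZU3lCqm9Er2jH4UD8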
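You are sending a tuple of (X, y) to the MLP, whereas the implemented __call__ accepts only x. The updater unpacks each mini-batch and passes both arrays to the model, which is why the traceback ends in lossfun(*args, **kwds).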
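To see the mismatch in isolation, here is a minimal sketch that needs no Chainer at all. ForwardOnly and update are hypothetical stand-ins (not Chainer classes) for the question's MLP and for the call shown in the traceback; the batch values are made up.

```python
class ForwardOnly(object):
    """Hypothetical stand-in for the question's MLP: __call__ accepts only x."""
    def __call__(self, x):
        return x


def update(lossfun, *in_arrays):
    # Mirrors the traceback line `loss = lossfun(*args, **kwds)`:
    # every array in the mini-batch becomes a positional argument.
    return lossfun(*in_arrays)


# A mini-batch built from (X, y) pairs is unpacked into two arrays.
batch = ([1.0, 2.0], [0.5])

try:
    update(ForwardOnly(), *batch)
    error_message = None
except TypeError as exc:
    # self + x + y = 3 arguments, but __call__ only declares 2.
    error_message = str(exc)

print(error_message)
```

The count in the message includes self, which is why two arrays show up as "3 given".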
You can modify the implementation to:
class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_units)
            self.l2 = L.Linear(None, n_units)
            self.l3 = L.Linear(None, n_out)

    def __call__(self, x, y):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        predict = self.l3(h2)
        # predict has shape (batch, 1) while y has shape (batch,), so align them
        t = F.reshape(y, predict.shape)
        # the optimizer needs a scalar loss, so average the squared error
        # over the mini-batch
        loss = F.mean_squared_error(predict, t)
        # or you can write it on your own as follows
        # loss = F.sum(F.square(predict - t))
        return loss
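Chainer may differ from other frameworks here: by default, the standard updater assumes that __call__ is the loss function, so calling model(X, y) returns the loss for the current mini-batch. This is why the Chainer tutorial introduces a separate Classifier class that computes the loss and keeps the MLP simple. Classifier makes sense for MNIST, but it does not suit your task, so you need to implement the loss function yourself.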
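The wrapper idea behind Classifier can be sketched without Chainer, which may make the division of labour clearer. Predictor and LossWrapper below are hypothetical names, and the one-weight linear model is only a placeholder for the MLP; this illustrates the pattern and is not Chainer's implementation.

```python
class Predictor(object):
    """Hypothetical stand-in for the MLP: maps inputs to predictions."""
    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def __call__(self, x):
        return [self.weight * xi + self.bias for xi in x]


class LossWrapper(object):
    """Plays the role Classifier plays in the tutorial: it owns the loss."""
    def __init__(self, predictor):
        self.predictor = predictor

    def __call__(self, x, t):
        y = self.predictor(x)
        # Mean squared error over the mini-batch.
        return sum((yi - ti) ** 2 for yi, ti in zip(y, t)) / float(len(t))


predictor = Predictor(weight=2.0, bias=1.0)
wrapped = LossWrapper(predictor)

loss = wrapped([0.0, 1.0], [1.0, 2.0])  # training-style call: (x, t) -> loss
prediction = predictor([3.0])           # prediction-style call: x -> y
```

With this split, the wrapper is what the optimizer trains against, while the inner predictor keeps its single-argument __call__ for prediction.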
Once training is finished, you can save just the model instance (perhaps by adding the snapshot_object extension to the trainer).
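To use the saved model, for example at test time, you have to write another method in the class, perhaps named test, with the same code as your original __call__: it takes only X as input, so no y is needed.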
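One way to avoid copying the forward pass into both methods is to share it through a helper. The sketch below is framework-free and purely illustrative: TwoMethodModel, _forward and the single-weight model are assumptions standing in for the MLP layers.

```python
class TwoMethodModel(object):
    """Hypothetical model with a loss-returning __call__ and a test method."""
    def __init__(self, weight):
        self.weight = weight

    def _forward(self, x):
        # Shared forward pass; in the MLP this would be the three layers.
        return [self.weight * xi for xi in x]

    def __call__(self, x, y):
        # Training entry point: returns the mean squared error.
        predict = self._forward(x)
        return sum((p - t) ** 2 for p, t in zip(predict, y)) / float(len(y))

    def test(self, x):
        # Prediction entry point: no targets required.
        return self._forward(x)


model = TwoMethodModel(weight=0.5)
predictions = model.test([2.0, 4.0])
training_loss = model([2.0, 4.0], [1.0, 2.0])
```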
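Alternatively, if you would rather not add any extra methods to the MLP class and want to keep it pure, you need to write your own updater and compute the loss there, which is more natural. Inheriting from the standard one is fairly simple; it can be written like this: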
import chainer
import chainer.functions as F

class MyUpdater(chainer.training.StandardUpdater):
    def __init__(self, data_iter, model, opt, device=-1):
        super(MyUpdater, self).__init__(data_iter, opt, device=device)
        self.mlp = model

    def update_core(self):
        batch = self.get_iterator('main').next()
        x, y = self.converter(batch, self.device)
        predict = self.mlp(x)
        # scalar loss over the mini-batch; y is reshaped to match predict
        loss = F.mean_squared_error(predict, F.reshape(y, predict.shape))
        self.mlp.cleargrads()
        loss.backward()
        # apply the computed gradients to the parameters
        self.get_optimizer('main').update()

updater = MyUpdater(train_iter, model, optimizer)