Nolearn raises an index error when running a classification, but not with regression
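I have been stuck for a few days on the problem described below. I am following Daniel Nouri's deep learning tutorial: http://danielnouri.org/notes/category/deep-learning/ and I tried to adapt his example to a classification dataset. My problem is that if I treat the dataset as a regression problem it works fine, but if I try to perform classification it fails. I wrote 2 reproducible examples.
1) Regression (works well)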
import lasagne
from sklearn import datasets
import numpy as np
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from sklearn.preprocessing import StandardScaler
iris = datasets.load_iris()
X = iris.data[iris.target<2]  # keep only the samples of the first two classes
Y = iris.target[iris.target<2]
stdscaler = StandardScaler(copy=True, with_mean=True, with_std=True)
X = stdscaler.fit_transform(X).astype(np.float32)
y = np.asmatrix((Y-0.5)*2).T.astype(np.float32)
print X.shape, type(X)
print y.shape, type(y)
net1 = NeuralNet(
    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(None, 4),  # 4 input features per sample
    hidden_num_units=10,  # number of units in hidden layer
    output_nonlinearity=None,  # output layer uses identity function
    output_num_units=1,  # 1 target value
    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    regression=True,  # flag to indicate we're dealing with regression problem
    max_epochs=400,  # we want to train this many epochs
    verbose=1,
    )
net1.fit(X, y)
2) Classification (raises a matrix dimensionality error, pasted below)
import lasagne
from sklearn import datasets
import numpy as np
from lasagne import layers
from lasagne.nonlinearities import softmax
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from sklearn.preprocessing import StandardScaler
iris = datasets.load_iris()
X = iris.data[iris.target<2]  # keep only the samples of the first two classes
Y = iris.target[iris.target<2]
stdscaler = StandardScaler(copy=True, with_mean=True, with_std=True)
X = stdscaler.fit_transform(X).astype(np.float32)
y = np.asmatrix((Y-0.5)*2).T.astype(np.int32)
print X.shape, type(X)
print y.shape, type(y)
net1 = NeuralNet(
    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(None, 4),  # 4 input features per sample
    hidden_num_units=10,  # number of units in hidden layer
    output_nonlinearity=softmax,  # output layer uses softmax
    output_num_units=1,  # 1 target value
    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    regression=False,  # flag to indicate we're dealing with classification problem
    max_epochs=400,  # we want to train this many epochs
    verbose=1,
    )
net1.fit(X, y)
This is the failing output I get with code 2:
(100, 4) <type 'numpy.ndarray'>
(100, 1) <type 'numpy.ndarray'>
input (None, 4) produces 4 outputs
hidden (None, 10) produces 10 outputs
output (None, 1) produces 1 outputs
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-13-184a45e5abaa> in <module>()
40 )
41
---> 42 net1.fit(X, y)
/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/nolearn/lasagne/base.pyc in fit(self, X, y)
291
292 try:
--> 293 self.train_loop(X, y)
294 except KeyboardInterrupt:
295 pass
/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/nolearn/lasagne/base.pyc in train_loop(self, X, y)
298 def train_loop(self, X, y):
299 X_train, X_valid, y_train, y_valid = self.train_test_split(
--> 300 X, y, self.eval_size)
301
302 on_epoch_finished = self.on_epoch_finished
/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/nolearn/lasagne/base.pyc in train_test_split(self, X, y, eval_size)
399 kf = KFold(y.shape[0], round(1. / eval_size))
400 else:
--> 401 kf = StratifiedKFold(y, round(1. / eval_size))
402
403 train_indices, valid_indices = next(iter(kf))
/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/sklearn/cross_validation.pyc in __init__(self, y, n_folds, shuffle, random_state)
531 for test_fold_idx, per_label_splits in enumerate(zip(*per_label_cvs)):
532 for label, (_, test_split) in zip(unique_labels, per_label_splits):
--> 533 label_test_folds = test_folds[y == label]
534 # the test split can be too big because we used
535 # KFold(max(c, self.n_folds), self.n_folds) instead of
IndexError: too many indices for array
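What is going on here? Am I doing something wrong? I think I have tried everything, but I cannot figure out what is happening.
Note that I updated Lasagne and its dependencies just today using the command: pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requirements.txt
Thanks in advance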
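EDIT
I got it working by making the following changes, but I still have some doubts:
I defined Y as a one-dimensional vector of 0/1 values: y = Y.astype(np.int32)
I had to change the parameter output_num_units=1 to output_num_units=2. I am not sure I understand this: since I am working on a binary classification problem, I thought this multilayer perceptron should have only 1 output neuron, not 2... Am I wrong?
I also tried to change the cost function to ROC-AUC. I know there is a parameter called objective_loss_function, which by default is defined as objective_loss_function=lasagne.objectives.categorical_crossentropy, but... how can I use ROC AUC as the cost function instead of categorical cross-entropy?
Thanks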
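In nolearn, if you do classification, output_num_units is the number of classes you have. Although two-class classification can be implemented with a single output unit, nolearn does not special-case it that way; see, for example, [1]:
if not self.regression:
    predict = predict_proba.argmax(axis=1)
Note that the prediction is always the argmax, no matter how many classes you have (which means two-class classification has two outputs, not one).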
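To see why a single output unit cannot work with that argmax, here is a minimal NumPy sketch. It is not nolearn code: the softmax helper and the sample numbers are made up for illustration. With one unit, softmax normalises every row to 1.0, so argmax over axis 1 can only ever return 0; with two units the prediction actually depends on the input.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax; subtracting the row max keeps exp() numerically stable.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical pre-activation outputs for 3 samples.
one_unit = np.array([[-2.0], [0.3], [5.0]])                    # output_num_units=1
two_units = np.array([[2.0, -1.0], [0.1, 0.4], [-3.0, 1.0]])   # output_num_units=2

proba_one = softmax(one_unit)    # every row is [1.0]
proba_two = softmax(two_units)   # each row sums to 1 over the two classes

pred_one = proba_one.argmax(axis=1)
pred_two = proba_two.argmax(axis=1)

print(pred_one)  # [0 0 0]
print(pred_two)  # [0 1 1]
```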
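So your change is correct: output_num_units should always be the number of classes you have, even when you have only two, and Y should have shape (num_samples) or (num_samples, 1) and contain integer values representing the categories, as opposed to, for example, a vector with one entry per category, with shape (num_samples, num_categories).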
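As a rough illustration of these target shapes, here is a plain NumPy sketch with made-up labels; test_folds only mimics the 1-D array that shows up in the scikit-learn traceback from the question. A 2-D boolean mask cannot index a 1-D array, which matches the IndexError reported there, while a flat vector of integer labels can. In the question, flattening y is what made the fit work, so the 1-D form looks like the safer of the two shapes. The last lines show how one-hot rows could be turned back into integer labels.

```python
import numpy as np

Y = np.array([0, 0, 1, 1, 0, 1])            # hypothetical 0/1 class labels
y_col = Y.reshape(-1, 1).astype(np.int32)   # shape (6, 1), like the failing example
y_flat = Y.astype(np.int32)                 # shape (6,), like the working fix

test_folds = np.zeros(len(Y), dtype=int)    # 1-D, one slot per sample

# Indexing a 1-D array with a 2-D boolean mask raises IndexError.
try:
    test_folds[y_col == 1]
    raised = False
except IndexError:
    raised = True

# The same selection works once the labels are 1-D.
selected = test_folds[y_flat == 1]

# One-hot targets of shape (num_samples, num_categories) can be
# converted back to integer class labels with argmax.
one_hot = np.eye(2, dtype=np.int32)[y_flat]
labels_from_one_hot = one_hot.argmax(axis=1).astype(np.int32)

print(raised, selected.shape, labels_from_one_hot)
```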
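To answer your other question: Lasagne does not seem to have a ROC-AUC objective, so you would need to implement it yourself. Note that you cannot use, for example, the implementation from scikit-learn, because Lasagne requires the objective function to take theano tensors as arguments, not lists or ndarrays. To see how an objective function is implemented in Lasagne, you can look at the existing objective functions [2]. Many of them refer to functions inside theano, whose implementations you can see in [3] (the link scrolls straight to binary_crossentropy, which is a good example of an objective function).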
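For orientation, this is roughly the formula that binary_crossentropy computes, written here with NumPy only so that it can be run without theano. It is a sketch of the arithmetic, not Lasagne's or theano's code, and the sample values are made up; an objective passed to Lasagne would have to express the same thing with theano tensor operations.

```python
import numpy as np

def binary_crossentropy(output, target):
    # Elementwise loss: -(t * log(p) + (1 - t) * log(1 - p)),
    # where p is the predicted probability and t the 0/1 target.
    return -(target * np.log(output) + (1.0 - target) * np.log(1.0 - output))

p = np.array([0.9, 0.2, 0.5])   # hypothetical predicted probabilities
t = np.array([1.0, 0.0, 1.0])   # hypothetical targets

losses = binary_crossentropy(p, t)
mean_loss = losses.mean()       # a scalar that an optimizer could minimise
print(losses, mean_loss)
```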
[1] https://github.com/dnouri/nolearn/blob/master/nolearn/lasagne/base.py#L414
[2] https://github.com/Lasagne/Lasagne/blob/master/lasagne/objectives.py
[3] https://github.com/Theano/Theano/blob/master/theano/tensor/nnet/nnet.py#L1809