Can't fit scikit-neuralnetwork classifier because of tuple index out of range
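I am trying to get this classifier to work. It is an extension of scikit-learn that depends on Theano.
My goal is to fit a neural network on a list of years and teach it to tell whether a given year is a leap year (I will widen the range later). But when I try to run this example, I get an error.
My code looks like this: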
leapyear.py
import numpy as np
import calendar
from sknn.mlp import Classifier, Layer
from sklearn.cross_validation import train_test_split
# create years in range
years = np.arange(1970, 2001)
pre_is_leap = []
# test if year is a leapyear
for x in years:
    pre_is_leap.append(calendar.isleap(x))
# convert true, false list to 0,1 list
is_leap = np.array(pre_is_leap, dtype=bool).astype(int)
# split
years_train, years_test, is_leap_train, is_leap_test = train_test_split(years, is_leap, test_size=0.33, random_state=42)
# test output
print(len(years_train))
print(len(is_leap_train))
print(years_train)
print(is_leap_train)
#neural network
nn = Classifier(
    layers=[
        Layer("Maxout", units=100, pieces=2),
        Layer("Softmax")],
    learning_rate=0.001,
    n_iter=25)
# fit
nn.fit(years_train, is_leap_train)
#nn.fit(np.array(years_train), np.array(is_leap_train))
requirements.txt
numpy==1.9.2
PyYAML==3.11
scikit-learn==0.16.1
scikit-neuralnetwork==0.3
scipy==0.16.0
Theano==0.7.0
My error output:
20
20
[1986 1975 1983 1981 1992 1971 1972 1995 1973 1991 1996 1988 2000 1990 1977
1980 1984 1998 1989 1976]
[0 0 0 0 1 0 1 0 0 0 1 1 1 0 0 1 1 0 0 1]
/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/utils/validation.py:498: UserWarning: MinMaxScaler assumes floating point values as input, got int64
  "got %s" % (estimator, X.dtype))
/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/preprocessing/data.py:256: DeprecationWarning: Implicitly casting between incompatible kinds. In a future numpy release, this will raise an error. Use casting="unsafe" if this is intentional.
  X *= self.scale_
/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/preprocessing/data.py:257: DeprecationWarning: Implicitly casting between incompatible kinds. In a future numpy release, this will raise an error. Use casting="unsafe" if this is intentional.
  X += self.min_
Traceback (most recent call last):
  File "/home/devnull/master/scikit/leapyear.py", line 47, in <module>
    pipeline.fit(years_train, is_leap_train)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/pipeline.py", line 141, in fit
    self.steps[-1][-1].fit(Xt, y, **fit_params)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 283, in fit
    return super(Classifier, self)._fit(X, yp)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 127, in _fit
    X, y = self._initialize(X, y)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 37, in _initialize
    self._create_specs(X, y)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 67, in _create_specs
    self.unit_counts = [numpy.product(X.shape[1:]) if self.is_convolution else X.shape[1]]
IndexError: tuple index out of range
I looked at the source of mlp.py, but I don't know how to fix it. What do I have to change so that I can fit my network?
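Update (unrelated to the question): I just want to add that I had to convert the years to a binary representation before the neural network would work.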
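The update does not show how the years were encoded, so the following is only a sketch of one possible binary encoding. The helper name years_to_bits and the width of 11 bits (enough for any year below 2048) are assumptions made for illustration, not taken from the original code.

```python
import numpy as np


def years_to_bits(years, n_bits=11):
    """Hypothetical helper: encode each year as a row of n_bits 0/1 features.

    The most significant bit comes first. n_bits=11 is an assumption that
    covers every year below 2048.
    """
    years = np.asarray(years, dtype=np.int64)
    # Bit positions from most significant (n_bits - 1) down to 0.
    shifts = np.arange(n_bits - 1, -1, -1)
    # Broadcasting (N, 1) against (n_bits,) gives an (N, n_bits) array.
    return ((years[:, None] >> shifts) & 1).astype(np.float64)


years = np.arange(1970, 2001)
X = years_to_bits(years)
print(X.shape)  # (31, 11)
print(X[0])     # bits of 1970, most significant first
```

The result is already a two-dimensional array with one row per sample, and its float values avoid the int64 warning that MinMaxScaler prints in the output above.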
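The problem is that the classifier expects the data as a two-dimensional numpy array, where the first axis is the samples and the second axis is the features.
In your case you have only one "feature" (the year), so you need to convert the year data into an Nx1 two-dimensional numpy array. You can do that by adding the following line before the data-splitting statement: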
years = np.array([[year] for year in years])
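As a small illustration of why the shape matters, the sketch below reproduces the IndexError on a one-dimensional array and shows that reshape(-1, 1) gives the same Nx1 array as the list comprehension. The variable names years_2d and same are made up for this example.

```python
import numpy as np

years = np.arange(1970, 2001)
print(years.shape)  # (31,): one axis only, so there is no shape[1]

# The failing line in mlp.py reads X.shape[1]; on a 1-D array that index
# does not exist, which is the error shown in the traceback.
try:
    years.shape[1]
except IndexError as exc:
    print("IndexError:", exc)

# Equivalent alternative to the list comprehension: let numpy infer N.
years_2d = years.reshape(-1, 1)
print(years_2d.shape)  # (31, 1): 31 samples, 1 feature

# Both approaches produce the same array.
same = np.array([[year] for year in years])
assert np.array_equal(years_2d, same)
```

Only the features need the second axis; the labels in is_leap can stay one-dimensional, as they are in the original script.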