Python/Keras/Theano - Index out of bounds
I am new to Keras and I'm having some trouble with shapes, especially when it comes to RNNs and LSTMs.
I'm running this code:
model=Sequential()
model.add(Embedding(input_dim=col,output_dim=70))
model.add(SimpleRNN(init='uniform',output_dim=30))
model.add(Dropout(0.5))
model.add(Dense(1))
model.compile(loss="mse", optimizer="sgd")
model.fit(X=predictor_train, y=target_train, nb_epoch=5, batch_size=1,show_accuracy=True)
I get this error:
IndexError: index 143 is out of bounds for size 80
Apply node that caused the error: AdvancedSubtensor1(<TensorType(float32, matrix)>, Flatten{1}.0)
Inputs types: [TensorType(float32, matrix), TensorType(int32, vector)]
Inputs shapes: [(80, 70), (80,)]
Inputs strides: [(280, 4), (4,)]
Inputs values: ['not shown', 'not shown']
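I don't understand where this "index 143" comes from or how to fix it.
Could anyone enlighten me?
More information below.
-- EDIT --
This "index 143" is actually different every time I run the code. The numbers don't follow any obvious logic; the only thing I have noticed, coincidence or not, is that the smallest number that has come up is 80 (I have run the code more than 20 times).
Additional information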
About predictor_train (X)
type: 'numpy.ndarray'
shape: (119,80)
dtype: float64
About target_train (Y)
type: class 'pandas.core.series.Series'
shape: (119,)
dtype: float64
Date
2004-10-01 0.003701
2005-05-01 0.001715
2005-06-01 0.002031
2005-07-01 0.002818
...
2015-05-01 -0.007597
2015-06-01 -0.007597
2015-07-01 -0.007597
2015-08-01 -0.007597
model.summary()
--------------------------------------------------------------------------------
Initial input shape: (None, 80)
--------------------------------------------------------------------------------
Layer (name)                  Output Shape                  Param #
--------------------------------------------------------------------------------
Embedding (Unnamed)           (None, None, 70)              5600
SimpleRNN (Unnamed)           (None, 30)                    3030
Dropout (Unnamed)             (None, 30)                    0
Dense (Unnamed)               (None, 1)                     31
--------------------------------------------------------------------------------
Total params: 8661
--------------------------------------------------------------------------------
FULL TRACEBACK
  File "/Users/file.py", line 1523, in Pred
    model.fit(X=predictor_train, y=target_train, nb_epoch=5, batch_size=1,show_accuracy=True)
  File "/Library/Python/2.7/site-packages/keras/models.py", line 581, in fit
    shuffle=shuffle, metrics=metrics)
  File "/Library/Python/2.7/site-packages/keras/models.py", line 239, in _fit
    outs = f(ins_batch)
  File "/Library/Python/2.7/site-packages/keras/backend/theano_backend.py", line 365, in __call__
    return self.function(*inputs)
  File "/Library/Python/2.7/site-packages/theano/compile/function_module.py", line 595, in __call__
    outputs = self.fn()
  File "/Library/Python/2.7/site-packages/theano/gof/vm.py", line 233, in __call__
    link.raise_with_op(node, thunk)
  File "/Library/Python/2.7/site-packages/theano/gof/vm.py", line 229, in __call__
    thunk()
  File "/Library/Python/2.7/site-packages/theano/gof/op.py", line 768, in rval
    r = p(n, [x[0] for x in i], o)
  File "/Library/Python/2.7/site-packages/theano/tensor/subtensor.py", line 1657, in perform
    out[0] = x.take(i, axis=0, out=o)
IndexError: index 143 is out of bounds for size 80
Apply node that caused the error: AdvancedSubtensor1(<TensorType(float32, matrix)>, Flatten{1}.0)
Inputs types: [TensorType(float32, matrix), TensorType(int32, vector)]
Inputs shapes: [(80, 70), (80,)]
Inputs strides: [(280, 4), (4,)]
Inputs values: ['not shown', 'not shown']
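Your X variable probably contains the value 143. The Embedding layer has dimensions 80x70.
I assume this is in the NLP domain. That means your vocabulary has 80 words, each represented by a vector of length 70. Your X variable represents 119 sentences of length 80 (or 80 sentences of length 119), and its contents are indices into the vocabulary. Valid indices run from 0 to 79, so if X contains a word index of 80 or more, this error pops up.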
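The failing lookup can be reproduced outside Keras. The sketch below is only an illustration with NumPy, not the Keras or Theano implementation: table stands in for the 80x70 embedding matrix and sentence for one row of X. Both names and the random data are assumptions made for the example. It uses the same take(..., axis=0) call that appears at the bottom of the traceback.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in for the embedding weights: 80 rows (vocabulary), 70 columns (vector length).
table = rng.uniform(size=(80, 70)).astype("float32")

# A valid "sentence": 80 indices, all within 0..79.
sentence = rng.randint(0, 80, size=80).astype("int32")
vectors = table.take(sentence, axis=0)
print(vectors.shape)  # (80, 70)

# Put one out-of-range index into the sentence, as in the question.
bad_sentence = sentence.copy()
bad_sentence[5] = 143

try:
    table.take(bad_sentence, axis=0)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# The exact wording depends on the NumPy version, but it names the offending index.
print(error_message)
```

Any value of 80 or above triggers the error, which is consistent with the observation in the question that the smallest reported index was 80.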
A more common value for your col variable is above 10,000. Of course, it depends on what you're doing.
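One way to check this before calling fit is to look at the range of X and compare it with col. The following is a minimal sketch: the helper name required_input_dim and the sample array are hypothetical, and it assumes X is meant to hold non-negative integer ids.

```python
import numpy as np


def required_input_dim(X):
    """Return the smallest Embedding input_dim that covers every index in X.

    Hypothetical helper, not part of Keras. Assumes X holds non-negative integer ids.
    """
    X = np.asarray(X)
    if not np.all(np.equal(np.mod(X, 1), 0)):
        raise ValueError("X contains non-integer values; it cannot be used as indices")
    if X.min() < 0:
        raise ValueError("X contains negative values; indices must be >= 0")
    # Indices run from 0 to max, so max + 1 rows are needed.
    return int(X.max()) + 1


# Made-up data with the same shape and dtype as predictor_train in the question.
X_example = np.zeros((119, 80), dtype="float64")
X_example[3, 7] = 143.0

col = 80
needed = required_input_dim(X_example)
print(needed)        # 144
print(needed > col)  # True: input_dim=80 is too small for this data
```

If the check raises because X holds real-valued features rather than integer ids, an Embedding layer is probably not the right first layer for that data.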