could not broadcast input array from shape (27839,1) into shape (27839)
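I am building a classifier chain for a multi-class problem, using a Keras binary classifier model as the estimator in the chain. I have 17 labels as classification targets; X_train has shape (111300, 107) and y_train has shape (111300, 17). After training, I get the following error in the predict method: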
*could not broadcast input array from shape (27839,1) into shape (27839)*
Here is my code:
def create_model():
    input_size = length_long_sentence
    embedding_size = 128
    lstm_size = 64
    output_size = len(unique_tag_set)
    #----------------------------Model--------------------------------
    current_input = Input(shape=(input_size,))
    emb_current = Embedding(vocab_size, embedding_size, input_length=input_size)(current_input)
    out_current = Bidirectional(LSTM(units=lstm_size))(emb_current)
    #out_current = Reshape((1,2*lstm_size))(out_current)
    output = Dense(units=1, activation='sigmoid')(out_current)
    #output = Dense(units=1, activation='softmax')(out_current)
    model = Model(inputs=current_input, outputs=output)
    #-------------------------------compile-------------
    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=1,batch_size=256, shuffle = True, verbose = 1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
history=chain.fit(X_train, y_train)
The result of chain.classes_ is as follows:
[array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8)]
Then I try to predict on the test data:
Y_pred_chain = chain.predict(X_test)
The model summary is as follows:
The full error traceback is here:
109/109 [==============================] - 22s 202ms/step
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-28-34a25ad06cd4> in <module>()
----> 1 Y_pred_chain = chain.predict(X_test)
/usr/local/lib/python3.6/dist-packages/sklearn/multioutput.py in predict(self, X)
523 else:
524 X_aug = np.hstack((X, previous_predictions))
--> 525 Y_pred_chain[:, chain_idx] = estimator.predict(X_aug)
526
527 inv_order = np.empty_like(self.order_)
ValueError: could not broadcast input array from shape (27839,1) into shape (27839)
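Can anyone help me resolve this error?
Here is a complete working example...
I solved the problem by using a Sequential model with softmax as the final activation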
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from sklearn.multioutput import ClassifierChain
n_sample = 20
vocab_size = 33
input_size = 100
X = np.random.randint(0,vocab_size, (n_sample,input_size))
y = np.random.randint(0,2, (n_sample,17))
def create_model():
    global input_size
    embedding_size = 128
    lstm_size = 64
    model = Sequential([
        Embedding(vocab_size, embedding_size, input_length=input_size),
        Bidirectional(LSTM(units=lstm_size)),
        Dense(units=2, activation='softmax')
    ])
    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    input_size += 1
    return model

model = tf.keras.wrappers.scikit_learn.KerasClassifier(build_fn=create_model, epochs=1, batch_size=256,
                                                       shuffle=True, verbose=1, validation_split=0.2)
chain = ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
chain.predict_proba(X)
Run the code here: https://colab.research.google.com/drive/1aVjjh6VPmAyBddwU4ff2w9y_LmmC02W_?usp=sharing
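One plausible reason the softmax variant avoids the error: as far as I recall, the older Keras class-prediction logic that KerasClassifier.predict relied on thresholds a single-unit output at 0.5 (which keeps the (n, 1) shape), but takes an argmax over the last axis when there are several output units (which gives a 1D array). The NumPy sketch below only mimics that behavior with made-up probabilities; it does not call Keras, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 5

# One sigmoid unit: probabilities have shape (n_samples, 1).
sigmoid_proba = rng.random((n_samples, 1))
# Thresholding keeps the trailing axis, so the labels stay 2D.
sigmoid_labels = (sigmoid_proba > 0.5).astype("int32")

# Two softmax units: probabilities have shape (n_samples, 2).
logits = rng.normal(size=(n_samples, 2))
softmax_proba = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
# argmax over the last axis removes that axis, so the labels are 1D.
softmax_labels = softmax_proba.argmax(axis=-1)

print(sigmoid_labels.shape)   # (5, 1)
print(softmax_labels.shape)   # (5,)
```

A 1D label array is what the chain can assign into a single column of its prediction matrix.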
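Stage 1
Based on the model summary posted in the question, I start with an input size of 107 and an output size of 1 (a binary classification task).
Let's break it down to understand it.
Model architecture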
input_size = 107
# define the model
def create_model():
    global input_size
    embedding_size = 128
    lstm_size = 64
    output_size = 1
    vocab_size = 100
    current_input = Input(shape=(input_size,))
    emb_current = Embedding(vocab_size, embedding_size, input_length=input_size)(current_input)
    out_current = Bidirectional(LSTM(units=lstm_size))(emb_current)
    output = Dense(units=output_size, activation='sigmoid')(out_current)
    model = Model(inputs=current_input, outputs=output)
    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
Some dummy data
X = np.random.randint(0,100,(111, 107))
y = np.random.randint(0,2,(111,1)) # NOTE: The y should have two dimensions
Let's test the Keras model directly
model = KerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle = True, verbose = 1,validation_split=0.2)
model.fit(X, y)
y_hat = model.predict(X)
print(y_hat.shape)
Output:
Train on 88 samples, validate on 23 samples
Epoch 1/1
88/88 [==============================] - 2s 21ms/step - loss: 0.6951 - accuracy: 0.4432 - val_loss: 0.6898 - val_accuracy: 0.5652
111/111 [==============================] - 0s 2ms/step
(111, 1)
Ta-da! It works.
Now let's put it in a chain and run it
model=KerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
print (chain.predict(X).shape)
Oops! It trains, but prediction fails, as the OP pointed out.
Error:
ValueError: could not broadcast input array from shape (111,1) into shape (111)
Problem
This error is caused by the following line in sklearn
--> 525 Y_pred_chain[:, chain_idx] = estimator.predict(X_aug)
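This happens because the classifier chain runs one estimator at a time and stores each estimator's predictions in Y_pred_chain at that estimator's index (determined by the order parameter). It assumes the estimators return their predictions as a 1D array. A Keras model, however, returns output of shape batch_size x output_size, which in our case is 111 x 1.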
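The failure can be reproduced with NumPy alone, which may make the shape mismatch easier to see. This is a minimal sketch: the names Y_pred_chain and chain_idx mirror the sklearn line quoted above, but the arrays are made up and no estimator is involved.

```python
import numpy as np

n_samples, n_labels = 111, 17
Y_pred_chain = np.zeros((n_samples, n_labels))
chain_idx = 0

# Stand-in for what a single-unit Keras classifier returns: shape (111, 1).
prediction_2d = np.ones((n_samples, 1))

try:
    # The target slice Y_pred_chain[:, chain_idx] has shape (111,),
    # so a (111, 1) array cannot be broadcast into it.
    Y_pred_chain[:, chain_idx] = prediction_2d
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# A 1D array of length 111 fits the column without any problem.
Y_pred_chain[:, chain_idx] = np.ones(n_samples)
```

The exact wording of the message can differ slightly between NumPy versions, but it is the same "could not broadcast input array" error as in the traceback.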
Solution
We need a way to reshape predictions of shape 111 x 1 to 111, or in general batch_size x 1 to batch_size. Let's rely on OOP and override the predict method of KerasClassifier
class MyKerasClassifier(KerasClassifier):
    def __init__(self, **args):
        super().__init__(**args)

    def predict(self, X):
        return super().predict(X).reshape(len(X))  # Here we are flattening 2D array to 1D

model=MyKerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
print (chain.predict(X).shape)
Output:
Epoch 1/1
88/88 [==============================] - 2s 19ms/step - loss: 0.6919 - accuracy: 0.5227 - val_loss: 0.6892 - val_accuracy: 0.5652
111/111 [==============================] - 0s 3ms/step
(111, 1)
Ta-da! It works.
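For reference, here are a few equivalent ways to flatten a single-column prediction array in NumPy, shown on a made-up array rather than on real model output. In this case reshape(len(X)), as used in the class above, gives the same result as ravel() or reshape(-1).

```python
import numpy as np

# Made-up predictions with the problematic shape (4, 1).
pred_2d = np.array([[0], [1], [1], [0]])

flat_reshape = pred_2d.reshape(len(pred_2d))  # same idea as reshape(len(X))
flat_ravel = pred_2d.ravel()
flat_minus_one = pred_2d.reshape(-1)

# Any of the 1D versions can be written into one column of a 2D result array.
Y_pred_chain = np.zeros((4, 3))
Y_pred_chain[:, 1] = flat_ravel

print(flat_reshape.shape, flat_ravel.shape, flat_minus_one.shape)  # (4,) (4,) (4,)
```

reshape(len(X)) has the side benefit of failing loudly if the model ever returns more than one column, whereas ravel() would silently flatten it.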
Stage 2
Let's take a closer look at the ClassifierChain class
A multi-label model that arranges binary classifiers into a chain.
Each model makes a prediction in the order specified by the chain
using all of the available features provided to the model plus the
predictions of models that are earlier in the chain.
So what we really need is a y of shape 111 x 17, so that the chain contains 17 estimators. Let's try it
The real classifier chain
y = np.random.randint(0,2,(111,17))
model=MyKerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
Output:
ValueError: Error when checking input: expected input_62 to have shape (107,) but got array with shape (108,)
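The model cannot be trained, and the reason is simple. The chain first trains the first estimator with 107 features, which works fine. Next, the chain picks the next estimator and trains it with the 107 features + the single output of the previous estimator (= 108). But since our model has an input size of 107, it fails as shown in the error message. Each estimator gets the 107 input features + the outputs of all previous estimators.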
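The growth of the input width along the chain can be checked with a few lines of NumPy. This is only a sketch of the bookkeeping, not sklearn's actual code: it ignores the random order, and it appends label columns from a made-up Y, which is, as far as I can tell, what the chain does during fit with the default cv=None (at predict time the earlier estimators' predictions are appended instead).

```python
import numpy as np

n_samples, n_features, n_labels = 111, 107, 17
X = np.random.randint(0, 100, (n_samples, n_features))
Y = np.random.randint(0, 2, (n_samples, n_labels))

widths = []
for chain_idx in range(n_labels):
    # The estimator at position chain_idx sees the original features
    # plus one extra column for every earlier position in the chain.
    X_aug = np.hstack((X, Y[:, :chain_idx]))
    widths.append(X_aug.shape[1])

print(widths[0], widths[1], widths[-1])  # 107 108 123
```

So the 17 models need input sizes 107, 108, ..., 123, which is why a fixed input size of 107 breaks on the second estimator.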
Solution [hacky]
We need a way to change the input_size of the models as they are created by ClassifierChain. ClassifierChain does not seem to offer any callbacks or hooks, so I have a hacky solution.
input_size = 107
# define the model
def create_model():
    global input_size
    embedding_size = 128
    lstm_size = 64
    output_size = 1
    vocab_size = 100
    current_input = Input(shape=(input_size,))
    emb_current = Embedding(vocab_size, embedding_size, input_length=input_size)(current_input)
    out_current = Bidirectional(LSTM(units=lstm_size))(emb_current)
    output = Dense(units=output_size, activation='sigmoid')(out_current)
    model = Model(inputs=current_input, outputs=output)
    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    input_size += 1  # <-- This does the magic
    return model

X = np.random.randint(0,100,(111, 107))
y = np.random.randint(0,2,(111,17))
model=MyKerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
print (chain.predict(X).shape)
Output:
Train on 88 samples, validate on 23 samples
Epoch 1/1
88/88 [==============================] - 2s 22ms/step - loss: 0.6901 - accuracy: 0.6023 - val_loss: 0.7002 - val_accuracy: 0.4783
Train on 88 samples, validate on 23 samples
Epoch 1/1
88/88 [==============================] - 2s 22ms/step - loss: 0.6976 - accuracy: 0.5000 - val_loss: 0.7070 - val_accuracy: 0.3913
Train on 88 samples, validate on 23 samples
Epoch 1/1
----------- [Output truncated] ----------------
111/111 [==============================] - 0s 3ms/step
111/111 [==============================] - 0s 3ms/step
(111, 17)
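As expected, it trains 17 estimators, and the predict method returns output of shape 111 x 17, where each column corresponds to the predictions made by the respective estimator.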
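One caveat with the global counter: it keeps growing across calls, so fitting the chain a second time in the same session would start from 124 instead of 107 unless input_size is reset first. The sketch below simulates this without Keras. create_model_stub and simulate_chain_fit are hypothetical stand-ins that only record the width each model would be built with, assuming the builder is called exactly once per estimator during fit.

```python
input_size = 107

def create_model_stub():
    """Stand-in for create_model: returns the width it would build with."""
    global input_size
    built_width = input_size
    input_size += 1  # same side effect as the hacky create_model
    return built_width

def simulate_chain_fit(n_labels):
    """Pretend to fit a chain: one builder call per label."""
    return [create_model_stub() for _ in range(n_labels)]

first_fit = simulate_chain_fit(17)    # widths 107 .. 123, as intended
second_fit = simulate_chain_fit(17)   # counter was not reset: starts at 124

input_size = 107                      # reset before fitting again
third_fit = simulate_chain_fit(17)    # widths 107 .. 123 again

print(first_fit[0], first_fit[-1])    # 107 123
print(second_fit[0])                  # 124
print(third_fit == first_fit)         # True
```

In a notebook, this means re-running the cell that sets input_size = 107 before every new chain.fit call.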