Reshaping data input for the model from (1, 5) to (1, 3000)

Question

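My model's training and test data have shape (rows, 3000). I would like to call the model to predict on A, which has shape (1, 5). How can I reshape the variable A so that the model can use it to return a prediction? This is a text classification model, so the data has already been vectorized.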

A = ['The dog is so cute']
A = vectorizer.fit_transform(A)

#pretrained model
classifier.predict(A)

Error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-145-90d6770bbdca> in <module>
----> 1 classifier.predict(a)

/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_base.py in predict(self, X)
    305             Predicted class label per sample.
    306         """
--> 307         scores = self.decision_function(X)
    308         if len(scores.shape) == 1:
    309             indices = (scores > 0).astype(np.int)

/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_base.py in decision_function(self, X)
    285         if X.shape[1] != n_features:
    286             raise ValueError("X has %d features per sample; expecting %d"
--> 287                              % (X.shape[1], n_features))
    288 
    289         scores = safe_sparse_dot(X, self.coef_.T,

ValueError: X has 5 features per sample; expecting 3000

Thank you very much.

Answer 1

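When you call .fit_transform() on new data, you refit the vectorizer on that data. Its vocabulary is then rebuilt from your single sentence, which is why you get 5 features instead of the 3000 the classifier expects. No reshaping is needed; use only .transform() and you should be fine: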

A = ['The dog is so cute']
A = vectorizer.transform(A) # <-- change this line

#pretrained model
classifier.predict(A)

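This of course assumes that vectorizer is the same vectorizer that was used to transform the training samples, and that it has already been fitted on them.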
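To make the difference concrete, here is a minimal sketch under some assumptions: the original training data and model type are not shown in the question, so the toy corpus, the labels, and the choice of CountVectorizer with LogisticRegression below are hypothetical stand-ins. It shows that transform() keeps the training feature count while fit_transform() on the new sentence alone does not, and that pickling the fitted vectorizer together with the classifier is one way to make sure the same vectorizer is available at prediction time.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy training set; the real one (3000 features) is not shown.
train_texts = [
    "the dog is so cute",
    "what a lovely cute puppy",
    "the weather is awful today",
    "this traffic is so terrible",
]
train_labels = [1, 1, 0, 0]

# Fit the vectorizer once, on the training data only.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
classifier = LogisticRegression().fit(X_train, train_labels)

A = ["The dog is so cute"]

# Correct: reuse the fitted vocabulary, so the column count matches training.
A_ok = vectorizer.transform(A)
prediction = classifier.predict(A_ok)

# Incorrect: a vectorizer fitted on A alone only knows the 5 words in A.
# (A fresh instance is used here so the fitted one is left untouched.)
refit_shape = CountVectorizer().fit_transform(A).shape

# Persist the fitted vectorizer together with the model so that both
# can be restored as a matching pair later.
blob = pickle.dumps((vectorizer, classifier))
loaded_vectorizer, loaded_classifier = pickle.loads(blob)
loaded_prediction = loaded_classifier.predict(loaded_vectorizer.transform(A))

print(A_ok.shape, refit_shape)
```

If the vectorizer and classifier are always used together, wrapping them in a sklearn.pipeline.Pipeline is another option, since the pipeline's predict() only calls transform() on the fitted vectorizer.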


python

numpy

reshape

scikit-learn