Sentiment analysis for a single review: what should the second argument to classifier.fit(new_X_test, ) be?

Question

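This is sentiment analysis code for a single review. Since we have no dataset, I can't figure out what the second argument to the classifier.fit method of the Naive Bayes model should be.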

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Cleaning the text
import re   
import nltk    
nltk.download('stopwords') 
from nltk.corpus import stopwords 
from nltk.stem.porter import PorterStemmer 
new_review = 'I love this restaurant so much'
new_review = re.sub('[^a-zA-Z]', ' ', new_review)
new_review = new_review.lower()
new_review = new_review.split()
ps = PorterStemmer()
all_stopwords = stopwords.words('english')
all_stopwords.remove('not')
new_review = [ps.stem(word) for word in new_review if not word in set(all_stopwords)]
new_review = ' '.join(new_review)
new_corpus = [new_review]


# Creating the bag-of-words model
from sklearn.feature_extraction.text import CountVectorizer
cv = CountVectorizer(3)
new_X_test = cv.fit_transform(new_corpus).toarray()
#new_X_test = cv.transform(new_corpus).toarray()

# Training the Naive Bayes model

from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(new_X_test, )

# predict the result
#y_pred = classifier.predict(X)
new_y_pred = classifier.predict(new_X_test)
print(new_y_pred)

#new_X_test = cv.transform(new_corpus).toarray()
#new_y_pred = classifier.predict(X)
#print(new_y_pred)

Answer 1

According to the documentation for sklearn.naive_bayes.GaussianNB.fit(), the second argument is y, where:

y: array-like of shape (n_samples,)
Target values.

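In your case, the target value is the sentiment of your single review. Naive Bayes is a supervised classification algorithm. "Supervised" means you must guide the algorithm during training (model fitting) by supplying the correct target values (labels).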
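To make the shape of y concrete, here is a minimal sketch. The feature matrix and labels are made-up toy values, not data from the question; the only point is that y carries one label per row of X.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: 4 samples with 2 features each.
X = np.array([[1, 0], [2, 0], [0, 1], [0, 2]])
# One label per row of X (assumed encoding: 1 = positive, 0 = negative).
y = np.array([1, 1, 0, 0])

classifier = GaussianNB()
classifier.fit(X, y)  # second argument: target values, shape (n_samples,)

# Predict the label of a new sample with the same number of features.
pred = classifier.predict(np.array([[3, 0]]))
print(pred)
```

The labels can be any array-like of length n_samples, for example a plain Python list or a pandas column.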

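As written, the code does not make much sense. You cannot meaningfully train (fit) a model on only one sample. You will need a dataset containing many labeled reviews to fit the model, and only then try to predict the sentiment of new samples.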
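A rough sketch of that workflow follows. The training reviews and their labels are invented for illustration (a real dataset would be far larger), and the stemming and stop-word steps from the question are left out for brevity. Note that the vectorizer is fitted on the training reviews only, and the new review goes through cv.transform rather than cv.fit_transform so that it is encoded with the same vocabulary.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import GaussianNB

# Hypothetical labeled reviews (assumed encoding: 1 = positive, 0 = negative).
train_corpus = [
    "love this restaurant",
    "great food and friendly staff",
    "amazing place love the food",
    "terrible service",
    "awful food not good",
    "bad place terrible staff",
]
y_train = [1, 1, 1, 0, 0, 0]

# Learn the vocabulary from the training reviews only.
cv = CountVectorizer()
X_train = cv.fit_transform(train_corpus).toarray()

# Fit the classifier on features AND labels.
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# Encode the new review with the already-fitted vectorizer (transform, not fit_transform).
new_corpus = ["I love this restaurant so much"]
X_new = cv.transform(new_corpus).toarray()

new_y_pred = classifier.predict(X_new)
print(new_y_pred)
```

Words in the new review that never appeared in the training reviews are simply ignored by transform, which is one more reason the training set needs to be reasonably large.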


python

machine-learning

sentiment-analysis

naivebayes