How to calculate accuracy of a sentiment analysis algorithm (Naive Bayes)

Question

I'm currently working on a Naive Bayes sentiment analysis program, but I'm not sure how to determine its accuracy. My code is:

x = df["Text"]
y = df["Mood"]

test_size = 1785
x_train = x[:-test_size]
y_train = y[:-test_size]

x_test = x[-test_size:]
y_test = y[-test_size:]

count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(x_train)
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
clf = MultinomialNB().fit(X_train_tfidf, y_train)

print(clf.predict(count_vect.transform(["Random text"])))

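Predictions for sentences I type in look fine, but I would like to run the classifier on the 20% of my dataset that I held out (x_test and y_test) and calculate the accuracy. I'm not sure how to approach this. Any help would be appreciated.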

I have also tried the following:

predictions = clf.predict(x_test)

print(accuracy_score(y_test, predictions))

This gave me the following error:

ValueError: could not convert string to float: "A sentence from the dataset"

Answer 1

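Before calling predictions = clf.predict(x_test), convert the test set to numeric features as well, using the vectorizer that was already fitted on the training data: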

x_test = count_vect.transform(x_test).toarray()

You can find a step-by-step walkthrough of this [here].
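To make the fix concrete, here is a minimal end-to-end sketch. The `texts` and `moods` lists are hypothetical toy data standing in for df["Text"] and df["Mood"], and `test_size = 2` is chosen only to fit that toy data. Because the classifier in the question is trained on TF-IDF features, this sketch also passes the test counts through `tfidf_transformer.transform`; that extra step is an assumption added here so that training and test features are built the same way, and it is not part of the one-line fix in the answer.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for df["Text"] and df["Mood"].
texts = [
    "I love this movie",
    "what a great day",
    "this is terrible",
    "I hate this weather",
    "a great and happy movie",
    "terrible and sad day",
]
moods = ["happy", "happy", "sad", "sad", "happy", "sad"]

# Same tail split as in the question, with a smaller test_size.
test_size = 2
x_train, y_train = texts[:-test_size], moods[:-test_size]
x_test, y_test = texts[-test_size:], moods[-test_size:]

# Fit the vectorizer and the TF-IDF transformer on the training text only.
count_vect = CountVectorizer()
tfidf_transformer = TfidfTransformer()
X_train_counts = count_vect.fit_transform(x_train)
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
clf = MultinomialNB().fit(X_train_tfidf, y_train)

# Use transform (not fit_transform) on the test text so it is mapped
# onto the vocabulary and IDF weights learned from the training data.
X_test_counts = count_vect.transform(x_test)
X_test_tfidf = tfidf_transformer.transform(X_test_counts)

predictions = clf.predict(X_test_tfidf)
accuracy = accuracy_score(y_test, predictions)
print(accuracy)
```

The ValueError in the question comes from passing raw strings to clf.predict; once the test text goes through the same fitted transformers, accuracy_score receives label arrays of equal length and returns the fraction of correct predictions.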
