How to add probabilities to model.predict output?
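I built a working classification model by following this tutorial.
The tutorial only outputs the predicted category names. I want it to output each category name together with its probability, and I only want the categories whose probability is above some cutoff, for example above 0.5.
This is the function used to access the model: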
import pickle
import numpy as np

category_model_path="categorymodel.pkl"
category_transformer_path="categorytransformer.pkl"
sentiment_model_path="sentimentmodel.pkl"
sentiment_transformer_path="sentimenttransformer.pkl"

def get_top_k_predictions(model,X_test,k):
    # get probabilities instead of predicted labels, since we want to collect the top k
    np.set_printoptions(suppress=True)
    probs = model.predict_proba(X_test)
    # GET TOP K PREDICTIONS BY PROB - note these are just indices, in ascending order
    best_n = np.argsort(probs, axis=1)[:,-k:]
    # GET CATEGORY OF PREDICTIONS
    preds=[[model.classes_[predicted_cat] for predicted_cat in prediction] for prediction in best_n]
    # reverse each row so the most probable category comes first
    preds=[ item[::-1] for item in preds]
    return preds

category_loaded_model = pickle.load(open(category_model_path, 'rb'))
category_loaded_transformer = pickle.load(open(category_transformer_path, 'rb'))
sentiment_loaded_model = pickle.load(open(sentiment_model_path, 'rb'))
sentiment_loaded_transformer = pickle.load(open(sentiment_transformer_path, 'rb'))
And this is the code that calls the function:
category_test_features=category_loaded_transformer.transform(["I absolutley loved the organization "])
get_top_k_predictions(category_loaded_model,category_test_features,2)
This is the current output:
[['Course Structure', 'Learning Materials']]
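The probabilities are computed inside the function, in the probs variable. I don't know how to keep only the ones above 0.5 and add them to the preds output.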
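The best_n array holds indices into the probability array probs, so you can use it to look up probabilities the same way the function uses it to look up labels. You can get (label, probability) tuples like this: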
preds = [
    [(model.classes_[predicted_cat], distribution[predicted_cat])
     for predicted_cat in prediction]
    for distribution, prediction in zip(probs, best_n)]
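To see what this comprehension produces, here is a small self-contained sketch with made-up numbers. The `classes` and `probs` arrays are hypothetical stand-ins for `model.classes_` and the result of `model.predict_proba`, not values from the question's model. The sketch also reverses `best_n` so the most probable class comes first, mirroring the `item[::-1]` step in the question's function.

```python
import numpy as np

# Hypothetical stand-ins for model.classes_ and model.predict_proba(X_test)
classes = np.array(["Course Structure", "Instructor", "Learning Materials"])
probs = np.array([[0.62, 0.08, 0.30]])

k = 2
# Indices of the k largest probabilities per row, reversed so the best comes first
best_n = np.argsort(probs, axis=1)[:, -k:][:, ::-1]

# Pair each selected label with its probability from the same row
preds = [
    [(str(classes[i]), float(row[i])) for i in top]
    for row, top in zip(probs, best_n)
]
print(preds)
```

Each inner list corresponds to one input sample, and each tuple holds a label and the probability read from the same row of `probs`.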
If you don't want to return the probabilities and only want to filter on them, you can do this:
preds=[
    [model.classes_[predicted_cat]
     for predicted_cat in prediction if distribution[predicted_cat] > 0.5]
    for distribution, prediction in zip(probs, best_n)]
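Putting both ideas together, one possible version of the function returns (label, probability) pairs, most probable first, and drops anything at or below a cutoff. This is only a sketch, not the tutorial's code: the function name `get_top_k_above_threshold`, the `threshold` parameter, and the `StubModel` class used to exercise it are all hypothetical. Any fitted classifier that exposes `predict_proba` and `classes_` should be usable in place of the stub.

```python
import numpy as np


def get_top_k_above_threshold(model, X_test, k, threshold=0.5):
    """Return, per sample, up to k (label, probability) pairs with probability > threshold."""
    probs = model.predict_proba(X_test)
    # Indices of the k most probable classes per row, most probable first
    best_n = np.argsort(probs, axis=1)[:, -k:][:, ::-1]
    return [
        [(model.classes_[i], float(row[i])) for i in top if row[i] > threshold]
        for row, top in zip(probs, best_n)
    ]


class StubModel:
    """Hypothetical stand-in for a fitted classifier, for illustration only."""
    classes_ = np.array(["Course Structure", "Instructor", "Learning Materials"])

    def predict_proba(self, X):
        # Made-up probabilities for two samples; the input X is ignored
        return np.array([[0.62, 0.08, 0.30], [0.40, 0.35, 0.25]])


result = get_top_k_above_threshold(StubModel(), X_test=None, k=2)
print(result)
```

A sample with no class above the cutoff yields an empty list. If each row of probabilities sums to 1, as is usual for a single-label classifier, at most one class per sample can exceed 0.5, so a lower `threshold` may be more useful when several categories are expected.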