Saved Machine Learning Model using Pickle won't predict text values properly

I currently have a machine learning model that predicts which part of speech each word belongs to:

penn_results = penn_crf.predict_single(features)

Then I wrote code that prints the result as (word, POS) pairs:

penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]

When I run this, it gives me this output:

[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]

So I saved the model using:

penn_filename = 'ptcp.sav'
pickle.dump(penn_crf, open(penn_filename, 'wb'))

I then tried to run the model by loading the saved pickle file like this:

test = "The quick brown fox jumps over the head"
pickled_model = pickle.load(open('penn_treebank_crf_postagger.sav', 'rb'))
pickled_model.predict(test)
print(pickled_model.predict(test))

它打印这个 [['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP' ], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP']]

How can I make it print accurate predictions like this: [('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]

Note: this code is untested.

Replace the last line

print(pickled_model.predict(test))

with this:

tokens_test = test.split()
predictions_test = pickled_model.predict(test)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)
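The pairing step can also be written with zip instead of indexing. The sketch below is only an illustration: `pair_tokens_with_tags` is a hypothetical helper name, and the tags are hard-coded rather than produced by the model. The length check is there because a mismatch between the number of tokens and the number of predictions usually means the model was not given one input item per token.

```python
def pair_tokens_with_tags(tokens, tags):
    """Pair each token with its predicted tag, refusing mismatched lengths."""
    if len(tokens) != len(tags):
        raise ValueError(
            f"got {len(tokens)} tokens but {len(tags)} tags; "
            "check what was passed to the model"
        )
    return list(zip(tokens, tags))


# Hard-coded tags stand in for the model output here.
tokens_test = "The quick brown fox".split()
tags_test = ["DT", "JJ", "NN", "NN"]
print(pair_tokens_with_tags(tokens_test, tags_test))
```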

You need to include the feature function. Your original code passes features to the model:

penn_results = penn_crf.predict_single(features)  # note: features, not the raw sentence
penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]

So do the same in your current code:

tokens_test = test.split()
# build `features` from tokens_test here, using the same feature function as in training
predictions_test = pickled_model.predict_single(features)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)

so that it predicts the same results as your model did before it was saved.
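Putting the pieces together, a minimal sketch of the whole flow might look like the following. The question does not show the feature function, so `word2features` and `sent2features` here are hypothetical placeholders and must be replaced with the exact functions used during training. `StubTagger` is also invented: it only mimics the `predict_single` interface so the sketch runs without a trained model. `load_model` shows one way to load the pickle with a context manager; the path passed to it should be the same file the model was dumped to.

```python
import pickle


def word2features(tokens, i):
    """Hypothetical per-token features; use the same function as in training."""
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "suffix3": word[-3:],
        "BOS": i == 0,
        "EOS": i == len(tokens) - 1,
    }


def sent2features(tokens):
    """One feature dict per token, in sentence order."""
    return [word2features(tokens, i) for i in range(len(tokens))]


def load_model(path):
    """Load a pickled model, closing the file afterwards."""
    with open(path, "rb") as f:
        return pickle.load(f)


def tag_sentence(model, sentence):
    """Tokenize, featurize, predict, and return (word, tag) pairs."""
    tokens = sentence.split()
    tags = model.predict_single(sent2features(tokens))
    return list(zip(tokens, tags))


class StubTagger:
    """Stand-in for the real model; tags title-case words as NNP, others as NN."""

    def predict_single(self, features):
        return ["NNP" if f["word.istitle"] else "NN" for f in features]


print(tag_sentence(StubTagger(), "The quick fox"))
```

With the real model, the call would be something like `tag_sentence(load_model(penn_filename), test)`, assuming the loaded object exposes `predict_single` as in the question.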