Saved Machine Learning Model using Pickle won't predict text values properly
I currently have a machine learning model that predicts which part of speech each word belongs to:
penn_results = penn_crf.predict_single(features)
Then I wrote code that pairs each word with its tag, in (word, POS) form, for printing:
penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]
When I run this, it gives me this output:
[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]
So I saved the model using:
penn_filename = 'ptcp.sav'
pickle.dump(penn_crf, open(penn_filename, 'wb'))
Then I tried to run the model by loading the saved pickle file like this:
test = "The quick brown fox jumps over the head"
pickled_model = pickle.load(open('penn_treebank_crf_postagger.sav', 'rb'))
pickled_model.predict(test)
print(pickled_model.predict(test))
It prints this:
[['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP' ], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP'], ['NNP']]
How can I make it print accurate predictions like this?
[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN')] [('The', 'DET'), ('quick', 'NOUN'), ('brown', 'ADJ'), ('fox', 'NOUN'), ('jumps', 'NOUN'), ('over', 'ADP')]
Note: this code is untested.
Replace the last line
print(pickled_model.predict(test))
with this:
tokens_test = test.split()
predictions_test = pickled_model.predict(test)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)
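One thing to watch for in the snippet above: `predict` is still handed the raw string. Iterating over a Python string yields single characters, not words, which may be why the question's output is a long list of one-element `['NNP']` lists rather than one tag per word. A small check of the two lengths, using only the `test` sentence from the question:

```python
test = "The quick brown fox jumps over the head"

tokens = test.split()  # the words the tagger is meant to label
chars = list(test)     # what iterating over the raw string actually yields

print(len(tokens))  # 8
print(len(chars))   # 39
print(chars[:4])    # ['T', 'h', 'e', ' ']
```

So the token list and the prediction list will not line up until the model receives per-token input.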
You need to include the feature function. Before saving, the model was given features, not the raw sentence:
penn_results = penn_crf.predict_single(features)
penn_tups = [(sent.split()[idx], penn_results[idx]) for idx in range(len(sent.split()))]
Do the same in your current code:
tokens_test = test.split()
# features function goes here: build `features` from tokens_test the same way as before pickling
predictions_test = pickled_model.predict_single(features)
pairs_test = [(tokens_test[idx], predictions_test[idx]) for idx in range(len(tokens_test))]
print(pairs_test)
so that it predicts the same results as the model did before you saved it.
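Putting it together, here is a minimal sketch of what that could look like. The question does not show the feature function, so `word2features` and `sent2features` below are hypothetical stand-ins: swap in whatever function produced `features` at training time, otherwise the predictions will not match. `tag_sentence` and `load_model` are also just illustrative helper names. The only model method used is `predict_single`, which the question already calls.

```python
import pickle


def word2features(tokens, i):
    """Hypothetical per-token features; use your real training-time function instead."""
    word = tokens[i]
    features = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],
        "BOS": i == 0,                # first token of the sentence
        "EOS": i == len(tokens) - 1,  # last token of the sentence
    }
    if i > 0:
        features["-1:word.lower"] = tokens[i - 1].lower()
    if i < len(tokens) - 1:
        features["+1:word.lower"] = tokens[i + 1].lower()
    return features


def sent2features(tokens):
    """One feature dict per token, in sentence order."""
    return [word2features(tokens, i) for i in range(len(tokens))]


def tag_sentence(model, sentence):
    """Split the sentence, featurize it, and pair each token with its predicted tag."""
    tokens = sentence.split()
    tags = model.predict_single(sent2features(tokens))
    return list(zip(tokens, tags))


def load_model(path):
    """Load a pickled model, closing the file afterwards."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Usage (the path must be the same file name you passed to pickle.dump):
# pickled_model = load_model("ptcp.sav")
# print(tag_sentence(pickled_model, "The quick brown fox jumps over the head"))
```

Also note that the question saves to 'ptcp.sav' but loads 'penn_treebank_crf_postagger.sav', so it is worth checking that the file being loaded really is the model that was just saved.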