Python NLTK: Stanford NER tagger error message: NLTK was unable to find the java file
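I'm trying to get Stanford NER working with Python. I followed some instructions I found online, but I get this error message: "NLTK was unable to find the java file! Use software specific configuration parameters or set the JAVAHOME environment variable." What's wrong? Thanks!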
from nltk.tag.stanford import StanfordNERTagger
from nltk.tokenize import word_tokenize
# Paths to the Stanford NER model and jar on my machine
model = r'C:\Stanford\NER\classifiers\english.muc.7class.distsim.crf.ser.gz'
jar = r'C:\Stanford\NER\stanford-ner-3.9.1.jar'
ner_tagger = StanfordNERTagger(model, jar, encoding='utf-8')
text = 'While in France, Christine Lagarde discussed short-term stimulus ' \
'efforts in a recent interview with the Wall Street Journal.'
words = word_tokenize(text)
# The tagger runs Java behind the scenes; the error above shows up at this point
classified_words = ner_tagger.tag(words)
I found a solution online. Replace the path below with the path to java.exe on your own machine.
import os
java_path = "C:/../../jdk1.8.0_101/bin/java.exe"
os.environ['JAVAHOME'] = java_path
Or:
import nltk
nltk.internals.config_java('C:/../../jdk1.8.0_101/bin/java.exe')
Source: https://tianyouhu.wordpress.com/2016/09/01/problem-of-nltk-with-stanfordtokenizer/
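If you'd rather not hard-code the JDK path, here is a rough sketch that tries to locate the Java binary first and only then sets JAVAHOME. Note that `find_java_executable` is a hypothetical helper written for this answer and is not part of NLTK. It assumes Java is either on your PATH or that JAVAHOME / JAVA_HOME already points at the JDK folder or at the binary itself.

```python
import os
import shutil


def find_java_executable(env=None):
    """Return a path to the java binary, or None if it cannot be located.

    Hypothetical helper for illustration; not part of NLTK.
    """
    env = os.environ if env is None else env
    exe_name = "java.exe" if os.name == "nt" else "java"

    # 1) JAVAHOME / JAVA_HOME may point at the JDK root or directly at the binary.
    for var in ("JAVAHOME", "JAVA_HOME"):
        value = env.get(var)
        if not value:
            continue
        if os.path.isfile(value):
            return value
        candidate = os.path.join(value, "bin", exe_name)
        if os.path.isfile(candidate):
            return candidate

    # 2) Otherwise fall back to whatever "java" resolves to on PATH.
    return shutil.which("java", path=env.get("PATH"))


java_path = find_java_executable()
if java_path is not None:
    # Same assignment as in the first fix, without the hard-coded path.
    os.environ["JAVAHOME"] = java_path
```

If `java_path` comes back as None, Java probably isn't installed or isn't visible to Python, and you would still need to supply the path by hand. The value found this way could also be passed to nltk.internals.config_java, as in the second fix.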