Stanford CoreNLP - egw4-reut.512.clusters cannot be found
I'm using the CoreNLP package to annotate user comments, and ever since I upgraded to version 3.5.0 I keep running into the same error:
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ...
Loading distsim lexicon from /u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters ...
java.lang.RuntimeException: java.io.FileNotFoundException: \u\nlp\data\pos_tags_are_useless\egw4-reut.512.clusters (The system cannot find the path specified)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.setNextObject(ReaderIteratorFactory.java:225) (cue fifty lines of error)
Some searching on here led me to these similar questions:
Stanford NER Error: Loading distsim lexicon Failed and Stanford NER tagger generates 'file not found' exception with provided models, but neither solved my problem: I'm using only the code and models from 3.5.0 (via Maven Central). I tried modifying the props file from the NER model to point to another .clusters file in my user directory, without success (exact same error).
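(For context, the distsim path lives in the props serialized with the NER model; the attempted edit amounted to overriding that entry with a local copy, roughly as in this sketch, where the flag names are Stanford NER's useDistSim / distSimLexicon and the local path is just a placeholder:)
# sketch of the overridden entries in the classifier props; the path below is a placeholder
useDistSim = true
distSimLexicon = /home/myuser/egw4-reut.512.clusters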
The code I use to instantiate the CoreNLP object is fairly standard as well, but here it is:
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
stan = new StanfordCoreNLP(props);
At this point I assume I'm obviously missing something. Any help would be greatly appreciated.
A more complete stack trace, in case it helps:
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Loading distsim lexicon from /u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters ... java.lang.RuntimeException: java.io.FileNotFoundException: \u\nlp\data\pos_tags_are_useless\egw4-reut.512.clusters (The system cannot find the path specified)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.setNextObject(ReaderIteratorFactory.java:225)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.<init>(ReaderIteratorFactory.java:161)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory.iterator(ReaderIteratorFactory.java:98)
at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.<init>(ObjectBank.java:404)
at edu.stanford.nlp.objectbank.ObjectBank.iterator(ObjectBank.java:242)
at edu.stanford.nlp.ie.NERFeatureFactory.initLexicon(NERFeatureFactory.java:471)
at edu.stanford.nlp.ie.NERFeatureFactory.init(NERFeatureFactory.java:379)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.reinit(AbstractSequenceClassifier.java:171)
at edu.stanford.nlp.ie.crf.CRFClassifier.loadClassifier(CRFClassifier.java:2630)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1620)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1675)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1662)
at edu.stanford.nlp.ie.crf.CRFClassifier.getClassifier(CRFClassifier.java:2851)
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:189)
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifiers(ClassifierCombiner.java:173)
at edu.stanford.nlp.ie.ClassifierCombiner.<init>(ClassifierCombiner.java:113)
at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:64)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.create(StanfordCoreNLP.java:617)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:267)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
at rgu.jclos.quilt.utilities.nlp.DependenciesTagger.<init>(DependenciesTagger.java:99)
at rgu.jclos.quilt.eca.approaches.ApproachC_USS.main(ApproachC_USS.java:47)
Caused by: java.io.FileNotFoundException: \u\nlp\data\pos_tags_are_useless\egw4-reut.512.clusters (The system cannot find the path specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:131)
at edu.stanford.nlp.io.EncodingFileReader.<init>(EncodingFileReader.java:78)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.setNextObject(ReaderIteratorFactory.java:192)
... 23 more
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.FileNotFoundException
at edu.stanford.nlp.pipeline.StanfordCoreNLP.create(StanfordCoreNLP.java:621)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:267)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
at rgu.jclos.quilt.utilities.nlp.DependenciesTagger.<init>(DependenciesTagger.java:99)
at rgu.jclos.quilt.eca.approaches.ApproachC_USS.main(ApproachC_USS.java:47)
Caused by: java.io.FileNotFoundException
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:199)
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifiers(ClassifierCombiner.java:173)
at edu.stanford.nlp.ie.ClassifierCombiner.<init>(ClassifierCombiner.java:113)
at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:64)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.create(StanfordCoreNLP.java:617)
... 6 more
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to edu.stanford.nlp.classify.LinearClassifier
at edu.stanford.nlp.ie.ner.CMMClassifier.loadClassifier(CMMClassifier.java:1070)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1620)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1675)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1662)
at edu.stanford.nlp.ie.ner.CMMClassifier.getClassifier(CMMClassifier.java:1116)
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:195)
... 10 more
It turned out to be a Maven problem.
My code lives in a utility library that wraps Stanford CoreNLP to provide some extra processing, and on its own it runs fine. However, when that project was added as a dependency to my main project, Maven by default pulled version 3.4 of the Stanford CoreNLP models into the main project, which caused the error above.
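For reference, one way to guard against this kind of mismatch is to declare both the code and the models artifacts at 3.5.0 explicitly in the main project's pom.xml, so Maven's dependency mediation cannot silently fall back to the 3.4 models. A minimal sketch, assuming the standard Maven Central coordinates (running mvn dependency:tree shows which versions actually end up on the classpath):
<!-- Pin the CoreNLP code jar -->
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.0</version>
</dependency>
<!-- Pin the matching models jar (the "models" classifier artifact) -->
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.0</version>
    <classifier>models</classifier>
</dependency>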