Issue with ProtobufAnnotationSerializer - Stanford CoreNLP
I am trying to serialize an Annotation object with ProtobufAnnotationSerializer, as follows:
// Build a pipeline and annotate a short sample text.
String text = "Stanford University is located in California. It is a great university, founded in 1891.";
Annotation document = new Annotation(text);
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, depparse");
StanfordCoreNLP pip = new StanfordCoreNLP(props);
pip.annotate(document);
// Serialize the annotated document to a file; the last line is the call that fails.
ProtobufAnnotationSerializer serializer = new ProtobufAnnotationSerializer();
FileOutputStream fileOut = new FileOutputStream("path/to/anno.ser");
ObjectOutputStream out = new ObjectOutputStream(fileOut);
serializer.write(document, out);
This error is thrown:
Exception in thread "main" java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
com/google/protobuf/GeneratedMessageV3$ExtendableMessage.getExtension(Lcom/google/protobuf/Extension;I)Ljava/lang/Object; @3: invokevirtual
Reason:
Type 'com/google/protobuf/Extension' (current frame, stack[1]) is not assignable to 'com/google/protobuf/ExtensionLite'
Current Frame:
bci: @3
flags: { }
locals: { 'com/google/protobuf/GeneratedMessageV3$ExtendableMessage', 'com/google/protobuf/Extension', integer }
stack: { 'com/google/protobuf/GeneratedMessageV3$ExtendableMessage', 'com/google/protobuf/Extension', integer }
Bytecode:
0x0000000: 2a2b 1cb6 0024 b0
at edu.stanford.nlp.pipeline.ProtobufAnnotationSerializer.toProtoBuilder(ProtobufAnnotationSerializer.java:611)
at edu.stanford.nlp.pipeline.ProtobufAnnotationSerializer.toProto(ProtobufAnnotationSerializer.java:579)
at edu.stanford.nlp.pipeline.ProtobufAnnotationSerializer.write(ProtobufAnnotationSerializer.java:184)
at xxxxxxxxxxxxxx.main(xxxxxxx.java:303) \ line: serializer.write(document, out);
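I think there is an inconsistency between CoreNLP's ProtobufAnnotationSerializer and the protobuf package. I am using version 3.9.1, downloaded directly from the CoreNLP home page. I have tried several alternatives, but none of them worked:
- versions 3.9 and 3.8
- downloading the package and its dependencies directly from Maven
- downloading the source code from GitHub and building it (with ant)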
The error also occurs with other languages (I tested with French), and even when calling the server.
Oops, my bad... One of my dependencies includes protobuf-lite, which conflicts with CoreNLP. Many thanks @GaborAngeli
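For anyone hitting the same VerifyError, one way to confirm this kind of conflict is to ask the class loader where the protobuf classes named in the trace come from. The following is only a sketch using plain JDK calls; the class name ClasspathConflictCheck and the helper locationsOf are made up for illustration and are not part of CoreNLP or protobuf. If a class is reported from more than one jar, or Extension and ExtensionLite come from different jars, that would suggest two protobuf artifacts are mixed on the classpath.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ClasspathConflictCheck {

    // Hypothetical helper: returns every classpath location that provides the given class file.
    static List<URL> locationsOf(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Class names taken from the VerifyError message above.
        String[] names = {
            "com.google.protobuf.Extension",
            "com.google.protobuf.ExtensionLite",
            "com.google.protobuf.GeneratedMessageV3"
        };
        for (String name : names) {
            List<URL> found = locationsOf(name);
            System.out.println(name + " -> " + found.size() + " location(s)");
            for (URL url : found) {
                System.out.println("    " + url);
            }
        }
    }
}
```

Run it with the same classpath as the failing program. Once the jar that ships the lite runtime is identified, excluding it from the dependency that pulls it in should leave only the protobuf version CoreNLP was built against.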