Current state END_ELEMENT is not among the statesCHARACTERS, COMMENT, CDATA, SPACE, ENTITY_REFERENCE, DTD valid for getText()
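I'm new to Java, but I'm doing this project for school. I have a 4GB XML file (a Wikipedia dump) that I need to parse. I'm using StAX, and my code ran successfully for more than 400,000 lines (almost 50MB), but then this error occurred: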
Exception in thread "main" java.lang.IllegalStateException: Current state END_ELEMENT is not among the statesCHARACTERS, COMMENT, CDATA, SPACE, ENTITY_REFERENCE, DTD valid for getText()
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.getText(XMLStreamReaderImpl.java:1081)
    at tagremoving1.TagRemoving1.main(TagRemoving1.java:65)
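I read somewhere that I should check for null or empty elements before calling getText(), so I did that. It then got further, but stopped again with the same error. I've looked almost everywhere and I can't figure out what's wrong.
Here is my code: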
    XMLInputFactory factory = XMLInputFactory.newInstance();
    File file = new File("source.xml");
    FileInputStream fileReader = new FileInputStream(file);
    factory.setProperty(XMLInputFactory.IS_COALESCING, true);
    factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, true);
    factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
    PrintWriter writer1 = new PrintWriter("result.txt", "UTF-8");
    XMLStreamReader reader = factory.createXMLStreamReader(fileReader);
    int counter = 1;
    while (reader.hasNext()) {
        if (reader.next() == 1) { // If it is START_ELEMENT
            String name = reader.getLocalName();
            switch (name) {
                case "page":
                    writer1.println("\r\npage" + counter + ":");
                    counter++;
                    break;
                case "title":
                    reader.next();
                    if (reader != null && !"".equals(reader.toString()))
                        writer1.println("Title: " + reader.getText());
                    break;
                case "text":
                    reader.next();
                    if (reader != null && !"".equals(reader.toString()))
                        writer1.println("Text: " + reader.getText());
                    break;
                default:
                    break;
            }
        }
    }
    writer1.flush();
    writer1.close();
Any suggestions?
Well, I figured it out!
I added another condition, reader.hasText(), to the final 'if', and then everything worked. Here is the code:
    case "text":
        reader.next();
        if (reader != null && !"".equals(reader.toString()) && reader.hasText())
            writer1.println("Text: " + reader.getText());
        break;
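For anyone who lands here later, my best guess at the cause is that some elements in the dump are empty (for example <text></text>). In that case the reader.next() call after START_ELEMENT moves straight to END_ELEMENT, and getText() is not valid there. The reader != null and reader.toString() checks do not look at the current event at all, which is why only hasText() made a difference. Below is a minimal sketch of an alternative that uses getElementText(), which returns an empty string for an empty text-only element. The class name ElementTextDemo, the extract helper, and the sample XML are made up for illustration, and the sketch assumes <title> and <text> never contain child elements (getElementText() throws an XMLStreamException if they do).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementTextDemo {

    // Hypothetical helper: collects "Title: ..." and "Text: ..." lines from an XML string.
    static List<String> extract(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, true);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> lines = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // Use the named constant instead of the magic number 1.
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                switch (reader.getLocalName()) {
                    case "title":
                        // Reads up to the matching END_ELEMENT; "" if the element is empty.
                        lines.add("Title: " + reader.getElementText());
                        break;
                    case "text":
                        lines.add("Text: " + reader.getElementText());
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample input (invented): the second page has an empty <text> element.
        String xml = "<pages>"
                + "<page><title>A</title><text>hello</text></page>"
                + "<page><title>B</title><text></text></page>"
                + "</pages>";
        for (String line : extract(xml)) {
            System.out.println(line);
        }
    }
}
```

With the hasText() fix, empty elements are skipped silently. With getElementText() they produce a line with an empty value. Which behaviour is better depends on what result.txt is used for.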