Error while parsing an XML 1.1 document with the StAX parser
I am trying to parse a Burp Suite XML export. I have tried both the StAX parser and an XPath parser, but parsing always fails at this part of the document:
Location: /py/message/viewBill.pt [id parameter]]]></location>
<severity>High</severity>
<confidence>Certain</confidence>
<issueBackground><![CDATA[Reflected
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[66,2357]
Message: The element type "location" must be terminated by the matching end-tag "</location>".
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:604)
at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)
I get the exception above every time. The end tag is there, but the parser cannot find it. My code is:
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLEventReader eventReader = factory.createXMLEventReader(new StringReader(str));
while (eventReader.hasNext()) {
    XMLEvent event = eventReader.nextEvent();
    switch (event.getEventType()) {
        case XMLStreamConstants.START_ELEMENT:
            // Remember which element was just opened.
            StartElement startElement = event.asStartElement();
            String qName = startElement.getName().getLocalPart();
            if (qName.equalsIgnoreCase(ISSUES)) {
                issues = true;
            } else if (qName.equalsIgnoreCase(ISSUE)) {
                issue = true;
            } else if (qName.equalsIgnoreCase(NAME)) {
                name = true;
            } else if (qName.equalsIgnoreCase(HOST)) {
                host = true;
            } else if (qName.equalsIgnoreCase(PATH)) {
                path = true;
            } else if (qName.equalsIgnoreCase(LOCATION)) {
                location = true;
            } else if (qName.equalsIgnoreCase(SEVERITY)) {
                severity = true;
            }
            break;
        case XMLStreamConstants.CHARACTERS:
            // Print the text of whichever element is currently open.
            Characters characters = event.asCharacters();
            if (name) {
                System.out.println("Name: " + characters.getData());
                name = false;
            } else if (host) {
                System.out.println("Host: " + characters.getData());
                host = false;
            } else if (path) {
                System.out.println("Path: " + characters.getData());
                path = false;
            } else if (location) {
                System.out.println("Location: " + characters.getData());
                location = false;
            } else if (severity) {
                System.out.println("severity: " + characters.getData());
                severity = false;
            }
            break;
        case XMLStreamConstants.END_ELEMENT:
            // Clear the flag of the element that was just closed.
            EndElement endElement = event.asEndElement();
            String endElementName = endElement.getName().getLocalPart();
            if (endElementName.equalsIgnoreCase(ISSUE)) {
                issue = false;
            } else if (endElementName.equalsIgnoreCase(NAME)) {
                name = false;
            } else if (endElementName.equalsIgnoreCase(HOST)) {
                host = false;
            } else if (endElementName.equalsIgnoreCase(PATH)) {
                path = false;
            } else if (endElementName.equalsIgnoreCase(LOCATION)) {
                location = false;
            }
            break;
    }
}
The report I am trying to parse is the one I found at https://github.com/mtesauro/parse-tools/blob/master/examples/brief-burp-export.xml.
Can anyone offer some advice?
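I would hazard a guess that this is a bug in the XML parser. Specifically, I suspect it is not recognizing the ]]]> on line 63 as terminating the CDATA section, so it keeps thinking it is inside CDATA until the ]]> at the end of line 66, at which point it finds the end tag </issueBackground> where it was expecting </location>. Raise a bug with the supplier of the XML parser, or switch to one that works.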
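To make the expected behavior concrete, here is a minimal sketch of what a conforming parser should return when CDATA content ends in "]" and the section therefore closes with "]]]>". The input string, the class name CdataTerminatorCheck, and the helper firstElementText are made up for illustration. The input is a tiny XML 1.0 fragment, not the actual Burp export, so this is not expected to reproduce the failure; it only shows the text a parser should hand back for the location element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CdataTerminatorCheck {

    /** Returns the text of the first element named elementName, or null if there is none. */
    public static String firstElementText(String xml, String elementName)
            throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    // getElementText() concatenates character data and CDATA
                    // up to the matching end tag.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical input: the first "]" is content, the following "]]>" ends the section.
        String xml = "<issue><location><![CDATA[/viewBill.pt [id parameter]]]></location>"
                + "<severity>High</severity></issue>";
        System.out.println("location = " + firstElementText(xml, "location"));
        System.out.println("severity = " + firstElementText(xml, "severity"));
    }
}
```

If a parser instead throws at the location element on the real export, that is consistent with the guess above that the "]]]>" terminator is being misread.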
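I found some examples that parse the Burp export with CSS selectors. Then I found that Jsoup can be used for CSS-style parsing in Java. It is a bit convoluted, but it works well.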
Document document = Jsoup.parse(str);
Elements allElements = document.getAllElements();
for (Element element : allElements) {
    String tagName = element.tagName();
    String text = element.text();
    if (tagName.equalsIgnoreCase("name")) {
        System.out.println("name " + text);
    } else if (tagName.equalsIgnoreCase("host")) {
        System.out.println("host " + text);
        System.out.println("ip " + element.attr("ip"));
    }
}
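I ran into the same problem. After spending some time searching online, I found the following solution.
Because the XML values contain CDATA, the event type can be XMLEvent.CDATA rather than XMLEvent.CHARACTERS, so both need to be handled: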
- https://docs.oracle.com/javase/8/docs/api/javax/xml/stream/events/XMLEvent.html
- https://github.com/dturanski/stax-xml-parser/blob/master/src/main/java/staxparser/xml/CDataContentExtractor.java
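while (reader.hasNext()) {
    int eventType = reader.next();
    // TAG stands for the name of the element whose content you want to read.
    if (eventType == XMLEvent.START_ELEMENT && TAG.equals(reader.getLocalName())) {
        // The content follows the start tag and may arrive as CDATA or as CHARACTERS.
        eventType = reader.next();
        if (eventType == XMLEvent.CDATA || eventType == XMLEvent.CHARACTERS) {
            System.out.println(reader.getText());
        }
    }
    // ........
}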
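As a fuller illustration of handling both event types, here is a minimal sketch that collects the text of every leaf element into a map, appending to a buffer whether the parser reports CDATA or CHARACTERS. The class name IssueTextReader, the method leafTexts, and the sample input are assumptions made for this example. It also sets the standard XMLInputFactory.IS_COALESCING property, which asks the parser to merge adjacent character data; the loop does not rely on that, since it appends every text event it sees.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class IssueTextReader {

    /** Maps each leaf element name to its text; a repeated name overwrites the earlier value. */
    public static Map<String, String> leafTexts(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into single events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

        Map<String, String> result = new LinkedHashMap<>();
        StringBuilder text = new StringBuilder();
        boolean leaf = false; // true while the most recently opened element has no children

        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        text.setLength(0);
                        leaf = true;
                        break;
                    case XMLStreamConstants.CDATA:
                    case XMLStreamConstants.CHARACTERS:
                        // Treat both event types the same way.
                        text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (leaf) {
                            result.put(reader.getLocalName(), text.toString());
                        }
                        leaf = false;
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical input shaped loosely like one Burp issue.
        String xml = "<issue><name><![CDATA[Cross-site scripting (reflected)]]></name>"
                + "<host ip=\"10.0.0.1\">http://example.test</host>"
                + "<severity>High</severity></issue>";
        leafTexts(xml).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

Accumulating text until the end tag, instead of printing on the first text event, also avoids losing content when a parser splits one element's text across several events.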
I also added the following dependency. I am not sure how it helps, but without it we get the same exception as above.
After adding it, the problem went away.
<dependency>
    <groupId>com.fasterxml.woodstox</groupId>
    <artifactId>woodstox-core</artifactId>
    <version>5.0.0</version>
</dependency>
https://github.com/FasterXML/woodstox
https://mvnrepository.com/artifact/com.fasterxml.woodstox/woodstox-core/5.0.0
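A likely explanation for why the dependency helps: XMLInputFactory.newInstance() locates its implementation at run time, so putting Woodstox on the classpath normally makes it the StAX parser in place of the one built into the JDK, which is the one in the stack trace. The sketch below is one way to check which implementation is actually in use; the class name StaxProviderCheck is made up for this example. With Woodstox picked up, the printed name should be com.ctc.wstx.stax.WstxInputFactory; without it, a JDK-internal class name is expected.

```java
import javax.xml.stream.XMLInputFactory;

class StaxProviderCheck {

    /** Returns the class name of the StAX input factory that newInstance() resolves to. */
    public static String inputFactoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("StAX input factory: " + inputFactoryClassName());
    }
}
```

Running this once with and once without the dependency shows whether the change of parser is what made the exception disappear.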