How to ignore commented content while parsing XML using JDOM2
I am having trouble parsing my XML with JDOM. When I retrieve the content, it also gives me the commented-out lines. Is there a way to ignore those comment lines?
Java code:
SAXBuilder jdomBuilder = new SAXBuilder();
// jdomDocument is the JDOM2 Object
Document jdomDocument = jdomBuilder.build("C:/manu/WebservicesWS/DynamicXmlParse/src/PO_XML.xml");
// The root element is the root of the document. we print its name
System.out.println(jdomDocument.getRootElement().getName()); // prints "rss"
Element rss = jdomDocument.getRootElement();
System.out.println(rss.getNamespaceURI());
List<Element> rssChildren = rss.getChildren();
// getElement(rssChildren);
for (int i = 0; i < rssChildren.size(); i++) {
    Element rssChild = rssChildren.get(i);
    System.out.println(rssChild.getName()); // prints 'title' and 'link'
    List<Content> rssContents = rssChild.getContent();
    for (int j = 0; j < rssContents.size(); j++) {
        Content content = rssContents.get(j);
        System.out.println(content.getValue());
    }
}
XML structure:
<interchange-control-header>
<control-number>2</control-number>
<sender-id>ZZ:IQAAOBUYER7</sender-id>
<receiver-id>ZZ:33347456972</receiver-id>
<!--sender-id>ZZ:IQAAOBUYER2</sender-id>
<receiver-id>ZZ:IQAAOSUPPLIER2</receiver-id>
<sender-id>IQAOrionBuyer</sender-id>
<receiver-id>IQAOrionSupplier</receiver-id-->
<date-time>2012-06-29T09:30:47-05:00</date-time>
<control-version>1</control-version>
<usage-indicator>T</usage-indicator>
<is-copy>0</is-copy>
</interchange-control-header>
Current output:
interchange-control-header
2
ZZ:IQAAOBUYER7
ZZ:33347456972
sender-id>ZZ:IQAAOBUYER2</sender-id>
<receiver-id>ZZ:IQAAOSUPPLIER2</receiver-id>
<sender-id>IQAOrionBuyer</sender-id>
<receiver-id>IQAOrionSupplier</receiver-id
2012-06-29T09:30:47-05:00
1
T
0
Required output:
interchange-control-header
2
ZZ:IQAAOBUYER7
ZZ:33347456972
2012-06-29T09:30:47-05:00
1
T
0
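Comments are a recognized part of an XML document, alongside the more obvious things such as elements. Other kinds of content to watch out for are processing instructions, text, and entity references.
When you call getContent on the rssChild element, the returned list includes the Comment content, and its value is the text inside that comment.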
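To make the point above concrete, here is a minimal sketch that lists the type of each item returned by getContent. It assumes JDOM2 is on the classpath; the class name ContentTypesDemo, the helper contentTypes, and the small in-memory XML string are made up for illustration and are not part of the original question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.jdom2.Content;
import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

// Hypothetical demo class, not taken from the question.
public class ContentTypesDemo {

    // Returns the content-type name of every item directly under the root element.
    static List<String> contentTypes(String xml) throws JDOMException, IOException {
        Document doc = new SAXBuilder().build(new StringReader(xml));
        List<String> types = new ArrayList<>();
        for (Content c : doc.getRootElement().getContent()) {
            // getCType() reports whether the item is an Element, Comment, Text, and so on.
            types.add(c.getCType().name());
        }
        return types;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative XML: two real elements with a commented-out element between them.
        String xml = "<header><a>1</a><!--<b>2</b>--><c>3</c></header>";
        System.out.println(contentTypes(xml)); // prints [Element, Comment, Element]
    }
}
```

The commented-out element shows up as a single Comment item, which is why printing getValue() for every content item also prints the comment text.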
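It looks like you only want to print the text content of each child element, not all of the content.
A simple way to get just the child elements is to use the getChildren() method (instead of getContent). You already use getChildren elsewhere, so I am not sure why you forgot to use it here...
Also, you can simplify your loops to the for-each style. This code: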
List<Element> rssChildren = rss.getChildren();
// getElement(rssChildren);
for (int i = 0; i < rssChildren.size(); i++) {
    Element rssChild = rssChildren.get(i);
    System.out.println(rssChild.getName()); // prints 'title' and 'link'
    List<Content> rssContents = rssChild.getContent();
    for (int j = 0; j < rssContents.size(); j++) {
        Content content = rssContents.get(j);
        System.out.println(content.getValue());
    }
}
could be:
for (Element rssChild : rss.getChildren()) {
    System.out.println(rssChild.getName()); // prints 'title' and 'link'
    for (Element subRss : rssChild.getChildren()) {
        System.out.println(subRss.getValue());
    }
}
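For anyone who wants to try the getChildren approach without the original file, here is a self-contained sketch. It assumes JDOM2 is on the classpath and uses a trimmed copy of the XML from the question wrapped in a hypothetical root element; the class name SkipCommentsDemo and the helper namesAndValues are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

// Hypothetical demo class, not taken from the question.
public class SkipCommentsDemo {

    // Collects each child element's name followed by the values of its own child elements.
    // getChildren() returns only Element content, so comments and whitespace text are skipped.
    static List<String> namesAndValues(String xml) throws JDOMException, IOException {
        Document doc = new SAXBuilder().build(new StringReader(xml));
        List<String> lines = new ArrayList<>();
        for (Element child : doc.getRootElement().getChildren()) {
            lines.add(child.getName());
            for (Element sub : child.getChildren()) {
                lines.add(sub.getValue());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Trimmed sample; the <root> wrapper is an assumption, since the question
        // does not show the real root element.
        String xml =
              "<root>\n"
            + "  <interchange-control-header>\n"
            + "    <control-number>2</control-number>\n"
            + "    <sender-id>ZZ:IQAAOBUYER7</sender-id>\n"
            + "    <!--sender-id>ZZ:IQAAOBUYER2</sender-id-->\n"
            + "    <is-copy>0</is-copy>\n"
            + "  </interchange-control-header>\n"
            + "</root>";
        for (String line : namesAndValues(xml)) {
            System.out.println(line);
        }
    }
}
```

With this sample the commented-out sender-id never appears in the result, because it is Comment content rather than an Element.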