Parsing XML DOM child nodes with Java
I have XML with the following structure:
<entities>
  <entity>
    <type>FieldTerminology</type>
    <relevance>0.732316</relevance>
    <sentiment>
      <type>negative</type>
      <score>-0.351864</score>
    </sentiment>
    <count>2</count>
    <text>financial crisis</text>
  </entity>
  <entity>
    <type>Company</type>
    <relevance>0.496572</relevance>
    <sentiment>
      <type>neutral</type>
    </sentiment>
    <count>1</count>
    <text>Goldman Sachs</text>
    <disambiguated>
      <name>Goldman Sachs</name>
      <subType>CompanyShareholder</subType>
      <website>http://www.gs.com/</website>
      <dbpedia>http://dbpedia.org/resource/Goldman_Sachs</dbpedia>
      <freebase>http://rdf.freebase.com/ns/m.01xdn1</freebase>
      <yago>http://yago-knowledge.org/resource/Goldman_Sachs</yago>
      <crunchbase>http://www.crunchbase.com/company/goldman-sachs</crunchbase>
    </disambiguated>
  </entity>
</entities>
I can parse everything except the nested sentiment child.
Given the code below, how can I also access "sentiment" inside each entity node?
NodeList feeds = docs.getElementsByTagName("entities");
for (int i = 0; i < feeds.getLength(); i++) {
    Node mainNode = feeds.item(i);
    if (mainNode.getNodeType() == Node.ELEMENT_NODE) {
        Element firstElement = (Element) mainNode;
        System.out.println("First element " + firstElement.getTagName());
        NodeList forumidNameList = firstElement.getElementsByTagName("entity");
        for (int j = 0; j < forumidNameList.getLength(); ++j) {
            Element value = (Element) forumidNameList.item(j);
            NodeList conditionList = value.getElementsByTagName("relevance");
            for (int k = 0; k < conditionList.getLength(); ++k) {
                Element condition = (Element) conditionList.item(k);
                String conditionText = condition.getFirstChild().getNodeValue();
                System.out.println("relevance " + conditionText);
            }
            NodeList conditionList1 = value.getElementsByTagName("type");
            for (int k = 0; k < conditionList1.getLength(); ++k) {
                Element condition = (Element) conditionList1.item(k);
                String conditionText = condition.getFirstChild().getNodeValue();
                System.out.println("type " + conditionText);
            }
            NodeList conditionList2 = value.getElementsByTagName("count");
            for (int k = 0; k < conditionList2.getLength(); ++k) {
                Element condition = (Element) conditionList2.item(k);
                String conditionText = condition.getFirstChild().getNodeValue();
                System.out.println("count " + conditionText);
            }
            NodeList conditionList3 = value.getElementsByTagName("text");
            for (int k = 0; k < conditionList3.getLength(); ++k) {
                Element condition = (Element) conditionList3.item(k);
                String conditionText = condition.getFirstChild().getNodeValue();
                System.out.println("text " + conditionText);
            }
        }
    }
}
I need to parse the list of entities together with their child nodes.
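Have you considered using a different parser? I find DOM hard to work with for more complex XML structures. I would suggest trying JDOM, which in my experience handles access problems like yours better.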
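I tried to get your problem solved and made the following changes:
1. Added the missing parsing for the <sentiment> node.
2. Improved the parsing logic for the <type> node, because the same tag name appears twice in the DOM structure (once directly under <entity> and once under <sentiment>).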
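The second change matters because getElementsByTagName returns every descendant with the given name, not only direct children. As a minimal sketch of one way to avoid the duplicate match, the helper below walks only the direct children of an element. The class name DirectChildren and the methods directChildText and parseRoot are hypothetical names made up for this illustration; only the standard org.w3c.dom and javax.xml.parsers APIs are assumed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class DirectChildren {
    // Hypothetical helper: text of the child elements named `tag` that sit
    // directly under `parent`; deeper descendants are ignored.
    public static List<String> directChildText(Element parent, String tag) {
        List<String> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            // Skip text/comment nodes and elements with a different name.
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                result.add(n.getTextContent().trim());
            }
        }
        return result;
    }

    // Hypothetical helper: parse an XML string and return its root element.
    public static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<entity><type>Company</type>"
                + "<sentiment><type>neutral</type></sentiment></entity>";
        Element entity = parseRoot(xml);
        // Descendant search sees both <type> elements.
        System.out.println(entity.getElementsByTagName("type").getLength()); // 2
        // Direct-child search sees only the entity's own <type>.
        System.out.println(directChildText(entity, "type")); // [Company]
    }
}
```

Checking the parent name, as the code in this answer does, achieves the same result; the helper just keeps the filtering in one place.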
Note: I would still recommend JAXB or XPath for efficient XML parsing. Hope this helps.
Here is the code:
NodeList feeds = doc.getElementsByTagName("entities");
for (int i = 0; i < feeds.getLength(); i++) {
    Node mainNode = feeds.item(i);
    if (mainNode.getNodeType() == Node.ELEMENT_NODE) {
        Element firstElement = (Element) mainNode;
        System.out.println("First element " + firstElement.getTagName());
        NodeList forumidNameList = firstElement.getElementsByTagName("entity");
        for (int j = 0; j < forumidNameList.getLength(); ++j) {
            Element value = (Element) forumidNameList.item(j);

            // <type> also occurs inside <sentiment>, so keep only the
            // <type> whose parent is <entity>.
            NodeList conditionList = value.getElementsByTagName("type");
            for (int k = 0; k < conditionList.getLength(); ++k) {
                Element condition = (Element) conditionList.item(k);
                if (condition.getParentNode().getNodeName().equals("entity")) {
                    String conditionText = condition.getFirstChild().getNodeValue();
                    System.out.println("type " + conditionText);
                }
            }

            NodeList conditionList1 = value.getElementsByTagName("relevance");
            for (int k = 0; k < conditionList1.getLength(); ++k) {
                Element condition = (Element) conditionList1.item(k);
                String conditionText = condition.getFirstChild().getNodeValue();
                System.out.println("relevance " + conditionText);
            }

            // Print every child element of <sentiment> (<type>, <score>).
            NodeList conditionList2 = value.getElementsByTagName("sentiment");
            for (int k = 0; k < conditionList2.getLength(); ++k) {
                Element condition = (Element) conditionList2.item(k);
                NodeList children = condition.getChildNodes();
                for (int l = 0; l < children.getLength(); ++l) {
                    Node child = children.item(l);
                    // Whitespace between tags shows up as text nodes, which
                    // cannot be cast to Element, so skip them.
                    if (child.getNodeType() != Node.ELEMENT_NODE) {
                        continue;
                    }
                    String conditionText = child.getFirstChild().getNodeValue();
                    System.out.println("sentiment " + conditionText);
                }
            }

            NodeList conditionList3 = value.getElementsByTagName("count");
            for (int k = 0; k < conditionList3.getLength(); ++k) {
                Element condition = (Element) conditionList3.item(k);
                String conditionText = condition.getFirstChild().getNodeValue();
                System.out.println("count " + conditionText);
            }

            NodeList conditionList4 = value.getElementsByTagName("text");
            for (int k = 0; k < conditionList4.getLength(); ++k) {
                Element condition = (Element) conditionList4.item(k);
                String conditionText = condition.getFirstChild().getNodeValue();
                System.out.println("text " + conditionText);
            }
        }
    }
}
output
----------------
First element entities
type FieldTerminology
relevance 0.732316
sentiment negative
sentiment -0.351864
count 2
text financial crisis
type Company
relevance 0.496572
sentiment neutral
count 1
text Goldman Sachs
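For comparison, here is a rough sketch of the XPath route mentioned in the note above, using the javax.xml.xpath API that ships with the JDK. The class name XPathEntities, the method describe, and the " | " output format are assumptions made up for this illustration, not something from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XPathEntities {
    // Hypothetical helper: one "type | sentiment | score" line per <entity>.
    public static String describe(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList entities = (NodeList) xpath.evaluate(
                "/entities/entity", doc, XPathConstants.NODESET);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < entities.getLength(); i++) {
            Node entity = entities.item(i);
            // Relative paths are evaluated against the current <entity>,
            // so "type" and "sentiment/type" cannot be confused.
            String type = xpath.evaluate("type", entity);
            String sentiment = xpath.evaluate("sentiment/type", entity);
            // Evaluates to an empty string when <score> is absent.
            String score = xpath.evaluate("sentiment/score", entity);
            out.append(type).append(" | ").append(sentiment)
               .append(" | ").append(score).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<entities>"
                + "<entity><type>FieldTerminology</type>"
                + "<sentiment><type>negative</type><score>-0.351864</score></sentiment></entity>"
                + "<entity><type>Company</type>"
                + "<sentiment><type>neutral</type></sentiment></entity>"
                + "</entities>";
        System.out.print(describe(xml));
    }
}
```

Because each path names the full route from the entity to the value, no parent-name check is needed, and a missing <score> simply yields an empty string instead of an exception.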