How to select nodes with different tags using DOM?

Question

I have an XML file that looks like this:

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
 <HWData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <NE MOID="WBTS-42" NEType="WBTS">
   <EQHO MOID="EQHO-1-0" >
     <UNIT MOID="UNIT-FAN-1" State="enabled"></UNIT>
     <UNIT MOID="UNIT-FAN-3" State="enabled"></UNIT>
   </EQHO>
  </NE>
  <NE MOID="RNC-40" NEType="RNC">
   <EQHO MOID="EQHO-3-0" >
     <UNIT MOID="UNIT-FAN-5" State="disabled"></UNIT>
     <UNIT MOID="UNIT-FAN-6" State="disabled"></UNIT>
   </EQHO>
  </NE>
</HWData>

How can I use DOM to get a NodeList containing both the "NE" and the "UNIT" elements? Thanks.

Answer 1

You can do it manually by walking the tree and collecting the matching nodes:

import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XmlDomTest {
    public static void main(String[] args) throws Exception {
        File file = new File("/path/to/your/file");
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(file);
        Set<String> filteredNames = new HashSet<String>(Arrays.asList("NE", "UNIT"));
        NodeList list = collectNodes(doc, filteredNames);
        for (int i = 0; i < list.getLength(); i++)
            System.out.println(list.item(i).getNodeName());
    }

    // Collects the matching nodes in document order and exposes them as a NodeList.
    private static NodeList collectNodes(Document doc, Set<String> filteredNames) {
        final List<Node> found = new ArrayList<Node>();
        collectNodes(doc, filteredNames, found);
        // Read-only NodeList view over the list. Appending the nodes to a new
        // element instead would move them out of the document while its live
        // child lists are still being iterated.
        return new NodeList() {
            public Node item(int index) {
                return index >= 0 && index < found.size() ? found.get(index) : null;
            }

            public int getLength() {
                return found.size();
            }
        };
    }

    // Depth-first walk: records every node whose name is in filteredNames.
    private static void collectNodes(Node node, Set<String> filteredNames, List<Node> found) {
        NodeList chn = node.getChildNodes();
        for (int i = 0; i < chn.getLength(); i++) {
            Node child = chn.item(i);
            if (filteredNames.contains(child.getNodeName()))
                found.add(child);
            collectNodes(child, filteredNames, found);
        }
    }
}
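
As an alternative sketch of the same manual filtering, the standard `getElementsByTagName("*")` call returns every element in document order, so the list can be filtered by tag name without touching the tree. The class name `TagFilterSketch`, the helpers `selectByTags` and `parse`, and the inline `SAMPLE_XML` string (a shortened copy of the XML in the question) are assumptions made for illustration, and the result is a `List<Element>` rather than a `NodeList`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TagFilterSketch {
    // Shortened copy of the XML from the question (illustrative only).
    public static final String SAMPLE_XML =
        "<HWData>"
        + "<NE MOID=\"WBTS-42\" NEType=\"WBTS\"><EQHO MOID=\"EQHO-1-0\">"
        + "<UNIT MOID=\"UNIT-FAN-1\" State=\"enabled\"/>"
        + "<UNIT MOID=\"UNIT-FAN-3\" State=\"enabled\"/>"
        + "</EQHO></NE>"
        + "<NE MOID=\"RNC-40\" NEType=\"RNC\"><EQHO MOID=\"EQHO-3-0\">"
        + "<UNIT MOID=\"UNIT-FAN-5\" State=\"disabled\"/>"
        + "<UNIT MOID=\"UNIT-FAN-6\" State=\"disabled\"/>"
        + "</EQHO></NE>"
        + "</HWData>";

    // Hypothetical helper: returns all elements whose tag name is in tagNames.
    public static List<Element> selectByTags(Document doc, Set<String> tagNames) {
        List<Element> result = new ArrayList<>();
        // "*" matches every element; the returned list is in document order.
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (tagNames.contains(e.getTagName())) {
                result.add(e);
            }
        }
        return result;
    }

    // Hypothetical helper: parses an XML string into a DOM Document.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE_XML);
        Set<String> names = new HashSet<>(Arrays.asList("NE", "UNIT"));
        for (Element e : selectByTags(doc, names)) {
            System.out.println(e.getTagName() + " " + e.getAttribute("MOID"));
        }
    }
}
```

Because nothing is moved or copied, the returned elements still belong to the original document, so `getParentNode()` on a `UNIT` still leads back to its `EQHO`.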

Answer 2

Try this:

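import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public static List<String> MOIDList(File file) throws SAXException, IOException, ParserConfigurationException, XPathExpressionException {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(file);

    // The union operator "|" selects every NE element and every UNIT element.
    XPath xPath = XPathFactory.newInstance().newXPath();
    XPathExpression exp = xPath.compile("//NE | //UNIT");
    NodeList nl = (NodeList) exp.evaluate(doc, XPathConstants.NODESET);

    List<String> moidList = new ArrayList<>();
    for (int i = 0; i < nl.getLength(); i++) {
        moidList.add(((Element) nl.item(i)).getAttribute("MOID"));
    }
    return moidList;
}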

Answer 3

The XPath that selects only the MOIDs is //NE/@MOID | //UNIT/@MOID.
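
With nothing but the JDK's `javax.xml.xpath` API, that expression could be evaluated roughly as follows. This is a minimal sketch: the class name `MoidXPathSketch`, the `selectMoids` helper, and the inline `SAMPLE_XML` string (a shortened copy of the XML in the question) are assumptions made for illustration. Each node in the result is an attribute node, so its value is read with `getNodeValue()`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MoidXPathSketch {
    // Shortened copy of the XML from the question (illustrative only).
    public static final String SAMPLE_XML =
        "<HWData>"
        + "<NE MOID=\"WBTS-42\" NEType=\"WBTS\"><EQHO MOID=\"EQHO-1-0\">"
        + "<UNIT MOID=\"UNIT-FAN-1\" State=\"enabled\"/>"
        + "<UNIT MOID=\"UNIT-FAN-3\" State=\"enabled\"/>"
        + "</EQHO></NE>"
        + "<NE MOID=\"RNC-40\" NEType=\"RNC\"><EQHO MOID=\"EQHO-3-0\">"
        + "<UNIT MOID=\"UNIT-FAN-5\" State=\"disabled\"/>"
        + "<UNIT MOID=\"UNIT-FAN-6\" State=\"disabled\"/>"
        + "</EQHO></NE>"
        + "</HWData>";

    // Hypothetical helper: returns the MOID values of all NE and UNIT elements.
    public static List<String> selectMoids(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        // The expression selects attribute nodes, not elements.
        NodeList attrs = (NodeList) xPath.evaluate(
            "//NE/@MOID | //UNIT/@MOID", doc, XPathConstants.NODESET);

        List<String> moids = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            moids.add(attrs.item(i).getNodeValue());
        }
        return moids;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(selectMoids(SAMPLE_XML));
    }
}
```

Note that the EQHO elements also carry a MOID attribute; they are left out because the expression names only NE and UNIT.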

You should take a look at my open-source XML parser library, unXml. It's available on Maven Central.

With it, you can do the following:

import com.nerdforge.unxml.Parsing;
import com.nerdforge.unxml.factory.ParsingFactory;
import org.w3c.dom.Document;
import java.util.List;

public class Parser {
    public List<String> parseXml(String xml){
        Parsing parsing = ParsingFactory.getInstance().create();
        Document document = parsing.xml().document(xml);

        List<String> result = parsing
            .arr("//NE/@MOID | //UNIT/@MOID", parsing.text())
            .as(String.class)
            .apply(document);
        return result;
    }
}

parseXml will return:

[WBTS-42, UNIT-FAN-1, UNIT-FAN-3, RNC-40, UNIT-FAN-5, UNIT-FAN-6]

If needed, you can also build more complex nested data structures. If you would like an example of how to do that, leave me a comment here.

java

xml

dom

domparser