Parsing child node with own namespace in DOM4j with Java

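I hope someone can help me fix my FUBAR. I have been working on a DOM4j parser for about a month, using XPath to extract 500+ data elements from an XML file. Unfortunately, I modeled my code on an old test file and only discovered the error of my ways after plugging in a production file. Here is a small sample of my code. As the HashMap shows, the full XML uses multiple namespaces. I pared the code down to extract just 3 elements.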

import java.io.File;
import java.util.HashMap;
import java.util.Map;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;
import org.dom4j.io.SAXReader;

public class CLOBTest {

   public static void main(String[] args) {
      try {
         File inputFile = new File("C:/test.xml");
         //File inputFile = new File("C:/test1.xml");
         SAXReader reader = new SAXReader();
         Document document = reader.read( inputFile );

         Map<String, String> map = new HashMap<String, String>();

         map.put("exch", "http://at.dsh.cms.gov/exchange/1.0");
         map.put("ext", "http://at.dsh.cms.gov/extension/1.0");
         map.put("hix-core", "http://hix.cms.gov/0.1/hix-core");
         map.put("hix-ee", "http://hix.cms.gov/0.1/hix-ee");
         map.put("hix-pm", "http://hix.cms.gov/0.1/hix-pm");
         map.put("nc", "http://niem.gov/niem/niem-core/2.0");
         map.put("niem-core", "http://niem.gov/niem/niem-core/2.0");
         map.put("s", "http://niem.gov/niem/structures/2.0");
         map.put("scr", "http://niem.gov/niem/domains/screening/2.1");
         map.put("xsi", "http://www.w3.org/2001/XMLSchema-instance");


         XPath Request = DocumentHelper.createXPath("//exch:AccountTransferRequest");
         Request.setNamespaceURIs(map);

         Node request =  Request.selectSingleNode(document);

         System.out.println("  ID:        \t" + request.valueOf("ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID")); 
         System.out.println("  First Name:\t" + request.valueOf("hix-core:Person/niem-core:PersonName/niem-core:PersonGivenName")); 
         System.out.println("  Last Name: \t" + request.valueOf("hix-core:Person/niem-core:PersonName/niem-core:PersonSurName")); 

      } catch (DocumentException e) {
         e.printStackTrace();
      }
   }
}

The sample XML file (test.xml) gives the correct results:

ID:         XXX012345
First Name: gina
Last Name:  davis

test.xml

 <H15>
 <requestMSG>
 <exch:AccountTransferRequest xmlns:exch="http://at.dsh.cms.gov/exchange/1.0" xmlns:hix-core="http://hix.cms.gov/0.1/hix-core" xmlns:niem-core="http://niem.gov/niem/niem-core/2.0" xmlns:s="http://niem.gov/niem/structures/2.0" xmlns:ext="http://at.dsh.cms.gov/extension/1.0" ext:atVersionText="2.3">
 <ext:TransferHeader>
 <ext:TransferActivity>
 <niem-core:ActivityIdentification xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:IdentificationID>XXX012345</niem-core:IdentificationID>
 </niem-core:ActivityIdentification>
 </ext:TransferActivity>
 </ext:TransferHeader>
 <hix-core:Person xmlns:hix-core="http://hix.cms.gov/0.1/hix-core" xmlns:s="http://niem.gov/niem/structures/2.0" s:id="Mom">
 <niem-core:PersonName xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:PersonGivenName>gina</niem-core:PersonGivenName>
 <niem-core:PersonSurName>davis</niem-core:PersonSurName>
 </niem-core:PersonName>
 </hix-core:Person>
 </exch:AccountTransferRequest>
 </requestMSG>
 </H15>

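However, if the exch:AccountTransferRequest element does not contain all of the namespace declarations, I get an unbound prefix error on the child nodes. I had assumed that the HashMap assigned to the Request XPath took care of all the prefix bindings. I realized I was wrong after trying it on a production file (test1.xml) whose exch:AccountTransferRequest element does not carry the full set of URIs.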

test1.xml

 <H15>
 <requestMSG>
 <exch:AccountTransferRequest xmlns:exch="http://at.dsh.cms.gov/exchange/1.0" xmlns:ext="http://at.dsh.cms.gov/extension/1.0" ext:atVersionText="2.3">
 <ext:TransferHeader>
 <ext:TransferActivity>
 <niem-core:ActivityIdentification xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:IdentificationID>XXX012345</niem-core:IdentificationID>
 </niem-core:ActivityIdentification>
 </ext:TransferActivity>
 </ext:TransferHeader>
 <hix-core:Person xmlns:hix-core="http://hix.cms.gov/0.1/hix-core" xmlns:s="http://niem.gov/niem/structures/2.0" s:id="Mom">
 <niem-core:PersonName xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:PersonGivenName>gina</niem-core:PersonGivenName>
 <niem-core:PersonSurName>davis</niem-core:PersonSurName>
 </niem-core:PersonName>
 </hix-core:Person>
 </exch:AccountTransferRequest>
 </requestMSG>
 </H15>

test1.xml results:

Exception in thread "main" org.dom4j.XPathException: Exception occurred evaluting XPath: ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID. Exception: XPath expression uses unbound namespace prefix niem-core
    at org.dom4j.xpath.DefaultXPath.handleJaxenException(DefaultXPath.java:374)
    at org.dom4j.xpath.DefaultXPath.valueOf(DefaultXPath.java:185)
    at org.dom4j.tree.AbstractNode.valueOf(AbstractNode.java:191)
    at CLOBTest.main(CLOBTest.java:41)

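Now, how do I extract the values of child nodes that declare their own namespaces? Is there a way to do it while still going through the request node? If possible, I would like to salvage some of my effort.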

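It looks like you need to create more XPath objects and set the namespace context on each one. The map is attached only to the Request XPath object; request.valueOf(...) builds a fresh XPath that never sees it and resolves prefixes from the namespaces in scope at the request node instead. For example: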

     XPath Request = DocumentHelper.createXPath("//exch:AccountTransferRequest");
     Request.setNamespaceURIs(map);

     Node request =  Request.selectSingleNode(document);

     XPath idRequest = DocumentHelper.createXPath("ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID");
     idRequest.setNamespaceURIs(map);

     System.out.println("  ID:        \t" + idRequest.selectSingleNode(request).getText()); 
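
With 500+ expressions, repeating those two setup lines gets tedious, so they could be wrapped in a small helper. The following is only a sketch, not a definitive implementation: the class name NsXPath, its methods, and the trimmed-down copy of test1.xml are made up for illustration, and it assumes dom4j and jaxen are on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;

public class NsXPath {

    // Prefix-to-URI bindings shared by every expression (a subset of the full map).
    static final Map<String, String> NAMESPACES = new HashMap<String, String>();
    static {
        NAMESPACES.put("exch", "http://at.dsh.cms.gov/exchange/1.0");
        NAMESPACES.put("ext", "http://at.dsh.cms.gov/extension/1.0");
        NAMESPACES.put("hix-core", "http://hix.cms.gov/0.1/hix-core");
        NAMESPACES.put("niem-core", "http://niem.gov/niem/niem-core/2.0");
    }

    // Trimmed-down stand-in for test1.xml: niem-core and hix-core are declared
    // only on descendants, not on exch:AccountTransferRequest.
    static final String SAMPLE_XML =
        "<H15><requestMSG>"
        + "<exch:AccountTransferRequest xmlns:exch=\"http://at.dsh.cms.gov/exchange/1.0\""
        + " xmlns:ext=\"http://at.dsh.cms.gov/extension/1.0\">"
        + "<ext:TransferHeader><ext:TransferActivity>"
        + "<niem-core:ActivityIdentification xmlns:niem-core=\"http://niem.gov/niem/niem-core/2.0\">"
        + "<niem-core:IdentificationID>XXX012345</niem-core:IdentificationID>"
        + "</niem-core:ActivityIdentification>"
        + "</ext:TransferActivity></ext:TransferHeader>"
        + "<hix-core:Person xmlns:hix-core=\"http://hix.cms.gov/0.1/hix-core\">"
        + "<niem-core:PersonName xmlns:niem-core=\"http://niem.gov/niem/niem-core/2.0\">"
        + "<niem-core:PersonGivenName>gina</niem-core:PersonGivenName>"
        + "<niem-core:PersonSurName>davis</niem-core:PersonSurName>"
        + "</niem-core:PersonName></hix-core:Person>"
        + "</exch:AccountTransferRequest></requestMSG></H15>";

    // Compiles the expression, binds the prefixes, and evaluates it
    // relative to the given context node.
    static String valueOf(Node context, String expression) {
        XPath xpath = DocumentHelper.createXPath(expression);
        xpath.setNamespaceURIs(NAMESPACES);
        return xpath.valueOf(context);
    }

    // Parses the XML and returns the exch:AccountTransferRequest node.
    static Node findRequest(String xml) {
        try {
            Document document = DocumentHelper.parseText(xml);
            XPath requestPath = DocumentHelper.createXPath("//exch:AccountTransferRequest");
            requestPath.setNamespaceURIs(NAMESPACES);
            return requestPath.selectSingleNode(document);
        } catch (DocumentException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Node request = findRequest(SAMPLE_XML);
        System.out.println("  ID:        \t" + valueOf(request,
            "ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID"));
        System.out.println("  First Name:\t" + valueOf(request,
            "hix-core:Person/niem-core:PersonName/niem-core:PersonGivenName"));
    }
}
```

Each existing request.valueOf("...") call would then become NsXPath.valueOf(request, "..."), leaving the expression strings untouched.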

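OK, got it. Just cast the request node to an Element and add all of the namespaces to it. Once I use that element for the output, the prefixes are recognized throughout the document.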

         Element test = (Element) request;
         test.addNamespace("exch", "http://at.dsh.cms.gov/exchange/1.0");
         test.addNamespace("ext", "http://at.dsh.cms.gov/extension/1.0");
         test.addNamespace("hix-core", "http://hix.cms.gov/0.1/hix-core");
         test.addNamespace("hix-ee", "http://hix.cms.gov/0.1/hix-ee");
         test.addNamespace("hix-pm", "http://hix.cms.gov/0.1/hix-pm");
         test.addNamespace("nc", "http://niem.gov/niem/niem-core/2.0");
         test.addNamespace("niem-core", "http://niem.gov/niem/niem-core/2.0");
         test.addNamespace("s", "http://niem.gov/niem/structures/2.0");
         test.addNamespace("scr", "http://niem.gov/niem/domains/screening/2.1");
         test.addNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");

         System.out.println("  ID:                                             \t"+test.valueOf("ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID"));
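
Since the prefixes already live in the map from the question, the repeated addNamespace calls could probably be collapsed into a loop. The sketch below is one way to do that, under a few assumptions: the class name NamespaceBinder, its methods, and the shortened XML are hypothetical, dom4j and jaxen are on the classpath, and prefixes already in scope on the element are skipped so they are not declared twice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.XPath;

public class NamespaceBinder {

    // Declares each prefix on the element unless that prefix is already in
    // scope there (an existing binding is left as it is, even if its URI differs).
    static void bindAll(Element element, Map<String, String> namespaces) {
        for (Map.Entry<String, String> entry : namespaces.entrySet()) {
            if (element.getNamespaceForPrefix(entry.getKey()) == null) {
                element.addNamespace(entry.getKey(), entry.getValue());
            }
        }
    }

    // Parses a shortened stand-in for test1.xml and returns the request
    // element with all prefixes from the map bound on it.
    static Element boundRequest() {
        Map<String, String> map = new LinkedHashMap<String, String>();
        map.put("exch", "http://at.dsh.cms.gov/exchange/1.0");
        map.put("ext", "http://at.dsh.cms.gov/extension/1.0");
        map.put("niem-core", "http://niem.gov/niem/niem-core/2.0");

        String xml =
            "<H15><requestMSG>"
            + "<exch:AccountTransferRequest xmlns:exch=\"http://at.dsh.cms.gov/exchange/1.0\""
            + " xmlns:ext=\"http://at.dsh.cms.gov/extension/1.0\">"
            + "<ext:TransferHeader><ext:TransferActivity>"
            + "<niem-core:ActivityIdentification xmlns:niem-core=\"http://niem.gov/niem/niem-core/2.0\">"
            + "<niem-core:IdentificationID>XXX012345</niem-core:IdentificationID>"
            + "</niem-core:ActivityIdentification>"
            + "</ext:TransferActivity></ext:TransferHeader>"
            + "</exch:AccountTransferRequest></requestMSG></H15>";
        try {
            Document document = DocumentHelper.parseText(xml);
            XPath requestPath = DocumentHelper.createXPath("//exch:AccountTransferRequest");
            requestPath.setNamespaceURIs(map);
            Element request = (Element) requestPath.selectSingleNode(document);
            bindAll(request, map);
            return request;
        } catch (DocumentException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Plain valueOf now resolves niem-core from the request element itself.
        System.out.println("  ID:\t" + boundRequest().valueOf(
            "ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID"));
    }
}
```

Note that addNamespace modifies the in-memory document, so if the document is later written back out, the extra declarations on the request element would be expected to appear in the output.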