How to set system and public ID without validating or checking DTD?

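I'm not sure whether it's me or the API, but I can't create an XML file without either an exception being thrown or the thing I'm trying to set (the DocType) ending up unset.

This is what I'm currently doing: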

StringBuilder stringBuilder = new StringBuilder();
stringBuilder.append("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>");
stringBuilder.append("<!DOCTYPE document>");

String xmlString = AnnotatedDocumentTree.toString(annotatedDocumentTree, new SimpleAnnotatedDocumentTreeXmlConverter(), stringBuilder);

DocumentBuilderFactory icFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder icBuilder;          
Document finalDocument = null;                 

StringWriter writer = new StringWriter();

try {

    icBuilder = icFactory.newDocumentBuilder(); 

    finalDocument = icBuilder.parse(new InputSource(new ByteArrayInputStream(xmlString.getBytes("UTF-8"))));                

    Transformer transformer = TransformerFactory.newInstance().newTransformer();

    DocumentType doctype = xmlDocument.getDoctype();                    

    transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctype.getSystemId());
    transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, doctype.getPublicId());
    transformer.transform(new DOMSource(finalDocument), new StreamResult(writer));

    finalDocument = icBuilder.parse(new InputSource(new ByteArrayInputStream(writer.toString().getBytes("UTF-8"))));


} catch (Exception e) {
    e.printStackTrace();
}

However, this way I get an exception. I can use the DocumentBuilderFactory and configure it like this:

icFactory.setValidating(false);
icFactory.setNamespaceAware(true);
icFactory.setFeature("http://xml.org/sax/features/namespaces", false);
icFactory.setFeature("http://xml.org/sax/features/validation", false);
icFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
icFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

But then the DocType in my finalDocument is null.

Setting my own EntityResolver doesn't work either:

builder.setEntityResolver(new EntityResolver() {
    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        if (systemId.contains(".dtd")) {
            return new InputSource(new StringReader(""));
        } else {
            return null;
        }
    }
});

because when I set doctype.getSystemId(), I really do want doctype.getSystemId() to be set.

Is there a way to set it without this headache?


Basically, I want to parse this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document>
<ds>
    ABGB <cue>: §§ 786 , 810 , 812 </cue>Die Kosten der ... 
    <cue>von</cue>
    <Relation bewertung="1">7 Ob 56/10a </Relation>= 
    <Relation bewertung="1">Zak 2010/773 , 440 </Relation>. 
</ds>

and convert it to:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ds PUBLIC "-//MBO//DTD artikel-at 1.0//DE" "http://dtd.company.de/dtd-at/artikel.dtd">
<ds>
    ABGB <cue>: §§ 786 , 810 , 812 
    </cue>Die Kosten der ... <cue>
    von 
    </cue><Relation bewertung="1">7 Ob 56/10a </Relation>= 
    <Relation bewertung="1">Zak 2010/773 , 440 </Relation>. 
</ds>

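For me, your code works if the DTD exists at the specified location (systemId); otherwise, adding an entity resolver as in the code below does the job.

I didn't have xmlDocument, so I hardcoded the values: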

    StringBuilder stringBuilder = new StringBuilder();
    stringBuilder.append("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>");
    stringBuilder.append("<!DOCTYPE document><document/>");

    String xmlString = stringBuilder.toString(); // AnnotatedDocumentTree.toString(annotatedDocumentTree, new SimpleAnnotatedDocumentTreeXmlConverter(), stringBuilder);

    DocumentBuilderFactory icFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder icBuilder;          
    Document finalDocument = null;                 

    StringWriter writer = new StringWriter();

    try {

        icBuilder = icFactory.newDocumentBuilder(); 

        finalDocument = icBuilder.parse(new InputSource(new ByteArrayInputStream(xmlString.getBytes("UTF-8"))));                

        Transformer transformer = TransformerFactory.newInstance().newTransformer();

        //DocumentType doctype = xmlDocument.getDoctype();                    

        transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "xdtd.dtd"); // doctype.getSystemId());
        transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, "xxxx"); //doctype.getPublicId());
        transformer.transform(new DOMSource(finalDocument), new StreamResult(writer));

        icBuilder.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId)
                    throws SAXException, IOException {
                if (systemId.contains(".dtd")) {
                    return new InputSource(new StringReader(""));
                } else {
                    return null;
                }
            }
        });
        finalDocument = icBuilder.parse(new InputSource(new ByteArrayInputStream(writer.toString().getBytes("UTF-8"))));

        System.out.println(finalDocument.getDoctype().getPublicId());
        System.out.println("-----------");
        System.out.println(writer.toString());

    } catch (Exception e) {
        e.printStackTrace();
    }

Output:

      xxxx
     -----------


     <?xml version="1.0" encoding="UTF-8"?>
     <!DOCTYPE document PUBLIC "xxxx" "xdtd.dtd">
     <document/>

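Alternatively, setting the features also works, with no entity resolver needed; it has to be done on the factory before the builder is created. Of those features, only http://apache.org/xml/features/nonvalidating/load-external-dtd is required.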
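A minimal sketch of that feature-only variant, assuming a Xerces-based parser such as the one bundled with the JDK (other implementations may reject the feature URI). The class name NoExternalDtdDemo and the helper parseWithoutFetchingDtd are made up for illustration; the public and system IDs are the hardcoded values from the code above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

class NoExternalDtdDemo {
    // Xerces-specific feature URI quoted in the question and in this answer.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Hypothetical helper: parses the XML without trying to fetch the external DTD.
    static Document parseWithoutFetchingDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Must be set on the factory before the builder is created.
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // xdtd.dtd does not exist anywhere; the parser should not go looking for it.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<!DOCTYPE document PUBLIC \"xxxx\" \"xdtd.dtd\">"
                + "<document/>";
        Document doc = parseWithoutFetchingDtd(xml);
        DocumentType doctype = doc.getDoctype();
        System.out.println(doctype.getPublicId());
        System.out.println(doctype.getSystemId());
    }
}
```

The DOCTYPE declaration itself is still read, so the public and system IDs remain available on the parsed document; only the fetch of the external subset is skipped.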


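Interestingly, the doctype only appears to be set once it has been read:

Before accessing docType: [screenshot]

After accessing docType: [screenshot]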


This can be controlled in Xerces with the feature http://apache.org/xml/features/dom/defer-node-expansion, which is true by default.

Try this:
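As a rough sketch of switching deferred expansion off, again assuming a Xerces-based parser (the JDK's bundled one is expected to accept this feature URI; a different implementation may throw ParserConfigurationException). EagerDomDemo and parseEagerly are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EagerDomDemo {
    // Both URIs are Xerces-specific features mentioned in this thread.
    static final String DEFER_NODE_EXPANSION =
            "http://apache.org/xml/features/dom/defer-node-expansion";
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Hypothetical helper: builds the DOM eagerly instead of expanding nodes on first access.
    static Document parseEagerly(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(LOAD_EXTERNAL_DTD, false);    // do not fetch the external DTD
        factory.setFeature(DEFER_NODE_EXPANSION, false); // expand all nodes at parse time
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseEagerly(
                "<!DOCTYPE document PUBLIC \"xxxx\" \"xdtd.dtd\"><document/>");
        System.out.println(doc.getDoctype().getPublicId());
    }
}
```

With the feature off, a debugger should show the doctype as populated right after parsing, at the cost of building the whole tree up front.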

Transformer t = TransformerFactory.newInstance().newTransformer();
Source s = new StreamSource(new StringReader(inputXML));
StringWriter sw = new StringWriter();
t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "my.system.id");
t.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, "my/public/id");
t.transform(s, new StreamResult(sw));

There's no need to go through the DOM at all.
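A self-contained version of the snippet above, as a sketch only. DoctypeStreamDemo and addDoctype are made-up names, and the IDs are the ones from the desired output in the question. The sample input deliberately carries no DOCTYPE of its own; how an existing declaration in the input interacts with these output properties is not covered here and may depend on the transformer implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DoctypeStreamDemo {
    // Hypothetical helper: copies the input through an identity transform and
    // asks the serializer to emit a DOCTYPE with the given public and system IDs.
    static String addDoctype(String inputXml, String publicId, String systemId)
            throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, publicId);
        t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, systemId);
        StringWriter out = new StringWriter();
        // Stream in, stream out: no DOM tree is built by this code.
        t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String input = "<ds>ABGB <cue>von</cue><Relation bewertung=\"1\">7 Ob 56/10a </Relation></ds>";
        System.out.println(addDoctype(
                input,
                "-//MBO//DTD artikel-at 1.0//DE",
                "http://dtd.company.de/dtd-at/artikel.dtd"));
    }
}
```

Because the IDs are only written as text by the serializer, nothing tries to resolve or validate against the DTD at this stage.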