ValidationException during PDF generation from XML using libxslt

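I am using the DocBook XSL stylesheets 1.78 and xsltproc (libxslt 1.1.26 and libxml 2.7.8) on the command line to generate an fo file from an XML file. My goal is to produce a PDF with the Apache Formatting Objects Processor (fop; version 1.1). My XML input file: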

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE book SYSTEM "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<book lang="de" id="MyBook">
    <chapter id="Introduction">
        <title>Introduction</title>
        <section id="sec_intro_1">
            <title>Test</title>
            <para>para1_sec_intro_1 (see also glossary: <xref linkend="gloss_etm-datei"/>).</para>
            <para>para2_sec_intro_1</para>
        </section>
        <section id="sec_intro_2">
            <title>Another Test</title>
            <para>para1_sec_intro_2 (glossary: <xref linkend="gloss_etm-datei"/>).</para> 
            <para>para2_sec_intro2</para>       
        </section>
    </chapter>
    <xi:include href="glossar_test.xml" xmlns:xi="http://www.w3.org/2001/XInclude"></xi:include>
</book>

If I run the following command

xsltproc -o ./test.fo --xinclude --stringparam paper.type A4 --stringparam fop1.extensions 1 ./docbook/fo/docbook.xsl ./test.xml 2> fo_out.txt

the fo file is generated, but it contains fo:wrapper elements whose IDs are not unique. This is the relevant part of the generated fo file:

...
(see also glossary: <fo:basic-link internal-destination="gloss_etm-datei"><fo:inline>
            <fo:wrapper id="idp8751564240"><!--ETM-Datei--></fo:wrapper>
            <fo:inline font-weight="bold">ETM-Datei</fo:inline>
        </fo:inline></fo:basic-link>)
....
(glossary: <fo:basic-link internal-destination="gloss_etm-datei"><fo:inline>
            <fo:wrapper id="idp8751564240"><!--ETM-Datei--></fo:wrapper>
            <fo:inline font-weight="bold">ETM-Datei</fo:inline>
        </fo:inline></fo:basic-link>).
...

Now, if I try to generate a PDF from this fo file, fop throws an exception:

SEVERE: Exception
org.apache.fop.apps.FOPException: org.apache.fop.fo.ValidationException: Property ID "idp8751564240" (found on "fo:wrapper") previously used; ID values must be unique within a document! (See position 6:48)
javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException: Property ID "idp8751564240" (found on "fo:wrapper") previously used; ID values must be unique within a document! (See position 6:48)
    at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:303)
    at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:130)
    at org.apache.fop.cli.Main.startFOP(Main.java:177)
    at org.apache.fop.cli.Main.main(Main.java:208)
Caused by: javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException: Property ID "idp8751564240" (found on "fo:wrapper") previously used; ID values must be unique within a document! (See position 6:48)
    at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:501)
    at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:300)
    ... 3 more
Caused by: org.apache.fop.fo.ValidationException: Property ID "idp8751564240" (found on "fo:wrapper") previously used; ID values must be unique within a document! (See position 6:48)
    at org.apache.fop.events.ValidationExceptionFactory.createException(ValidationExceptionFactory.java:38)
    at org.apache.fop.events.EventExceptionManager.throwException(EventExceptionManager.java:58)
    at org.apache.fop.events.DefaultEventBroadcaster.invoke(DefaultEventBroadcaster.java:175)
    at com.sun.proxy.$Proxy2.idNotUnique(Unknown Source)
    at org.apache.fop.fo.FObj.checkId(FObj.java:173)
    at org.apache.fop.fo.FObj.startOfNode(FObj.java:154)
    at org.apache.fop.fo.flow.Wrapper.startOfNode(Wrapper.java:65)
    at org.apache.fop.fo.FOTreeBuilder$MainFOHandler.startElement(FOTreeBuilder.java:325)
    at org.apache.fop.fo.FOTreeBuilder.startElement(FOTreeBuilder.java:175)
    at org.apache.xalan.transformer.TransformerIdentityImpl.startElement(TransformerIdentityImpl.java:1072)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    at org.apache.xerces.xinclude.XIncludeHandler.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
    ... 4 more

Am I doing something wrong? Thanks in advance for your help!

EDIT: This is glossar_test.xml:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE glossary PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
        "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">

<glossary id="glossar">
    <title>Glossar</title>
    <glossdiv id="gloss_E">
        <title>E</title>
        <glossentry id="gloss_etm-datei">
            <glossterm id="glossterm_etm_datei">
                <indexterm>
                    <primary>ETM-Datei</primary>
                </indexterm>
                <emphasis role="bold">ETM-Datei</emphasis>
            </glossterm>
            <glossdef>
                <para>
                    Glossary_Text
                </para>
            </glossdef>
        </glossentry>

    </glossdiv>
</glossary>

I found a solution! http://www.sagehill.net/docbookxsl/LinkToGlossary.html describes how to link to a glossary. Normally you would use a glossterm tag, for example:

<para>Set your <glossterm linkend="NetAddr">network address</glossterm>.
</para>
...
<glossary>
  <glossentry id="NetAddr">
    <glossterm>Network address</glossterm>
    <glossdef><para>Four numbers separated by periods</para></glossdef>
  </glossentry>
</glossary>

You can also use the link or xref tags. If you use the xref tag, you must add an id attribute to the glossterm and a matching endterm attribute to the xref element. For example:

<xref linkend="ge-xslfoprocessor" endterm="gt-xslfoprocessor"/>
...

<glossentry id="ge-xslfoprocessor">
<glossterm id="gt-xslfoprocessor">XSL-FO processor</glossterm>
<glossdef>
<para>Software component that converts an XSL-FO document into a
formatted document.</para>
</glossdef>
</glossentry>

If you do this, the ID on the fo:wrapper is dropped and fop can turn the fo file into a PDF.
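Applied to the files from the question, the change might look like the sketch below. It assumes the glossterm keeps the id glossterm_etm_datei it already has in glossar_test.xml, so only the two xref elements in test.xml need the extra endterm attribute. I have not checked this against other stylesheet or fop versions.

```xml
<!-- test.xml (fragment): linkend still names the glossentry,
     endterm names the glossterm whose content supplies the link text -->
<section id="sec_intro_1">
    <title>Test</title>
    <para>para1_sec_intro_1 (see also glossary: <xref linkend="gloss_etm-datei" endterm="glossterm_etm_datei"/>).</para>
    <para>para2_sec_intro_1</para>
</section>
<section id="sec_intro_2">
    <title>Another Test</title>
    <para>para1_sec_intro_2 (glossary: <xref linkend="gloss_etm-datei" endterm="glossterm_etm_datei"/>).</para>
    <para>para2_sec_intro2</para>
</section>
```

After regenerating test.fo with the same xsltproc command, the two fo:basic-link elements should no longer contain an fo:wrapper carrying the same id.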