PDF/A conforming metadata with FOP

I can't get FOP 2.1 to produce a PDF whose metadata conforms to PDF/A-1a (or even PDF/A-1b, according to pdfbox preflight).

Suppose I want to set the date, language, title and description:

<fo:declarations xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/" 
  xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"  xml:lang="de">
    <x:xmpmeta xmlns:x="adobe:ns:meta/" id="hc_meta">
        <rdf:RDF>
            <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
                <xmp:CreatorTool>hx</xmp:CreatorTool>
                <dc:language>
                    <rdf:Bag>
                        <rdf:li>de</rdf:li>
                    </rdf:Bag>
                </dc:language>
                <dc:title>
                    <rdf:Alt>
                        <rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG</rdf:li>
                    </rdf:Alt>
                </dc:title>
                <dc:creator>
                    <rdf:Seq>
                        <rdf:li>hxxxdingens Consulting GmbH, Rodger Moore</rdf:li>
                    </rdf:Seq>
                </dc:creator>
                <dc:description>
                    <rdf:Alt>
                        <rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)</rdf:li>
                    </rdf:Alt>
                </dc:description>
                <dc:date>
                    <rdf:Seq>
                        <rdf:li>2016:06:30</rdf:li>
                    </rdf:Seq>
                </dc:date>
            </rdf:Description>
        </rdf:RDF>
    </x:xmpmeta>
</fo:declarations>

Then the output is not conformant:

$ java -jar ~/prog/hcbriefe/preflight-app-2.0.2.jar test_1.pdf
The file test_1.pdf is not valid, error(s) :
7.2 : Error on MetaData, Title present in the document catalog dictionary can't be found in XMP information (Property is not defined)
7.2 : Error on MetaData, Subject present in the document catalog dictionary can't be found in XMP information (Subject not found in XMP (dc:description["x-default"] not found))

But when I run exiftool on the PDF to set the title and description, it passes the test:

$ cp test_1.pdf test_1mod.pdf
$ exiftool -title="Schrieb 2016-003 - Dings AG" \
  -description="Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)" \
   test_1mod.pdf
    1 image files updated

$ java -jar ~/prog/hcbriefe/preflight-app-2.0.2.jar test_1mod.pdf
The file test_1mod.pdf is a valid PDF/A-1b file

What do I have to put into the fo metadata so that the output is conformant straight out of FOP?

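After some comparing, I found it. The language in the description and title elements must not be set to de; it has to be set to x-default (which is what the preflight message about dc:description["x-default"] hints at), like this: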

      ...
      <dc:title>
          <rdf:Alt>
              <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG</rdf:li>
          </rdf:Alt>
      </dc:title>
      ...
      <dc:description>
          <rdf:Alt>
              <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)</rdf:li>
          </rdf:Alt>
      </dc:description>
      <dc:date>
<!-- some validators will complain if date has : instead of - !! -->
          <rdf:Seq>
              <rdf:li>2016-06-30</rdf:li>
          </rdf:Seq>
      </dc:date>
      ...

With that change, the file passes the pdfbox preflight test.
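To catch this before running preflight, the XMP packet can be checked for the x-default entries. The following is only a minimal sketch, assuming the packet is available as a standalone XML string with its namespaces declared; the function name missing_x_default and the SAMPLE packet are made up for illustration and are not part of FOP or pdfbox.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XPath-like queries below.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def missing_x_default(xmp_text, properties=("title", "description")):
    """Return the dc: properties that have no rdf:li with xml:lang="x-default"."""
    root = ET.fromstring(xmp_text)
    missing = []
    for name in properties:
        items = root.findall(f".//dc:{name}/rdf:Alt/rdf:li", NS)
        langs = {li.get(XML_LANG) for li in items}
        if "x-default" not in langs:
            missing.append(name)
    return missing


# Hypothetical packet: the title still uses "de", the description is fixed.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:RDF>
    <rdf:Description rdf:about="">
      <dc:title><rdf:Alt>
        <rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG</rdf:li>
      </rdf:Alt></dc:title>
      <dc:description><rdf:Alt>
        <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG</rdf:li>
      </rdf:Alt></dc:description>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(missing_x_default(SAMPLE))  # prints ['title']
```

An empty list means both properties carry an x-default entry. The check says nothing about the rest of the PDF/A requirements, so a validator run is still needed.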

Also, the date must use - as the separator between year, month and day to satisfy the online validator at pdf-tools.com.
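XMP dates follow an ISO 8601 style layout (YYYY-MM-DD), so a colon-separated value can be rewritten before it goes into the fo metadata. This is a small sketch under that assumption; the helper name to_xmp_date is hypothetical, and it only handles plain dates without a time part.

```python
import re


def to_xmp_date(value):
    """Rewrite 'YYYY:MM:DD' as 'YYYY-MM-DD'; return any other string unchanged."""
    match = re.fullmatch(r"(\d{4}):(\d{2}):(\d{2})", value)
    if match is None:
        return value
    return "-".join(match.groups())


print(to_xmp_date("2016:06:30"))  # prints 2016-06-30
```

Values that already use hyphens pass through untouched, so the helper is safe to apply to every date before the FO file is generated.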