PDF/A-conforming metadata with FOP
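I cannot get FOP 2.1 to produce a PDF whose metadata conforms to PDF/A-1a (or even PDF/A-1b, according to pdfbox preflight).
Suppose I want to set the date, language, title and description: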
<fo:declarations xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xml:lang="de">
<x:xmpmeta xmlns:x="adobe:ns:meta/" id="hc_meta">
<rdf:RDF>
<rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
<xmp:CreatorTool>hx</xmp:CreatorTool>
<dc:language>
<rdf:Bag>
<rdf:li>de</rdf:li>
</rdf:Bag>
</dc:language>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG</rdf:li>
</rdf:Alt>
</dc:title>
<dc:creator>
<rdf:Seq>
<rdf:li>hxxxdingens Consulting GmbH, Rodger Moore</rdf:li>
</rdf:Seq>
</dc:creator>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)</rdf:li>
</rdf:Alt>
</dc:description>
<dc:date>
<rdf:Seq>
<rdf:li>2016:06:30</rdf:li>
</rdf:Seq>
</dc:date>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
</fo:declarations>
Then the output does not conform:
$ java -jar ~/prog/hcbriefe/preflight-app-2.0.2.jar test_1.pdf
The file test_1.pdf is not valid, error(s) :
7.2 : Error on MetaData, Title present in the document catalog dictionary can't be found in XMP information (Property is not defined)
7.2 : Error on MetaData, Subject present in the document catalog dictionary can't be found in XMP information (Subject not found in XMP (dc:description["x-default"] not found))
But when I use exiftool to set the title and description on the PDF, it passes the test:
$ cp test_1.pdf test_1mod.pdf
$ exiftool -title="Schrieb 2016-003 - Dings AG" \
-description="Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)" \
test_1mod.pdf
1 image files updated
$ java -jar ~/prog/hcbriefe/preflight-app-2.0.2.jar test_1mod.pdf
The file test_1mod.pdf is a valid PDF/A-1b file
What do I have to put into the fo metadata so that the output conforms out of the box, i.e. straight out of FOP?
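After some comparing, I found it. The language of the rdf:li entries in the description and title elements must not be set to de; it has to be set to x-default, like this: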
...
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG</rdf:li>
</rdf:Alt>
</dc:title>
...
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)</rdf:li>
</rdf:Alt>
</dc:description>
<dc:date>
<!-- some validators will complain if date has : instead of - !! -->
<rdf:Seq>
<rdf:li>2016-06-30</rdf:li>
</rdf:Seq>
</dc:date>
...
Then it will pass the pdfbox preflight test.
In addition, the date must use - as the separator between y, m and d to satisfy the online pdf-tools.com validator.
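To catch both problems before running FOP, the XMP packet can be checked with a few lines of Python. This is only a minimal sketch: `check_xmp` and `SAMPLE` are hypothetical names, the date pattern is a simplified assumption (year, optionally followed by dash-separated month and day), and it does not reproduce what pdfbox preflight or pdf-tools.com actually validate.

```python
import re
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Simplified assumption: YYYY, YYYY-MM, YYYY-MM-DD, optionally with a time part.
DATE_RE = re.compile(r"^\d{4}(-\d{2}(-\d{2}(T.*)?)?)?$")


def check_xmp(xmp_text):
    """Return a list of problems found in an XMP packet given as a string."""
    problems = []
    root = ET.fromstring(xmp_text)
    # dc:title and dc:description are language alternatives (rdf:Alt);
    # the preflight error above asks for an x-default entry.
    for prop in ("title", "description"):
        items = root.findall(f".//dc:{prop}/rdf:Alt/rdf:li", NS)
        langs = [li.get(XML_LANG) for li in items]
        if "x-default" not in langs:
            problems.append(f"dc:{prop} has no x-default entry (found: {langs})")
    # dc:date entries should use '-' rather than ':' between y, m and d.
    for li in root.findall(".//dc:date/rdf:Seq/rdf:li", NS):
        value = (li.text or "").strip()
        if not DATE_RE.match(value):
            problems.append(f"dc:date value {value!r} is not dash-separated")
    return problems


# Hypothetical sample: title has only a "de" entry, date uses colons.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
 <rdf:RDF><rdf:Description rdf:about="">
  <dc:title><rdf:Alt>
   <rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG</rdf:li>
  </rdf:Alt></dc:title>
  <dc:description><rdf:Alt>
   <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG</rdf:li>
  </rdf:Alt></dc:description>
  <dc:date><rdf:Seq><rdf:li>2016:06:30</rdf:li></rdf:Seq></dc:date>
 </rdf:Description></rdf:RDF>
</x:xmpmeta>"""

if __name__ == "__main__":
    for problem in check_xmp(SAMPLE):
        print(problem)
```

For the sample, the check flags dc:title and dc:date but not dc:description. An empty result only means these two conditions hold; the PDF should still be run through preflight.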