PDF/A XMP data is not picked up

I'm trying to create PDF/A files for invoices, so I'm using the gofpdf library to set XMP headers for my files. Setting the headers seems to work fine, but the XMP data is not recognised by any of my validators, such as exiftool or a validation website. You can find a reproducible example here. I'm using the PDF library like this:

    // "type" is a reserved word in Go, so the document type is passed as docType.
    pdf, customerNumber, err := GeneratePDF(docType, id, user, nil)
    if err != nil {
        return err
    }

    // pdfVersion is an unexported field, so it is overwritten via reflect + unsafe.
    pointerVal := reflect.ValueOf(pdf.Fpdf)
    val := reflect.Indirect(pointerVal)

    member := val.FieldByName("pdfVersion")
    ptrToY := unsafe.Pointer(member.UnsafeAddr())
    realPtrToY := (*string)(ptrToY)
    *realPtrToY = "1.4"

    // Attach the XMP packet to the document.
    pdf.SetXmpMetadata(XMP_HEADER)

    err = s.SavePDFAndRespondWith(docType, id, customerNumber, user, pdf)
    if err != nil {
        return err
    }

The XMP content looks like this and was extracted from a working example file. That example file was not generated with Go and gofpdf.

    var XMP_HEADER = []byte(`
    <?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
    <x:xmpmeta xmlns:x="adobe:ns:meta/">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/"><dc:title><rdf:Alt><rdf:li xml:lang="x-default" ></rdf:li></rdf:Alt></dc:title><dc:creator><rdf:Seq><rdf:li></rdf:li></rdf:Seq></dc:creator><dc:subject><rdf:Bag><rdf:li></rdf:li></rdf:Bag></dc:subject><dc:format>application/pdf</dc:format><dc:description><rdf:Alt><rdf:li xml:lang="x-default" ></rdf:li></rdf:Alt></dc:description></rdf:Description>
    <rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/"><pdf:Producer>iTextSharp 4.1.0 (based on iText 2.1.0)</pdf:Producer><pdf:Keywords></pdf:Keywords></rdf:Description>
    <rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/"><xmp:ModifyDate>2020-03-13T08:44:31+01:00</xmp:ModifyDate><xmp:CreatorTool>Symtrax - Compleo Suite</xmp:CreatorTool><xmp:CreateDate>2020-03-13T08:44:31+01:00</xmp:CreateDate></rdf:Description>
    <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"><pdfaid:part>3</pdfaid:part><pdfaid:conformance>A</pdfaid:conformance></rdf:Description>
    <rdf:Description xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#" xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#" rdf:about=""><pdfaExtension:schemas><rdf:Bag><rdf:li rdf:parseType="Resource"><pdfaSchema:schema>Factur-X PDFA Extension Schema</pdfaSchema:schema><pdfaSchema:namespaceURI>urn:factur-x:pdfa:CrossIndustryDocument:invoice:1p0#</pdfaSchema:namespaceURI><pdfaSchema:prefix>fx</pdfaSchema:prefix><pdfaSchema:property><rdf:Seq><rdf:li rdf:parseType="Resource"><pdfaProperty:name>DocumentFileName</pdfaProperty:name><pdfaProperty:valueType>Text</pdfaProperty:valueType><pdfaProperty:category>external</pdfaProperty:category><pdfaProperty:description>name of the embedded XML invoice file</pdfaProperty:description></rdf:li><rdf:li rdf:parseType="Resource"><pdfaProperty:name>DocumentType</pdfaProperty:name><pdfaProperty:valueType>Text</pdfaProperty:valueType><pdfaProperty:category>external</pdfaProperty:category><pdfaProperty:description>INVOICE</pdfaProperty:description></rdf:li><rdf:li rdf:parseType="Resource"><pdfaProperty:name>Version</pdfaProperty:name><pdfaProperty:valueType>Text</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category><pdfaProperty:description>The actual version of the Factur-X XML schema</pdfaProperty:description></rdf:li><rdf:li rdf:parseType="Resource"><pdfaProperty:name>ConformanceLevel</pdfaProperty:name><pdfaProperty:valueType>Text</pdfaProperty:valueType><pdfaProperty:category>external</pdfaProperty:category><pdfaProperty:description>The conformance level of the embedded Factur-X data</pdfaProperty:description></rdf:li></rdf:Seq></pdfaSchema:property></rdf:li></rdf:Bag></pdfaExtension:schemas></rdf:Description><rdf:Description xmlns:fx="urn:factur-x:pdfa:CrossIndustryDocument:invoice:1p0#" rdf:about="" fx:ConformanceLevel="EN 16931" fx:DocumentFileName="factur-x.xml" fx:DocumentType="INVOICE" fx:Version="1.0"/>
    </rdf:RDF></x:xmpmeta>
    <?xpacket end="w"?>`)

When opening the resulting file (example), you can see that the XMP data is embedded:

    << /Type /Metadata /Subtype /XML /Length 3286 >>
    stream
      <<< ... RDF ...  >>>
    endstream
    endobj
    6 0 obj
    <<
    /Producer (FPDF 1.7)
    /CreationDate (D:20200615175638)
    >>
    endobj
    7 0 obj

This XMP doesn't seem to be picked up by any validator or by Adobe.

Any help is appreciated.

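You can see the data, but as far as the PDF is concerned it is just junk in the file that is never used. Valid XMP metadata needs to be declared in the PDF structure, specifically in the catalog object. Your catalog object looks like this: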

    7 0 obj
    <<
    /Type /Catalog
    /Pages 1 0 R
    >>
    endobj

In a healthy PDF file it looks like this:

    5 0 obj
    <<
    /Metadata 2 0 R
    /Pages 1 0 R
    /Type /Catalog
    >>
    endobj

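Indentation and object numbers are of course not important. What matters is that the catalog contains a key named "Metadata" that points to your XMP stream. See paragraph 7.7.2 in my version of the PDF specification.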
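
If you want to check for this programmatically, here is a rough sketch in Go. It assumes the catalog is written as a plain, uncompressed object (as in the dumps above) and not inside an object stream. The name `catalogHasMetadata` is made up for this illustration, and a regular-expression scan is only a quick diagnostic, not a real PDF parser.

```go
package main

import (
	"fmt"
	"regexp"
)

// catalogRe matches a flat dictionary that contains "/Type /Catalog".
// [^<>]* keeps the match inside one dictionary without nested "<<".
var catalogRe = regexp.MustCompile(`<<[^<>]*/Type\s*/Catalog[^<>]*>>`)

// metadataRe matches an indirect reference such as "/Metadata 2 0 R".
var metadataRe = regexp.MustCompile(`/Metadata\s+\d+\s+\d+\s+R`)

// catalogHasMetadata (hypothetical helper) reports whether the first
// catalog dictionary found in pdf carries a /Metadata reference.
func catalogHasMetadata(pdf []byte) bool {
	catalog := catalogRe.Find(pdf)
	if catalog == nil {
		return false
	}
	return metadataRe.Match(catalog)
}

func main() {
	broken := []byte("7 0 obj\n<<\n/Type /Catalog\n/Pages 1 0 R\n>>\nendobj\n")
	healthy := []byte("5 0 obj\n<<\n/Metadata 2 0 R\n/Pages 1 0 R\n/Type /Catalog\n>>\nendobj\n")

	fmt.Println(catalogHasMetadata(broken))  // false
	fmt.Println(catalogHasMetadata(healthy)) // true
}
```

Running it against the bytes of your generated file should tell you quickly whether the library wrote the reference.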

So you need to find out how to achieve this with the library you have.

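PS: By the way, it's funny that an XMP scanner application, which is designed to be file format agnostic (at least that was the original idea), would pick up your XMP, because it simply scans the file for the XMP signature :)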