Reading and writing XML metadata in a PDF/A document
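I need to read and write the XMP metadata of a PDF/A file.
I am using iTextSharp 7 and have tried several approaches without success. Fields such as control:Anzahl_Zeichen_Titel are what I am after.
The following code should do the job, but I don't know exactly how to use it.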
PdfADocument pdfADocument = new PdfADocument(new PdfReader(Vorlage), new PdfWriter(Ausgabe), new StampingProperties());
XMPMeta xmpMeta = XMPMetaFactory.ParseFromBuffer(pdfADocument.GetXmpMetadata());
XMPProperty test1 = xmpMeta.GetProperty("ftx:ControlData", "control:Anzahl_Zeichen_Vorname");
XMPProperty test2 = xmpMeta.GetProperty("http://www.aiim.org/pdfa/ns/schema#", "ControlData");
With the test1 variant I get an XMPException: "Unregistered schema namespace URI".
The second one seems to work, but the test2 variable is null.
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.6-c015 84.159810, 2016/09/10-02:41:30 ">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
xmlns:stEvt="http://ns.adobe.com/xap/1.0/sType/ResourceEvent#"
xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"
xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#"
xmlns:pdfaType="http://www.aiim.org/pdfa/ns/type#"
xmlns:pdfaField="http://www.aiim.org/pdfa/ns/field#"
xmlns:ftx="http://ns.ftx.com/forms/1.0/"
xmlns:control="http://ns.ftx.com/forms/1.0/controldata/">
<xmp:CreatorTool>QuarkXPress(R) 8.12</xmp:CreatorTool>
<xmp:CreateDate>2017-03-14T08:56:49+01:00</xmp:CreateDate>
<xmp:ModifyDate>2017-04-11T14:35:21+02:00</xmp:ModifyDate>
<xmp:MetadataDate>2017-04-11T14:35:21+02:00</xmp:MetadataDate>
<dc:format>application/pdf</dc:format>
<!-- snip -->
<ftx:ControlData rdf:parseType="Resource">
<control:Anzahl_Zeichen_Titel>0</control:Anzahl_Zeichen_Titel>
<control:Anzahl_Zeichen_Vorname>0</control:Anzahl_Zeichen_Vorname>
<control:Anzahl_Zeichen_Namenszusatz>0</control:Anzahl_Zeichen_Namenszusatz>
<control:Anzahl_Zeichen_Hausnummer>0</control:Anzahl_Zeichen_Hausnummer>
<control:Anzahl_Zeichen_Postleitzahl>0</control:Anzahl_Zeichen_Postleitzahl>
<control:Anzahl_Zeichen_Wohnsitzlaendercode>0</control:Anzahl_Zeichen_Wohnsitzlaendercode>
<control:Auftragsnummer_Einsender>0</control:Auftragsnummer_Einsender>
<control:Formularnummer>10</control:Formularnummer>
<control:Formularversion>07.2017</control:Formularversion>
</ftx:ControlData>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
How do I have to use these methods to create and read valid data?
XMPMeta.getProperty is documented as:
/**
* The property value getter-methods all take a property specification: the first two parameters
* are always the top level namespace URI (the "schema" namespace) and the basic name
* of the property being referenced. See the introductory discussion of path expression usage
* for more information.
* <p>
* All of the functions return an object inherited from <code>PropertyBase</code> or
* <code>null</code> if the property does not exists. The result object contains the value of
* the property and option flags describing the property. Arrays and the non-leaf levels of
* nodes do not have values.
* <p>
* See {@link PropertyOptions} for detailed information about the options.
* <p>
* This is the simplest property getter, mainly for top level simple properties or after using
* the path composition functions in XMPPathFactory.
*
* @param schemaNS The namespace URI for the property. May be <code>null</code> or the empty
* string if the first component of the propName path contains a namespace prefix. The
* URI must be for a registered namespace.
* @param propName The name of the property. May be a general path expression, must not be
* <code>null</code> or the empty string. Using a namespace prefix on the first
* component is optional. If present without a schemaNS value then the prefix specifies
* the namespace. The prefix must be for a registered namespace. If both a schemaNS URI
* and propName prefix are present, they must be corresponding parts of a registered
* namespace.
* @return Returns a <code>XMPProperty</code> containing the value and the options or
* <code>null</code> if the property does not exist.
* @throws XMPException Wraps all errors and exceptions that may occur.
*/
XMPProperty getProperty(String schemaNS, String propName) throws XMPException;
In particular, the first parameter must be a namespace URI, so
XMPProperty test1 = xmpMeta.GetProperty("ftx:ControlData", "control:Anzahl_Zeichen_Vorname");
is obviously wrong: "ftx:ControlData" is a prefixed property name, not a URI.
Your second attempt
XMPProperty test2 = xmpMeta.GetProperty("http://www.aiim.org/pdfa/ns/schema#", "ControlData");
correctly passes a namespace URI as the first parameter. Unfortunately, it is not the namespace URI of the property, which is http://ns.ftx.com/forms/1.0/.
Thus, you should try
XMPProperty test2 = xmpMeta.GetProperty("http://ns.ftx.com/forms/1.0/", "ControlData");
or (since schemaNS is documented as: May be null or the empty string if the first component of the propName path contains a namespace prefix)
XMPProperty test2 = xmpMeta.GetProperty(null, "ftx:ControlData");
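Both calls above return the ControlData struct node, which has no value of its own. To reach a single field inside it, GetProperty can also take a path expression, as the documentation quoted above mentions. The following is only a minimal sketch, assuming iText 7 for .NET (where GetXmpMetadata returns a byte array) and the packet from the question. The class name and the file name "input.pdf" are placeholders. It also assumes that parsing the packet registers the control prefix.

```csharp
using System;
using iText.Kernel.Pdf;
using iText.Kernel.XMP;

public static class ReadControlField
{
    // Namespace URI bound to the "ftx" prefix in the packet from the question.
    private const string NsFtx = "http://ns.ftx.com/forms/1.0/";

    public static void Main()
    {
        // "input.pdf" is a placeholder for the PDF/A file from the question.
        PdfDocument pdf = new PdfDocument(new PdfReader("input.pdf"));
        try
        {
            byte[] packet = pdf.GetXmpMetadata();
            if (packet == null)
            {
                Console.WriteLine("The document has no XMP metadata.");
                return;
            }

            XMPMeta xmpMeta = XMPMetaFactory.ParseFromBuffer(packet);

            // First argument: namespace URI of the top-level property.
            // Second argument: path from the struct to the field; the field
            // step carries its own prefix ("control").
            XMPProperty field = xmpMeta.GetProperty(
                NsFtx, "ControlData/control:Anzahl_Zeichen_Vorname");

            // GetProperty returns null when the property does not exist.
            Console.WriteLine(field == null ? "Field not found." : field.GetValue());
        }
        finally
        {
            pdf.Close();
        }
    }
}
```

If the hand-written path feels fragile, GetStructField takes the struct and field namespaces as separate arguments and should address the same node.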
With the help of mkl's answer, I figured out how to read and write the data I need.
private const string NsControlData = "http://ns.ftx.com/forms/1.0/";
private const string NsControl = "http://ns.ftx.com/forms/1.0/controldata/";
// Opens the template file as PDF/A document.
PdfADocument pdfADocument = new PdfADocument(new iText.Kernel.Pdf.PdfReader("input.pdf"), new PdfWriter("output.pdf"), new StampingProperties());
// Reading the metadata from input file.
byte[] xmpMetadata = pdfADocument.GetXmpMetadata();
// Parse the metadata
XMPMeta parser = XMPMetaParser.Parse(xmpMetadata, new ParseOptions());
// Read a value
XMPProperty anzahlZeichenTitel = parser.GetStructField(NsControlData, "ControlData", NsControl, "Anzahl_Zeichen_Titel");
// Write a value
parser.SetStructField(NsControlData, "ControlData", NsControl, "Anzahl_Zeichen_Titel", "333");
// writing new file with new metadata.
pdfADocument.SetXmpMetadata(parser);
pdfADocument.GetWriter().Flush();
pdfADocument.Close();
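To confirm that the new value actually reached the output file, the result can be reopened and the field read back. This is a sketch under the same assumptions as the code above (iText 7 for .NET, the namespaces from the question, "output.pdf" as the written file). The class name is made up for illustration.

```csharp
using System;
using iText.Kernel.Pdf;
using iText.Kernel.XMP;

public static class VerifyControlField
{
    private const string NsControlData = "http://ns.ftx.com/forms/1.0/";
    private const string NsControl = "http://ns.ftx.com/forms/1.0/controldata/";

    public static void Main()
    {
        // Open the written file read-only; no PdfWriter is needed for a check.
        PdfDocument pdf = new PdfDocument(new PdfReader("output.pdf"));
        try
        {
            byte[] packet = pdf.GetXmpMetadata();
            if (packet == null)
            {
                Console.WriteLine("The document has no XMP metadata.");
                return;
            }

            XMPMeta xmp = XMPMetaFactory.ParseFromBuffer(packet);
            XMPProperty field = xmp.GetStructField(
                NsControlData, "ControlData", NsControl, "Anzahl_Zeichen_Titel");

            // GetStructField returns null if the struct or the field is missing,
            // so guard before reading the value.
            Console.WriteLine(field == null ? "Field not found." : field.GetValue());
        }
        finally
        {
            pdf.Close();
        }
    }
}
```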