iText digital signature corrupts PDF/A 2b

Question

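When digitally signing a document with itext v5.5.11, PDF/A-2b documents get corrupted, meaning they are no longer valid as PDF/A documents. The following rule is violated: https://github.com/veraPDF/veraPDF-validation-profiles/wiki/PDFA-Parts-2-and-3-rules#rule-643-1

The link above states that the digest is invalid, so I am also including the code segment that computes the digest when signing PDF documents with iText: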

        // Make the digest
        InputStream data;
        try {

            data = signatureAppearance.getRangeStream();
        } catch (IOException e) {
            String message = "MessageDigest error for signature input, type: IOException";
            signLogger.logError(message, e);
            throw new CustomException(message, e);
        }
        MessageDigest messageDigest;
        try {
            // NOTE: this algorithm must match the hash algorithm passed to PdfPKCS7 below
            messageDigest = MessageDigest.getInstance("SHA1");
        } catch (NoSuchAlgorithmException ex) {
            String message = "MessageDigest error for signature input, type: NoSuchAlgorithmException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        }
        byte[] buf = new byte[8192];
        int n;
        try {
            while ((n = data.read(buf)) > 0) {
                messageDigest.update(buf, 0, n);
            }
        } catch (IOException ex) {
            String message = "MessageDigest update error for signature input, type: IOException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        }
        byte[] hash = messageDigest.digest();
        // If we add a time stamp:
        // Create the signature
        PdfPKCS7 sgn;
        try {

            sgn = new PdfPKCS7(key, chain, configuration.getSignCertificate().getSignatureHashAlgorithm().value() , null, new BouncyCastleDigest(), false);
        } catch (InvalidKeyException ex) {
            String message = "Certificate PDF sign error for signature input, type: InvalidKeyException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        } catch (NoSuchProviderException ex) {
            String message = "Certificate PDF sign error for signature input, type: NoSuchProviderException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        } catch (NoSuchAlgorithmException ex) {
            String message = "Certificate PDF sign error for signature input, type: NoSuchAlgorithmException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        } catch (Exception ex) {
            String message = "Certificate PDF sign error for signature input, type: Exception";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        }
        byte[] sh = sgn.getAuthenticatedAttributeBytes(hash, null,null, MakeSignature.CryptoStandard.CMS);
        try {
            sgn.update(sh, 0, sh.length);
        } catch (java.security.SignatureException ex) {
            String message = "Certificate PDF sign error for signature input, type: SignatureException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        }
        byte[] encodedSig = sgn.getEncodedPKCS7(hash);
        // paddedSig below holds exactly contentEstimated bytes, so anything larger cannot fit
        if (contentEstimated < encodedSig.length) {
            String message = "The estimated size for the signature is smaller than the required one. Terminating request..";
            signLogger.log("ERROR", message);
            throw new CustomException(message);
        }
        byte[] paddedSig = new byte[contentEstimated];
        System.arraycopy(encodedSig, 0, paddedSig, 0, encodedSig.length);
        // Replace the contents
        PdfDictionary dic2 = new PdfDictionary();
        dic2.put(PdfName.CONTENTS, new PdfString(paddedSig).setHexWriting(true));
        try {
            signatureAppearance.close(dic2);
        } catch (IOException ex) {
            String message = "PdfSignatureAppearance close error for signature input, type: IOException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        } catch (DocumentException ex) {
            String message = "PdfSignatureAppearance close error for signature input, type: DocumentException";
            signLogger.logError(message, ex);
            throw new CustomException(message, ex);
        }

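For PDF/A validation I use the veraPDF library.

It may also help to mention that, while the veraPDF library reports the PDF/A document as broken, the Adobe Reader validation tool reports that the PDF/A document is not broken.

Any help would be appreciated.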

Answer 1

When digitally signing document with itext v5.5.11 PDF/A-2b documents get corrupted - meaning they are no longer valid as PDF/A documents. Following rule is violated: https://github.com/veraPDF/veraPDF-validation-profiles/wiki/PDFA-Parts-2-and-3-rules#rule-643-1

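While this indeed is what veraPDF claims, the claim is wrong; the signature created by iText covers the whole revision minus the space reserved for the signature container.

The cause of this incorrect violation detection is a bug in veraPDF.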
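
The coverage claim above can be checked mechanically. The following is a minimal sketch, not code from iText or veraPDF: the class name, method name, and sample numbers are hypothetical, and `fileLength` stands for the end of the signed revision (for the last signature in a document, that is the file length).

```java
class ByteRangeCoverage {
    /**
     * Returns true if byteRange = [start1, length1, start2, length2] covers
     * everything from offset 0 up to fileLength except for a single gap
     * between the two ranges (the space reserved for the signature container).
     */
    static boolean coversWholeFileExceptGap(long[] byteRange, long fileLength) {
        if (byteRange == null || byteRange.length != 4) {
            return false;
        }
        long start1 = byteRange[0], length1 = byteRange[1];
        long start2 = byteRange[2], length2 = byteRange[3];
        boolean startsAtZero = start1 == 0;
        boolean gapIsWellFormed = length1 > 0 && start1 + length1 < start2;
        boolean endsAtFileEnd = length2 > 0 && start2 + length2 == fileLength;
        return startsAtZero && gapIsWellFormed && endsAtFileEnd;
    }

    public static void main(String[] args) {
        long fileLength = 10_000;                    // hypothetical revision size
        long[] complete = {0, 4_000, 6_000, 4_000};  // second range ends at 10000
        long[] tooShort = {0, 4_000, 6_000, 3_998};  // second range ends two bytes early
        System.out.println(coversWholeFileExceptGap(complete, fileLength)); // true
        System.out.println(coversWholeFileExceptGap(tooShort, fileLength)); // false
    }
}
```

A validator that underestimates the revision end by two bytes would effectively swap these verdicts: it would treat the complete range as overshooting its nominal end.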

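How veraPDF determines whether the signed byte range is valid

Both veraPDF versions (the one based on the greenfield parser and the one based on PDFBox) try to determine the nominal byte range value and compare it to the actual value. This is how they determine the nominal value: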

public long[] getByteRangeBySignatureOffset(long signatureOffset) throws IOException {
    pdfSource.seek(signatureOffset);
    skipID();
    byteRange[0] = 0;
    parseDictionary();
    byteRange[3] = getOffsetOfNextEOF(byteRange[2]) - byteRange[2];
    return byteRange;
}

private long getOffsetOfNextEOF(long currentOffset) throws IOException {
    byte[] buffer = new byte[EOF_STRING.length];
    pdfSource.seek(currentOffset + document.getHeaderOffset());
    readWholeBuffer(pdfSource, buffer);
    pdfSource.rewind(buffer.length - 1);
    while (!Arrays.equals(buffer, EOF_STRING)) {    //TODO: does it need to be optimized?
        readWholeBuffer(pdfSource, buffer);
        if (pdfSource.isEOF()) {
            pdfSource.seek(currentOffset + document.getHeaderOffset());
            return pdfSource.length();
        }
        pdfSource.rewind(buffer.length - 1);
    }
    long result = pdfSource.getPosition() + buffer.length - 1;  // offset of byte after 'F'
    pdfSource.seek(currentOffset + document.getHeaderOffset());
    return result - 1;
}

(the PDFBox based SignatureParser class)

public long[] getByteRangeBySignatureOffset(long signatureOffset) throws IOException {
    source.seek(signatureOffset);
    skipID();
    byteRange[0] = 0;
    parseDictionary();
    byteRange[3] = getOffsetOfNextEOF(byteRange[2]) - byteRange[2];
    return byteRange;
}

private long getOffsetOfNextEOF(long currentOffset) throws IOException {
    byte[] buffer = new byte[EOF_STRING.length];
    source.seek(currentOffset + document.getHeader().getHeaderOffset());
    source.read(buffer);
    source.unread(buffer.length - 1);
    while (!Arrays.equals(buffer, EOF_STRING)) {    //TODO: does it need to be optimized?
        source.read(buffer);
        if (source.isEOF()) {
            source.seek(currentOffset + document.getHeader().getHeaderOffset());
            return source.getStreamLength();
        }
        source.unread(buffer.length - 1);
    }
    long result = source.getOffset() - 1 + buffer.length;   // byte right after 'F'
    source.seek(currentOffset + document.getHeader().getHeaderOffset());
    return result - 1;
}

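(the greenfield parser based SignatureParser)

Essentially both implementations do the same thing here: starting at the signature, they look for the next occurrence of the end-of-file marker %%EOF and try to complete the nominal byte range value so that the second range ends with that marker.

Why this is wrong

There are multiple reasons why this method of determining the nominal signed byte range value is wrong: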

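First, according to the PDF/A specification,

No data can follow the last end-of-file marker except a single optional end-of-line marker as described in ISO 32000-1:2008, 7.5.5.

Thus, the offset immediately after the next end-of-file marker %%EOF is not necessarily already the end of the signed revision; the correct offset may be the one after the following end-of-line marker. As a PDF end-of-line marker can be a single CR, a single LF, or a CRLF combination, veraPDF picks one of three possible offsets and claims it to be the nominal end of the revision and, therefore, the nominal end of the signed byte range.

Second, it is possible (even though hardly ever seen) to prepare a signature value in one revision (ending with an end-of-file marker), then append some data in an incremental update, producing a new revision (ending with another end-of-file marker), and only then fill in the signature value as a value signing the document including this new revision. As veraPDF uses the next end-of-file marker after the signature dictionary, it picks the wrong end-of-file marker in this case.

Third, the end-of-file marker %%EOF syntactically is merely a comment with a special meaning at the end of a PDF or revision, and comments are allowed nearly everywhere in a PDF outside of PDF strings, PDF stream data, and PDF cross-reference tables. Thus, the byte sequence %%EOF may occur any number of times between the signature value dictionary and the actual end of the signed revision, either as a regular comment or as non-comment content of a string or stream. In such a case, veraPDF picks as end-of-file marker a byte sequence that was never meant as the end of anything.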
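
To illustrate the case of %%EOF occurring inside a string, here is a small sketch. It assumes a made-up, PDF-like fragment (not a valid PDF) and a hypothetical helper `indexOfEofMarker` that does a plain byte search, which is roughly what a "next %%EOF" scan amounts to. It is not veraPDF's code.

```java
import java.nio.charset.StandardCharsets;

class NextEofMarkerDemo {
    static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.ISO_8859_1);

    /** Returns the offset of the first %%EOF byte sequence at or after 'from', or -1. */
    static int indexOfEofMarker(byte[] data, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - EOF_MARKER.length; i++) {
            for (int j = 0; j < EOF_MARKER.length; j++) {
                if (data[i + j] != EOF_MARKER[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical fragment: a signature dictionary, then an object whose
        // string value happens to contain the marker bytes, then the real marker.
        String revision =
                "<</Type/Sig/Contents<00>>>\n"
              + "5 0 obj\n(a string containing %%EOF)\nendobj\n"
              + "startxref\n0\n"
              + "%%EOF\n";
        byte[] data = revision.getBytes(StandardCharsets.ISO_8859_1);

        int firstMatch = indexOfEofMarker(data, 0);
        int realMarker = revision.lastIndexOf("%%EOF");
        System.out.println("first match at " + firstMatch + ", real marker at " + realMarker);
        System.out.println("naive scan found the real marker: " + (firstMatch == realMarker));
    }
}
```

The scan stops at the bytes inside the string, so any byte range derived from it ends well before the actual end of the revision.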

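Furthermore, unless the actual end of file is reached in the loop (and pdfSource.length() / source.getStreamLength() is returned), the result appears to be off by one: the - 1 in return result - 1 does not correspond to how the result is used.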
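
The off-by-one can be reproduced on a tiny in-memory example. The sketch below is a simplified re-creation of the quoted loop over a byte array, under the assumptions that the header offset is 0 and that read/unread can be modelled by moving an index; it is not the veraPDF source itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EofOffByOneDemo {
    static final byte[] EOF_STRING = "%%EOF".getBytes(StandardCharsets.ISO_8859_1);

    /**
     * Mimics the quoted search: read a 5-byte window, step back 4 bytes,
     * repeat until the window equals %%EOF. Returns data.length if no
     * marker is found, like the quoted end-of-stream branch.
     */
    static long getOffsetOfNextEOF(byte[] data, int start) {
        int pos = start;
        byte[] buffer = new byte[EOF_STRING.length];
        while (pos + buffer.length <= data.length) {
            System.arraycopy(data, pos, buffer, 0, buffer.length); // read(buffer)
            pos += buffer.length;
            pos -= buffer.length - 1;                              // unread(buffer.length - 1)
            if (Arrays.equals(buffer, EOF_STRING)) {
                long result = pos - 1 + buffer.length;             // byte right after 'F'
                return result - 1;                                 // what the quoted code returns
            }
        }
        return data.length;
    }

    public static void main(String[] args) {
        // Marker starts at offset 3; 'F' is at offset 7; a single LF follows.
        byte[] data = "abc%%EOF\n".getBytes(StandardCharsets.ISO_8859_1);
        long returned = getOffsetOfNextEOF(data, 0);
        long afterMarker = 3 + EOF_STRING.length;
        System.out.println("returned offset:        " + returned);     // 7
        System.out.println("offset after %%EOF:     " + afterMarker);  // 8
        System.out.println("file length (incl. LF): " + data.length);  // 9
    }
}
```

Because the returned value is the offset of the 'F' itself, subtracting byteRange[2] from it yields a length that excludes the 'F'. With a trailing LF that the signature legitimately covers, the nominal end is two bytes short of the actual end.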

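veraPDF versions

I checked the current 1.5.0-SNAPSHOT versions of veraPDF, which are tagged as:

veraPDF-pdfbox-validation 1.5.4
veraPDF-validation 1.5.2
veraPDF-parser 1.5.1

The OP's sample document

The sample document provided by the OP has a LF after the end-of-file marker. Due to this and the off-by-one issue mentioned above, veraPDF determines a nominal signed byte range end that is two bytes short.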

Answer 2

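I agree with the analysis of how veraPDF currently checks the ByteRange. Indeed, it assumes that the file terminates exactly at the %%EOF marker immediately following the signature field.

The reason is simple. A document can be signed sequentially by several people and still be a valid PDF/A-2B document. When the second signature is generated, it incrementally updates the file containing the first signature.

Thus, if we interpret the term file in the PDF/A-2B requirement literally: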

When computing the digest for the file, it shall be computed over the entire file, including the signature dictionary but excluding the PDF Signature itself. This range is then indicated by the ByteRange entry of the signature dictionary.

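we would never be able to create a valid PDF/A file with multiple signatures. This is clearly not the intention of the PDF/A-2 standard.

A PDF file is commonly understood as the byte range from the leading %PDF to the trailing %%EOF, which allows, for example, a PDF file to be part of a larger byte stream (e.g., a mail attachment). This is what the veraPDF implementation is based on.

I do agree, however, that this approach does not take into account the optional end-of-line sequence after %%EOF. I have created a corresponding issue for veraPDF: https://github.com/veraPDF/veraPDF-validation/issues/166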

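This leaves an interesting question: what is the valid byte range of the first signature if the document has further signatures? I believe all of the following cases:

ByteRange covers the file up to the next %%EOF marker
ByteRange covers the file up to the next %%EOF marker + a single CR character
ByteRange covers the file up to the next %%EOF marker + a single LF character
ByteRange covers the file up to the next %%EOF marker + the two-byte CR+LF sequence

should be allowed.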
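
The four cases listed above could be accepted with a check along the following lines. This is only a sketch under stated assumptions: `markerEnd` (the offset of the byte right after the 'F' of the relevant %%EOF) is assumed to be known already, and the class and method names are hypothetical rather than part of veraPDF.

```java
import java.nio.charset.StandardCharsets;

class RevisionEndCandidates {
    /**
     * Returns true if rangeEnd (start2 + length2 of the ByteRange) is the end
     * of the %%EOF marker itself, or that end plus a single CR, a single LF,
     * or a CR+LF pair actually present in the data.
     */
    static boolean isAcceptableRangeEnd(byte[] data, int markerEnd, long rangeEnd) {
        if (rangeEnd == markerEnd) {
            return true;                                   // no end-of-line marker covered
        }
        boolean cr = markerEnd < data.length && data[markerEnd] == '\r';
        boolean lf = markerEnd < data.length && data[markerEnd] == '\n';
        if ((cr || lf) && rangeEnd == markerEnd + 1) {
            return true;                                   // single CR or single LF
        }
        boolean crlf = cr && markerEnd + 1 < data.length && data[markerEnd + 1] == '\n';
        return crlf && rangeEnd == markerEnd + 2;          // CR+LF pair
    }

    public static void main(String[] args) {
        byte[] data = "%%EOF\r\n".getBytes(StandardCharsets.ISO_8859_1);
        int markerEnd = 5; // byte right after 'F'
        for (long end = 4; end <= 8; end++) {
            System.out.println(end + " -> " + isAcceptableRangeEnd(data, markerEnd, end));
        }
    }
}
```

For the sample bytes, ends 5, 6, and 7 are accepted and 4 and 8 are rejected. Locating the right %%EOF in the first place remains the harder problem, as Answer 1 points out.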

Answer 3

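Following up on the above, we have just released a veraPDF 1.4 hotfix that addresses the issues raised in this discussion. The new version is available for download: http://downloads.verapdf.org/rel/1.4/verapdf-1.4.5-installer.zip

In particular, iText-signed PDF/A-2 documents now seem to pass veraPDF validation fine.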
