PDFBox-encrypted file opens even though the password is wrong

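When downloading the PDF file, I specified the password "123456789987654abc211234567899klm7654321". When opening it, I can drop a few characters from the end, for example

"123456789987654abc211234567899kl" - the file opens anyway! But if I use

"123456789987654abc211234567899k" - the file does not open.

Please help me understand what the problem is.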

    private static void encryptPdf(
            InputStream inputStream,
            OutputStream outputStream,
            String ownerPassword,
            String userPassword) throws Exception
    {
        // try-with-resources closes the document on every path,
        // including the early return for already encrypted input
        try (PDDocument document = PDDocument.load(inputStream))
        {
            if (document.isEncrypted())
            {
                return;
            }
            AccessPermission accessPermission = new AccessPermission();
            StandardProtectionPolicy spp =
                    new StandardProtectionPolicy(ownerPassword, userPassword, accessPermission);
            // 40-bit encryption key
            spp.setEncryptionKeyLength(40);
            document.protect(spp);

            document.save(outputStream);
        }
    }

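A 40-bit key selects one of the older PDF encryption revisions, and for those only the first 32 bytes of the password count. The first step in computing the encryption key from the password, for PDF encryption up to and including revision 4, is: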

The password string is generated from host system codepage characters (or system scripts) by first converting the string to PDFDocEncoding. If the input is Unicode, first convert to a codepage encoding, and then to PDFDocEncoding for backward compatibility. Pad or truncate the resulting password string to exactly 32 bytes. If the password string is more than 32 bytes long, use only its first 32 bytes; if it is less than 32 bytes long, pad it by appending the required number of additional bytes from the beginning of the following padding string:

<28 BF 4E 5E 4E 75 8A 41 64 00 4E 56 FF FA 01 08
2E 2E 00 B6 D0 68 3E 80 2F 0C A9 FE 64 53 69 7A>

That is, if the password string is n bytes long, append the first 32 - n bytes of the padding string to the end of the password string. If the password string is empty (zero-length), meaning there is no user password, substitute the entire padding string in its place.

(ISO 32000-2, section 7.6.4.3.2, "Algorithm 2: Computing a file encryption key in order to encrypt a document (revision 4 and earlier)")
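The pad-or-truncate step quoted above can be sketched in plain Java to show why your 32-character prefix behaves like the full password. This is only an illustration, not PDFBox's own code: the class and method names are made up, and ISO-8859-1 is used as a stand-in for PDFDocEncoding (the two agree for the ASCII characters in your password).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox
final class PasswordPadding {
    // Padding string from ISO 32000-2, section 7.6.4.3.2
    private static final byte[] PADDING = {
        (byte) 0x28, (byte) 0xBF, 0x4E, 0x5E, 0x4E, 0x75, (byte) 0x8A, 0x41,
        0x64, 0x00, 0x4E, 0x56, (byte) 0xFF, (byte) 0xFA, 0x01, 0x08,
        0x2E, 0x2E, 0x00, (byte) 0xB6, (byte) 0xD0, 0x68, 0x3E, (byte) 0x80,
        0x2F, 0x0C, (byte) 0xA9, (byte) 0xFE, 0x64, 0x53, 0x69, 0x7A
    };

    // Returns exactly 32 bytes: the password truncated to 32 bytes,
    // or followed by the first (32 - n) bytes of the padding string.
    static byte[] padOrTruncate(String password) {
        // Assumption: ISO-8859-1 stands in for PDFDocEncoding (same for ASCII)
        byte[] raw = password.getBytes(StandardCharsets.ISO_8859_1);
        byte[] result = new byte[32];
        int n = Math.min(raw.length, 32);
        System.arraycopy(raw, 0, result, 0, n);
        System.arraycopy(PADDING, 0, result, n, 32 - n);
        return result;
    }

    public static void main(String[] args) {
        byte[] full = padOrTruncate("123456789987654abc211234567899klm7654321");
        byte[] first32 = padOrTruncate("123456789987654abc211234567899kl");
        byte[] first31 = padOrTruncate("123456789987654abc211234567899k");
        System.out.println(Arrays.equals(full, first32)); // true
        System.out.println(Arrays.equals(full, first31)); // false
    }
}
```

The 40-character password and its first 32 characters produce the same 32-byte input to the key computation, while the 31-character version ends in the first padding byte instead of "l" and therefore yields a different key.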

For more modern encryption types there is also a limit, but it is usually less restrictive:

The UTF-8 password string shall be generated from Unicode input by processing the input string with the SASLprep (Internet RFC 4013) profile of stringprep (Internet RFC 3454) using the Normalize and BiDi options, and then converting to a UTF-8 representation.

Truncate the UTF-8 representation to 127 bytes if it is longer than 127 bytes.

(ISO 32000-2, section 7.6.4.3.3, "Algorithm 2.A: Retrieving the file encryption key from an encrypted document in order to decrypt it (revision 6 and later)")
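For comparison, here is a rough sketch of the 127-byte truncation used by the newer revisions. It is deliberately simplified: the SASLprep step is skipped entirely (the Java standard library has no SASLprep implementation, and for a plain ASCII password like yours it should make no difference), and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox
final class Utf8PasswordTruncation {
    // Converts to UTF-8 and keeps at most the first 127 bytes.
    // Assumption: SASLprep normalization is omitted in this sketch.
    static byte[] truncateUtf8(String password) {
        byte[] utf8 = password.getBytes(StandardCharsets.UTF_8);
        return utf8.length > 127 ? Arrays.copyOf(utf8, 127) : utf8;
    }

    public static void main(String[] args) {
        byte[] full = truncateUtf8("123456789987654abc211234567899klm7654321");
        byte[] first32 = truncateUtf8("123456789987654abc211234567899kl");
        System.out.println(Arrays.equals(full, first32)); // false
    }
}
```

Under this scheme all 40 bytes of your password take part in the key computation, so the shortened password no longer matches.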