将解密文件读入 ZipInputStream 有时会截断第一个文件

Question

我正在开发一个 e-reader 应用程序（使用 skyepub），它基本上将加密书籍下载到文件系统中（并且它保存的是数据库中的解密密钥），并且当用户尝试阅读它会将书加载到内存中并对其进行解密。

问题是有些书的第一章是 t运行（epub 书实际上是 zip 文件，每一章都是一个单独的文件）。这导致了这个可怕的错误：

this XML file does not appear to have any style information associated with it. The document tree is shown below

我试过的

我已经确认加密的书已正确下载，b/c 如果我将文件复制到我的桌面（从我的 root android）和运行此命令它：

openssl aes-192-cbc -d -K *** -iv *** -in test.epub.encrypted -out test.epub

它工作得很好。但是，如果我几乎尝试使用以下 android 代码

做同样的事情

public ContentData getContentData(String baseDirectory, String contentPath) {
    if( contentPath.startsWith("/fonts/")) {
        ... // handle font suff
    }

    int secondSlash = contentPath.indexOf('/', 1);
    if( secondSlash == -1) return null;

    String bookEditionID = contentPath.substring(1,secondSlash);
    String zipEntryName = contentPath.substring(secondSlash+1);

    final ContentData data = new ContentData();

    try {
        InputStream stream = dbUtil.getBookStream(bookEditionID);
        if( stream == null) return null;

        final ZipInputStream zip = new ZipInputStream(stream);

        ZipEntry entry;
        do {
            entry = zip.getNextEntry();
            Log.e("Abjjad","looping through entry: "+entry);
            if( entry == null) {
                zip.close();
                return null;
            }
        } while( !entry.getName().equals(zipEntryName));

        Log.e("debug","going through data with entry: " +entry+", contentLength: "+entry.getSize());

见方法dbUtil.getBookStream:

public InputStream getBookStream( String bookEditionId) {
    BookInfo book = getBookInfo(bookEditionId);

    InputStream origStream = null;
    try {

        // Open the downloaded ePub
        origStream = openFileInput(bookEditionId + ".epub");

        // De-obfuscate the key
        SecretKeySpec sks = getObfuscationKeySpec(bookEditionId);
        Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding", "BC");
        c.init(Cipher.DECRYPT_MODE, sks);
        byte[] decodedBytes = c.doFinal(Base64.decode(book.decryptionKey, Base64.DEFAULT));
        String keyPair = new String(decodedBytes);

        // Split the key and parse into binary
        int separator = keyPair.indexOf(':');
        byte[] key = DatatypeConverter.parseHexBinary(keyPair.substring(0, separator));
        byte[] iv = DatatypeConverter.parseHexBinary(keyPair.substring(separator + 1));

        c = Cipher.getInstance("AES/CBC/PKCS7Padding","BC");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key,"AES"), new IvParameterSpec(iv));
        return new CipherInputStream(origStream, c);
    } catch( Exception e) {
        try {
            if (origStream != null) origStream.close();
        } catch( Exception x) {}
        return null;
    }
}

然后是第一个代码块中entry.getSize()returns-1的日志。

奖金（适用于 iOS！）

我们在 iOS 中编写了相同的代码，并且它运行完美（在同一本书上）：

+ (NSData *)encryptKey:(NSString *)key ivParam:(NSString *)iv bookId:(NSString *)bookId
{
    NSString *keyPair = [NSString stringWithFormat:@"%@:%@", key, iv];
    NSString *secret = [self getObfuscationSecretWithValue:bookId];

    NSData *data = [keyPair dataUsingEncoding:NSASCIIStringEncoding];

    char keyPtr[kCCKeySizeAES128];
    bzero(keyPtr, sizeof(keyPtr));
    [[NSData dataWithHexString:secret] getBytes:keyPtr length:sizeof(keyPtr)];

    NSUInteger dataLength = [data length];
    size_t bufferSize = dataLength + kCCBlockSizeAES128;
    void *buffer = malloc(bufferSize);

    size_t numBytesEncrypted;
    CCCryptorStatus status = CCCrypt(kCCEncrypt, kCCAlgorithmAES, kCCOptionPKCS7Padding | kCCOptionECBMode, keyPtr, kCCKeySizeAES128,
                                     NULL,
                                     [data bytes], [data length],
                                     buffer, bufferSize, &numBytesEncrypted);
    if (status == kCCSuccess) {
        return [NSData dataWithBytes:buffer length:numBytesEncrypted];
    }
    else {
        free(buffer);
        return nil;
    }
}

更新

我注意到这个 t运行cation 只在阅读目录后发生（这看起来像上面的 last 章节？）.. 从日志：

:::::::::::::::::::::::::::::::
getInputStream: /24748681/OEBPS/toc.ncx
:::::::::::::::::::::::::::::::
looping through entry: mimetype
looping through entry: OEBPS/hayat-ghayr.html
looping through entry: OEBPS/content.opf
looping through entry: OEBPS/images/978-614-425-313-7-hayat-ghayr-cover.png
looping through entry: OEBPS/images/978-614-425-313-7-hayat_fmt.png
looping through entry: OEBPS/template.css
looping through entry: OEBPS/hayat-ghayr-2.html
looping through entry: OEBPS/hayat-ghayr-1.html
looping through entry: OEBPS/hayat-ghayr-3.html
looping through entry: OEBPS/hayat-ghayr-4.html
looping through entry: OEBPS/hayat-ghayr-5.html
looping through entry: OEBPS/hayat-ghayr-6.html
looping through entry: OEBPS/hayat-ghayr-7.html
looping through entry: OEBPS/hayat-ghayr-8.html
looping through entry: OEBPS/hayat-ghayr-9.html
looping through entry: OEBPS/hayat-ghayr-10.html
looping through entry: OEBPS/hayat-ghayr-11.html
looping through entry: OEBPS/hayat-ghayr-12.html
looping through entry: OEBPS/hayat-ghayr-13.html
looping through entry: OEBPS/hayat-ghayr-14.html
looping through entry: OEBPS/hayat-ghayr-15.html
looping through entry: OEBPS/hayat-ghayr-16.html
looping through entry: OEBPS/hayat-ghayr-17.html
looping through entry: OEBPS/hayat-ghayr-18.html
looping through entry: OEBPS/hayat-ghayr-19.html
looping through entry: OEBPS/hayat-ghayr-20.html
looping through entry: OEBPS/hayat-ghayr-21.html
looping through entry: OEBPS/hayat-ghayr-22.html
looping through entry: META-INF/container.xml
looping through entry: OEBPS/images/277.png
looping through entry: OEBPS/toc.ncx
going through data with entry: OEBPS/toc.ncx, contentLength: 5549
returning data
:::::::::::::::::::::::::::::::
getInputStream: /24748681/OEBPS/hayat-ghayr.html
:::::::::::::::::::::::::::::::
looping through entry: mimetype
looping through entry: OEBPS/hayat-ghayr.html
going through data with entry: OEBPS/hayat-ghayr.html, contentLength: -1
returning data

Answer 1

根据 the docs，如果大小未知，getSize() 可能 return -1。这肯定发生在某些 zip 文件中。在这些情况下，您需要阅读整个条目以确定其未压缩的大小。

分析

红鲱鱼

首先，整个加密解密的事情是一个红色的鲱鱼..简单地复制 same epub/zip 文件并使用相同的代码读取它会产生相同的结果页面.. 所以这是 zip 文件本身的问题，而不是它的解密问题

Zip 文档

如 java 文档中所述，如果内容未知（这正是此处发生的情况），读取 zip 文件实际上可以 return -1.. 事实上，我们得到了 相同的 zip 文件，将其解压缩（在命令行上），然后使用更高的压缩级别重新压缩它，如下所示：

zip -9 -r filename.epub *

然后我们将相同的 zip 文件提供给现有代码，它完美地工作了！

解决方案

所以这是最终有效的代码：

    try {
        InputStream stream = abjjadDb.getBookStream(bookEditionID);
        if( stream == null) return null;

        final ZipInputStream zip = new ZipInputStream(stream);

        ZipEntry entry;
        do {
            entry = zip.getNextEntry();
            if( entry == null) {
                zip.close();
                return null;
            }
        } while( !entry.getName().equals(zipEntryName));

        data.contentLength = entry.getSize();
        data.lastModified = entry.getTime();
        data.contentPath = contentPath;

        InputStream s = zip;
        if( data.contentLength == -1) {
            Log.e("demo",new Object(){}.getClass().getEnclosingMethod().getName()+":: entry \""+entry+"\" has contentLength -1, so we will work around");
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            int nRead;
            // use buf to store data from the zip file entry in fixed size
            byte[] buf = new byte[4096];
            while ((nRead = zip.read(buf)) != -1) {
                // dump that data into buffer, which is a growing buffer
                buffer.write(buf, 0, nRead);
            }
            buffer.flush();

            byte[] finalBuffer = buffer.toByteArray();
            Log.e("demo",new Object(){}.getClass().getEnclosingMethod().getName()+":: entry \""+entry+"\" final data length: "+finalBuffer.length);
            data.contentLength = finalBuffer.length;
            s = new ByteArrayInputStream(finalBuffer);
        }
        final InputStream finalStream = s;

日志给了我们这个

getContentData:: entry "OEBPS/hayat-ghayr.html" has contentLength -1, so we will work around
getContentData:: entry "OEBPS/hayat-ghayr.html" final data length: 2378
getContentData:: entry "OEBPS/hayat-ghayr.html" has contentLength -1, so we will work around
getContentData:: entry "OEBPS/hayat-ghayr.html" final data length: 2378

有趣的是..如果我们在命令行中运行，那么该大小与该文件hayat-ghayr的实际内容长度完全匹配：

$ unzip -l b17c024e-89f1-42f7-a546-91d46610cedb.epub 
Archive:  b17c024e-89f1-42f7-a546-91d46610cedb.epub
  Length     Date   Time    Name
 --------    ----   ----    ----
       20  01-27-12 11:17   mimetype
     2378  04-20-12 10:12   OEBPS/hayat-ghayr.html
     6436  02-06-12 11:06   OEBPS/content.opf
   112579  01-27-12 11:25   OEBPS/images/978-614-425-313-7-hayat-ghayr-cover.png
   182575  01-27-12 11:25   OEBPS/images/978-614-425-313-7-hayat_fmt.png
     7757  01-27-12 11:21   OEBPS/template.css
     5643  01-27-12 11:18   OEBPS/hayat-ghayr-2.html
    20144  01-27-12 11:17   OEBPS/hayat-ghayr-1.html
    65543  01-27-12 11:17   OEBPS/hayat-ghayr-3.html
    59434  01-27-12 11:17   OEBPS/hayat-ghayr-4.html
    66768  01-27-12 11:17   OEBPS/hayat-ghayr-5.html
    49117  01-27-12 11:17   OEBPS/hayat-ghayr-6.html
    65346  01-27-12 11:17   OEBPS/hayat-ghayr-7.html
    74196  01-27-12 11:17   OEBPS/hayat-ghayr-8.html
    73998  01-27-12 11:17   OEBPS/hayat-ghayr-9.html
    61031  01-27-12 11:17   OEBPS/hayat-ghayr-10.html
    68297  01-27-12 11:17   OEBPS/hayat-ghayr-11.html
    72084  01-27-12 11:17   OEBPS/hayat-ghayr-12.html
     2386  01-27-12 11:17   OEBPS/hayat-ghayr-13.html
    61132  01-27-12 11:17   OEBPS/hayat-ghayr-14.html
    46320  01-27-12 11:17   OEBPS/hayat-ghayr-15.html
    32673  01-27-12 11:17   OEBPS/hayat-ghayr-16.html
    88584  01-27-12 11:17   OEBPS/hayat-ghayr-17.html
    56474  01-27-12 11:17   OEBPS/hayat-ghayr-18.html
    52840  01-27-12 11:17   OEBPS/hayat-ghayr-19.html
    80022  01-27-12 11:17   OEBPS/hayat-ghayr-20.html
    50781  01-27-12 11:17   OEBPS/hayat-ghayr-21.html
     2765  01-27-12 11:17   OEBPS/hayat-ghayr-22.html
      265  01-27-12 11:17   META-INF/container.xml
    54942  01-27-12 11:17   OEBPS/images/277.png
     5549  01-27-12 11:17   OEBPS/toc.ncx
     1072  03-23-12 13:28   iTunesMetadata.plist
 --------                   -------
  1529151                   32 files

将解密文件读入 ZipInputStream 有时会截断第一个文件

Reading a decrypted file into a ZipInputStream truncates first file sometimes

java

encryption

android

epub

skyepub

我试过的

奖金（适用于 iOS！）

更新

分析

红鲱鱼

Zip 文档

解决方案