ZipFile:读取时值错误

ZipFile : Wrong values when reading

我正在创建一个包含一个目录和其中一个压缩文本文件的 zip 文件。

创建 zip 文件的代码

   try(ZipOutputStream zos=new ZipOutputStream(new FileOutputStream("E:/TestFile.zip")))
   {  
    //comment,level,method for all entries
    zos.setComment("Test Zip File");
    zos.setLevel(Deflater.BEST_COMPRESSION);
    zos.setMethod(Deflater.DEFLATED);
    
    //Creating Directories[ends with a forward slash]
    {
     ZipEntry dir1=new ZipEntry("Directory/");  
     
     //Give it a comment
     dir1.setComment("Directory");
     //Some extra data
     dir1.setExtra("Hello".getBytes());
     //Set Creation,Access,Modification Time
     FileTime time=FileTime.fromMillis(System.currentTimeMillis());
     dir1.setCreationTime(time);
     dir1.setLastAccessTime(time);
     dir1.setLastModifiedTime(time);
     
     //put the entry & close it
     zos.putNextEntry(dir1);
     zos.closeEntry();
    }
     
    //Creating an fully compressed file inside the directory with all informtion
    {
      ZipEntry file=new ZipEntry("Directory/Test.txt");
      
      //Meta Data
      {
       //Give it a comment
       file.setComment("A File");
       //Some extra data
       file.setExtra("World".getBytes());
       //Set Creation,Access,Modification Time
       FileTime time=FileTime.fromMillis(System.currentTimeMillis());
       file.setCreationTime(time);
       file.setLastAccessTime(time);
       file.setLastModifiedTime(time);
      }
    
     //Byte Data
     {
      //put entry for writing
      zos.putNextEntry(file);
      byte[] data="Hello World Hello World".getBytes();

      //Compress Data
      Deflater deflater=new Deflater(9);
      deflater.setDictionary("Hello World ".getBytes());
      deflater.setInput(data);
      deflater.finish();
      byte[] output=new byte[100];
      int compressed=deflater.deflate(output);
     
      //Write Data   
      CRC32 check=new CRC32();
      check.update(data);
      file.setSize(deflater.getBytesRead());
      file.setCrc(check.getValue());          
      file.setCompressedSize(compressed);     
      zos.write(output,0,compressed);
      
      //end data
      System.out.println(deflater.getBytesRead()+"/"+compressed);
      deflater.end();
     }
     
     //close the entry
     zos.closeEntry();
    }
   }
  }

写入文件时,未压缩的字节数据大小为 23 个字节,压缩后的数据大小为 15 个字节。我使用 ZipEntry 中的每个方法只是为了测试我是否可以在读取时正确检索所有值.

在使用 ZipFile class 而不是 ZipInputStream 读取它时(错误 getSize() 总是 returns -1)使用此代码

 //reading zip file using ZipFile
  public static void main(String[] args)throws Exception
  {
   try(ZipFile zis=new ZipFile("E:/TestFile.zip"))
   {
    Enumeration<? extends ZipEntry> entries=zis.entries();
    while(entries.hasMoreElements())
    {
     ZipEntry entry=entries.nextElement();
     
     System.out.println("Name="+entry.getName());
     System.out.println("Is Directory="+entry.isDirectory());   
     System.out.println("Comment="+entry.getComment());
     System.out.println("Creation Time="+entry.getCreationTime());
     System.out.println("Access Time="+entry.getLastAccessTime());
     System.out.println("Modification Time="+entry.getLastModifiedTime());
     System.out.println("CRC="+entry.getCrc());
     System.out.println("Real Size="+entry.getSize());
     System.out.println("Compressed Size="+entry.getCompressedSize());
     System.out.println("Optional Data="+new String(entry.getExtra()));
     System.out.println("Method="+entry.getMethod());
     if(!entry.isDirectory())
     {
      Inflater inflater=new Inflater();
      try(InputStream is=zis.getInputStream(entry))
      {
       byte[] originalData=new byte[(int)entry.getSize()];
       inflater.setInput(is.readAllBytes());
       int realLength=inflater.inflate(originalData);
       if(inflater.needsDictionary())
       {
        inflater.setDictionary("Hello World ".getBytes());
        realLength=inflater.inflate(originalData);
       }
       inflater.end();

       System.out.println("Data="+new String(originalData,0,realLength));
      }  
     }
     System.out.println("=====================================================");
   }   
  }
 }  

我得到这个输出

Name=Directory/
Is Directory=true
Comment=Directory
Creation Time=null
Access Time=null
Modification Time=2022-01-24T17:00:25Z
CRC=0
Real Size=0
Compressed Size=2
Optional Data=UTaHello
Method=8
=====================================================
Name=Directory/Test.txt
Is Directory=false
Comment=A File
Creation Time=null
Access Time=null
Modification Time=2022-01-24T17:00:25Z
CRC=2483042136
Real Size=15
Compressed Size=17
Optional Data=UT��aWorld
Method=8
Data=Hello World Hel
==================================================

这段代码有很多错误输出

目录

1)Creation Time & Access Time are null[even though i have specified it in the write method]

2)Extra Data[Optional Data] has wrong encoding

对于文件

1)Creation Time & Access Time are null[even though i have specified it in the write method]

2)getSize() & getCompressedSize() methods return the wrong values. I have specified these values during writing manually with sizeSize() & setCompressedSize() when creating the file the values were 23 and 15 but it returns 15 and 17

3)Extra Data[Optional Data] has wrong encoding

4)Since getSize() returns incorrect size it dosen't display the whole data[Hello World Hel]

由于出现了这么多问题,我认为 post 这是一个问题,而不是多个小问题,因为它们似乎都是相关的。我是编写 zip 文件的完全初学者,所以我将不胜感激关于我从这里去哪里的任何方向。

如果大小未知或不正确,我可以使用 while 循环将 zip 条目的数据读取到缓冲区中,这不是问题,但如果他们知道我们为什么还要创建 set 或 get size 方法无论如何大部分时间都会这样做。有什么意义?

经过大量研究,我能够解决 70% 的问题。鉴于 ZipOutputStream 和 ZipFile 读取数据的方式,其他问题无法解决

问题 1:由 getSize() 和 getCompressedSize()

编辑的值不正确 return

1) 写入中

我之前没有看到这个,但是 ZipOutputStream 已经为我们做了压缩,我用我自己的充气机对它进行了双重压缩,所以我删除了那个代码,我意识到你必须只在你需要的时候指定这些值使用存储的方法。否则它们是根据数据为您计算的。所以折射我的 zip 编写代码,这就是它的样子

   try(ZipOutputStream zos=new ZipOutputStream(new FileOutputStream("E:/TestFile2.zip")))
   {  
    //comment,level,method for all entries
    zos.setComment("Test Zip File");
    //Auto Compression
    zos.setMethod(ZipOutputStream.DEFLATED);
    zos.setLevel(9);
    
    //Creating Directories[ends with a forward slash]
    {
     ZipEntry dir1=new ZipEntry("Directory/");  
     
     //Give it a comment
     dir1.setComment("Directory");
     //Some extra data
     dir1.setExtra("Hello".getBytes());
     //Set Creation,Access,Modification Time
     FileTime time=FileTime.fromMillis(System.currentTimeMillis());
     dir1.setCreationTime(time);
     dir1.setLastAccessTime(time);
     dir1.setLastModifiedTime(time);
     
     //put the entry & close it
     zos.putNextEntry(dir1);
     zos.closeEntry();
    }
     
    //Creating an fully compressed file inside the directory with all informtion
    {
      ZipEntry file=new ZipEntry("Directory/Test.txt");
      
      //Meta Data
      {
       //Give it a comment
       file.setComment("A File");
       //Some extra data
       file.setExtra("World".getBytes());
       //Set Creation,Access,Modification Time
       FileTime time=FileTime.fromMillis(System.currentTimeMillis());
       file.setCreationTime(time);
       file.setLastAccessTime(time);
       file.setLastModifiedTime(time);
      }
    
     //Byte Data
     {
      byte[] data="Hello World Hello World".getBytes();
     
      //Data
      zos.putNextEntry(file);
      zos.write(data);
      zos.flush();
     }
     
     //close the entry
     zos.closeEntry();
    }
    
    //finish writing the zip file without closing stream
    zos.finish();
   }

2)阅读中

要获得正确的尺寸和压缩尺寸值,有 2 种方法

-> 如果您使用 ZipFile class 读取文件,则值会正确显示

-> 如果您使用 ZipInputStream,那么只有在您从条目中读取所有字节后才会计算这些值。更多信息 here

 if(!entry.isDirectory())
 {
  try(ByteArrayOutputStream baos=new ByteArrayOutputStream())
  {
   int read;
   byte[] data=new byte[10];    
   while((read=zipInputStream.read(data))>0){baos.write(data,0,read);}
   System.out.println("Data="+new String(baos.toByteArray()));
  } 
 }
 //Now these values are correct
 System.out.println("CRC="+entry.getCrc());
 System.out.println("Real Size="+entry.getSize());
 System.out.println("Compressed Size="+entry.getCompressedSize());

问题2:额外数据不正确

这个几乎可以解释一切

这是代码

     ByteBuffer extraData = ByteBuffer.wrap(entry.getExtra()).order(ByteOrder.LITTLE_ENDIAN);
     while(extraData.hasRemaining()) 
     {
       int id = extraData.getShort() & 0xffff;
       int length = extraData.getShort() & 0xffff;

       if(id == 0x756e) 
       {
         int crc32 = extraData.getInt();
         short permissions = extraData.getShort();
         int 
         linkLengthOrDeviceNumbers = extraData.getInt(),
         userID = extraData.getChar(),
         groupID = extraData.getChar();

         ByteBuffer linkDestBuffer = extraData.slice().limit(length - 14);
         String linkDestination=StandardCharsets.UTF_8.decode(linkDestBuffer).toString();
       } 
       else
       {
        extraData.position(extraData.position() + length);        
        byte[] ourData=new byte[extraData.remaining()];
        extraData.get(ourData);

        //do stuff
       }
     } 

未解决的问题

还有 3 个值 return 不同的结果取决于您使用的读取文件的方法。我对每个条目进行了 table 观察

                            ZipFile           ZipInputStream
 getCreationTime()           null             <correct value>

 getLastAccessTime()         null             <correct value>

 getComment()             <correct value>        null

显然来自 bug report 这是预期的行为,因为 zip 文件是随机访问的,而 zip 输入流是顺序的,因此它们访问数据的方式不同。

根据我的观察,使用 ZipInputStream return 是最好的结果,所以我将继续使用它