Python generated zlib + base64 text getting incorrect header check exception in Java

Question

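I need to write a consumer in Java for a Kafka stream. The publisher application was written by a third party in Python. When I base64-decode the payload and then zlib-decompress it in Java, I get an "incorrect header check" exception.

My task is to convert this zlib-compressed, base64-encoded data into a readable text file.

Publisher code in Python: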

    import zlib
    from codecs import encode

    # read in the file content
    inputf = open(file, 'rb')
    input = inputf.read()
    inputf.close()
    # using zlib.compress(s) method
    compressed = zlib.compress(input)
    # encoding with base64 encoding
    encoding_type = 'base64'
    enc_data = encode(compressed, encoding_type)
    enc_data_utf8 = enc_data.decode("utf-8")
    # enc_data = enc_data_no_no_newline    #### [0:86000]    # trim
    event_etl_event[filename + "_content"] = enc_data_utf8
    event_etl_event[filename + "_compressed_format"] = "zlib+uuencode"

Consumer code in Java:

    public void processData() {
        try {
            inputStr = event.getEventEtlEvent().getAllProblemsTxtContent();

            System.out.println("Before Base64 decoding: \n" + inputStr);

            Path path0 = Paths.get("AllProblems_Before_base64_decoding.txt");
            Files.write(path0, inputStr.getBytes());

            Base64 base64 = new Base64();
            String decodedString = new String(base64.decode(inputStr.getBytes()));

            System.out.println("After Base64 decode: \n" + decodedString);

            Path path1 = Paths.get("AllProblems_After_base64_decoding.txt");
            Files.write(path1, decodedString.getBytes());

            System.out.println("now zlib 64 decodingString .........\n\n\n");

            byte[] output = ZLibUtils.decompress(decodedString.getBytes());

            System.out.println("After Zlib Decompress: " + output);
        } catch (JsonParseException e) {
            e.printStackTrace();
        } catch (JsonMappingException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } /* catch (InvalidDataException e) {
            e.printStackTrace();
        } catch (DataFormatException e) {
            e.printStackTrace();
        } */ catch (Exception e) {
            e.printStackTrace();
        }
    }

ZLibUtils.java

    public static byte[] decompress(byte[] data) throws DataFormatException {
        byte[] output = new byte[0];

        Inflater decompresser = new Inflater();
        decompresser.reset();
        decompresser.setInput(data);

        ByteArrayOutputStream o = new ByteArrayOutputStream(data.length);
        // try {
        byte[] buf = new byte[data.length];
        while (!decompresser.finished()) {
            int i = decompresser.inflate(buf);
            o.write(buf, 0, i);
        }
        output = o.toByteArray();
        /* } catch (Exception e) {
            output = data;
            e.printStackTrace();
            // FIXME: in later code
            System.exit(0);
        } finally {
            try {
                o.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        } */

        decompresser.end();
        return output;
    }

Now when I run my program, I get the following exception:

    java.util.zip.DataFormatException: incorrect header check
        at java.util.zip.Inflater.inflateBytes(Native Method)
        at java.util.zip.Inflater.inflate(Inflater.java:259)
        at java.util.zip.Inflater.inflate(Inflater.java:280)
        at com.exmple.util.ZLibUtils.decompress(ZLibUtils.java:84)

Looking forward to your reply.

Answer 1

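The problem is that you convert the base64-decoded data from a byte array to a String and then back to a byte array. For most character encodings this is not a no-op. In other words, for most encodings and most byte arrays, given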

    byte[] decoded = { (byte) 0x9b, 1, 2, 3 };
    String decodedString = new String(decoded);
    byte[] processed = decodedString.getBytes();

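the contents of `processed` will not be the same as the contents of `decoded`. Bytes that are not valid text in the default charset are replaced during decoding and cannot be recovered.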
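To see why this hits zlib data in particular, here is a minimal sketch. It assumes UTF-8 as the charset (your platform default may differ) and uses `0x78 0x9C`, the header bytes zlib normally writes at the default compression level. The class and method names are made up for illustration.

    ```java
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    class RoundTripDemo {
        // Decodes the bytes as UTF-8 text, then encodes that text back to bytes.
        // Assumption: UTF-8 stands in for the platform default charset.
        static byte[] roundTrip(byte[] raw) {
            String asText = new String(raw, StandardCharsets.UTF_8);
            return asText.getBytes(StandardCharsets.UTF_8);
        }

        public static void main(String[] args) {
            // First two bytes mimic a typical zlib header; the rest is filler.
            byte[] decoded = { 0x78, (byte) 0x9c, 1, 2, 3 };
            byte[] processed = roundTrip(decoded);

            System.out.println("decoded:   " + Arrays.toString(decoded));
            System.out.println("processed: " + Arrays.toString(processed));
            System.out.println("equal:     " + Arrays.equals(decoded, processed));
        }
    }
    ```

The lone byte `0x9C` is not valid UTF-8, so it is swapped for the replacement character, which encodes back as three different bytes. The second header byte is then wrong, which is consistent with the `incorrect header check` error from `Inflater`.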

The solution is to not treat the base64-decoded data as a String at all, and to work with the byte data directly:

    Base64 base64 = new Base64();
    byte[] decoded = base64.decode(inputStr.getBytes());
    
    Path path1 = Paths.get("AllProblems_After_base64_decoding.txt");
    Files.write(path1, decoded);
      
    System.out.println("now zlib 64 decodingString .........\n\n\n");
    
    byte[] output = ZLibUtils.decompress(decoded);
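For reference, here is a self-contained sketch of the whole pipeline that stays in bytes from base64 decoding through inflation. It is not the code from the question: it swaps in `java.util.Base64` (Java 8+) for the Commons Codec `Base64` class, and uses the MIME decoder on the assumption that the Python side emits base64 with line breaks. `compressAndEncode` is only a hypothetical stand-in for the publisher so the example can run on its own.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;
    import java.util.zip.DataFormatException;
    import java.util.zip.Deflater;
    import java.util.zip.Inflater;

    class ZlibBase64Demo {
        // Hypothetical stand-in for the Python publisher:
        // zlib-compress, then base64-encode with line breaks.
        static String compressAndEncode(byte[] plain) {
            Deflater deflater = new Deflater();
            deflater.setInput(plain);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            deflater.end();
            return Base64.getMimeEncoder().encodeToString(out.toByteArray());
        }

        // Consumer side: base64 text -> compressed bytes -> original bytes.
        // No String is ever built from the compressed data.
        static byte[] decodeAndDecompress(String base64Text) throws DataFormatException {
            // The MIME decoder skips line separators in the input.
            byte[] compressed = Base64.getMimeDecoder().decode(base64Text);

            Inflater inflater = new Inflater();
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            try {
                while (!inflater.finished()) {
                    int n = inflater.inflate(buf);
                    // Guard against looping forever on truncated input.
                    if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                        throw new DataFormatException("truncated or unsupported zlib stream");
                    }
                    out.write(buf, 0, n);
                }
            } finally {
                inflater.end();
            }
            return out.toByteArray();
        }

        public static void main(String[] args) throws DataFormatException {
            String payload = compressAndEncode("hello from the publisher".getBytes(StandardCharsets.UTF_8));
            byte[] restored = decodeAndDecompress(payload);
            // Only the decompressed result is turned into text.
            System.out.println(new String(restored, StandardCharsets.UTF_8));
        }
    }
    ```

If you want a file rather than console output, pass the returned byte array straight to `Files.write`. Converting to a String is only safe after decompression, and only if the original file was text in a known encoding.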

java

compression

zlib