使用 BWT 进行文本压缩和解压缩

Question

请问能否将BWT MTF和Huffman算法结合起来，在java中获得更高的压缩率？过程是怎样的？编写 MTF 文件时出错？

public class MTF{
    static File f=new File("MTF.txt");
public static File encode(String msg, String symTable)throws Exception{
            if(!f.exists())
                f.createNewFile();
    StringBuilder s = new StringBuilder(symTable);
    for(char c : msg.toCharArray()){
        int idx = s.indexOf("" + c);
                    FileWriter writer = new FileWriter(f); 
                    writer.write(idx+" "); 
                    System.out.print(idx+" ");
                    writer.flush();
                    writer.close();
        s = s.deleteCharAt(idx).insert(0, c);
    }
            System.out.println("MTF done");
    return f;
}

Answer 1

检验这个假设很容易，过程是：

取一组有代表性的字符串（您的程序将在 "real world" 中处理的字符串）；
使用 BWT MTF 编码（互联网上有很多实现）；
用霍夫曼压缩；

一般来说：应用 MTF 应该可以提高可压缩性，例如这里提到的：http://michael.dipperstein.com/bwt/

BWT is useful because it converts the data into a format that is generally more compressible by run length encoders and statistical encoders with order greater than 0. By additionally applying move-to-front coding, the data will be in a format which is generally more compressible by even zero order statistical encoders such as traditional implementations of Huffman coding or arithmetic coding.

使用 BWT 进行文本压缩和解压缩

Text Compression and Decompression using BWT

java

compression

burrows-wheeler-transform