将 MATLAB 的 Reed Solomon 函数移植到 Java

Question

我在 MATLAB 中用 RS(160,80) 实现了一个简单的 RS 纠错方案。基本流程如下：

我生成一个长度为80，每个符号8位的消息，我生成一个长度为160的RS码。
生成RS码后，我add/XOR另一个码长160的伽罗瓦域。（这个域只包含00000000和00000001）。这是为了模拟在方案中添加错误。这会生成我的嘈杂代码。
现在，我采用另一个 GF 字段（与上面 [00000000 00000001] 类型相似），它有 < 40 个符号不同于我用来创建噪声的符号。我add/XOR这是对上一步生成的嘈杂代码。
最后，我运行通过 RS 解码器检索我的原始消息。

我的 MATLAB 函数：

function RSKeyExchange(dev1, dev2)
    dev1_fp = zeros(1,160);
    dev2_fp = zeros(1,160);

    for i=1:160
        dev1_fp(i) = str2double(dev1.key(i));
        dev2_fp(i) = str2double(dev2.key(i));
    end

    n = 160;        % total code length
    k = 80;         % message length - actual message for syncronisation
    m = 8;          % order (2^m)

    % therefore, RS decoder can detect t = n-k = 80 errors
    % and correct t/2 = 40 errors

    msg = gf(randi([1 255],1 ,80), m);
    code = rsenc(msg, n, k);

    noise_add = gf(dev1_fp, 8);
    noise_remove = gf(dev2_fp, 8);

    noisy_code = code + noise_add;

    % noisy_code is now transmitted from dev1 to dev2 (sender to receiver)

    decommited_code = noisy_code + noise_remove;
    [rxcode, cnumerr] = rsdec(decommited_code, n, k);

    fprintf('Number of errors corrected: %d\n', cnumerr);
end

现在我一直在寻找将其移植到 Java 的方法。我查找了库，但不确定如何准确移植我的特定用例。

Zxing - 仅将 QR 和 Aztec 代码作为输入
Backblaze - JavaReedSolomon - 修复代码擦除，这不是我产生的那种错误，输入是文件形式（对这里发生的事情感到非常困惑)
Simple RS error correction example - 感觉更清晰一些，但只接受字符串作为输入。我觉得我可以修改它以适合我的用例，但我不确定如何添加噪音。我不确定如何通过此实现生成 RS(160, 80) 代码，也不知道如何生成自定义 GF 字段以添加噪声。

任何帮助将不胜感激（特别是如果你能指出一个适合我的用例的实现，或者帮助修改上面可用的资源之一）

谢谢！

Answer 1

我查看了 "simple RS" 示例 "GF28" Java 代码。解码器似乎只处理擦除（其中一个输入是一组错误索引）。它使用基于十六进制 11B = x^8 + x^4 + x^3 + x + 1 的 GF(256)，与 AES 加密相同。这是有点不寻常的选择，因为最低的 "primitive" 是 3（除零以外的所有数字都可以认为是 3 的幂），而不是 "primitive" 为 2 的字段。字段多项式是通过 PX 定义，因此可以轻松更改。我不确定为什么它动态生成 tables 而不是在初始化期间生成它们，使用第二组 true/false tables 来指示特定的 table 值是否已经生成。

我有一个用于 8 位 GF(256) 字段的 C RS 演示程序。是交互的，你select一个字段（有30个），是否使用自倒多项式（一般不用），生成多项式的第一个连续根是否为1（如果没有指定，那么第一个连续的根是 "primitive")、奇偶校验字节数和数据字节数。它处理擦除和错误，因为它是一个演示程序，它包括 3 种主要类型的解码器算法的代码 Peterson 矩阵算法、Sugiyama 对扩展 Euclid 算法的改编和 Berlekamp Massey 算法，以及用于生成错误值的 Forney 算法。交互部分可以用 select 所有这些参数的代码替换。根据您的情况，将 MAXPAR 的定义从 20 更改为 80（最大奇偶校验数）。用户输入是通过标准输入，因此文本输入文件可用于运行演示。

http://rcgldr.net/misc/eccdemo8.zip

在我的演示中，要生成代码字，用户输入值（用户选项 "E" 或 "C"），然后输入 "N" 进行编码。为了产生错误，用户在特定位置输入值。要生成擦除，用户使用 "P" 选项在特定位置输入擦除值。 "F"用于修正（修正）码字

Wiki 文章包括 3 种主要解码器类型的逻辑。它还解释了傅里叶变换，但是傅里叶变换需要使用其他解码器之一才能生成误差多项式，并且不实用。

https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction#Error_correction_algorithms

根据您的评论，我查看了 Zxing 库。方法描述有点简洁并且使用了错误的术语。这是重做的描述：

GenericGF(primitive, size, b)
    primitive - wrong term, this is the field polynomial

    size - The description is "size of the field", but that is
    determined by the polynomial, a n+1 bit polynomial is used
    for an n bit field. My guess is this is the size of the
    generator polynomial, perhaps the number of roots.

    b - alpha raised to the b power is the first consecutive root.
    alpha is the primitive for the particular field. Unlike the
    comment, b == 0 is about as common as b == 1.

https://en.wikipedia.org/wiki/Primitive_element_(finite_field)

encode(int[] toEncode, int ecBytes)
    toEncode - toEncode includes space for the "parity" data used
    to encode the message.

    ecByte - the number of "parity" elements at the end of to Encode.
    Why is it named ecBytes when it is an array of ints?

decode(int[] received, int twoS )
    received is an array of concatenated encoded messages?
    Since encode generates a single encoded message, then it would
    make more sense for decode to decode a single encoded message.

    twoS - the number of encoded messages?

    Based on the references, it is using the Sugiyama adaptation
    of extended Euclid algorithm. A side benefit is that the
    error evaluator polynomial is generated during the process.

wrapper注释也有错误。 GF(256) 码字的最大大小是 255 字节，因为错误位置被限制在 0 到 254 的范围内。这是因为位置与幂有关，并且 GF(256) 中任何提升到 255 次方的数字都是相同的数字。

请注意，可能会在不触发库异常的情况下进行错误更正。然而，如果有 80 个奇偶校验字节，则可能需要 41 个错误。错误校正将涉及生成一组 40 个位置和值，这些位置和值产生有效代码字，但与原始代码字相差 81 个或更多字节（81 个或更多错误）。然而，对于 160 字节的缩短码字，所有 40 个生成的位置都必须在 0 到 159 的范围内。假设计算的位置随机、均匀分布，它们错误纠正的几率将是（（160！） /((160-40)!))/(255^40) != 3.86 × 10^-11 。如果使用全长 (255) 码字或较少数量的奇偶校验，则误更正是一个问题。

我做了一个java RS ecc GF(256) 例子。它不处理擦除（这需要修改已知擦除位置的综合症，然后将错误定位器与擦除定位器合并）。它使用 Euclid 算法来计算错误定位器和错误评估器多项式。它有注释，但您可能需要查看 Wiki RS 文章才能更好地理解它。该代码通过根据需要使用 byte&0xff 将它们转换为无符号整数来处理 Java 有符号字节。我将参数设置为 80 个消息字节，80 个奇偶校验字节以匹配问题的示例。尽管 GF(256) 的加法和减法都是异或运算，但代码使用单独的函数进行加法和减法，因此代码对应于适用于 GF(p) 的算法，其中 p 是质数，例如 GF( 257). Lambda 的 "derivative" 的计算对于 GF(p) 也是不同的，这里有解释：

https://en.wikipedia.org/wiki/Forney_algorithm#Formal_derivative

Link 到 java rsecc GF(256) 示例：

http://rcgldr.net/misc/rsecc8.zip

交互式 C 版本更容易理解，因为它是交互式的，显示一些内部计算（可以更改定义以启用/禁用显示的内容），并且用户可以尝试不同的变体。

Answer 2

好的，所以我能够解决我的问题。我妥协了一点，所以它不是一个确切的端口，而且很乱。无论哪种方式，它都适用于我的用例。

我为 Zxing 找到了一个 wrapper library，它完成了相当多的繁重工作。

包装器没有像我的 MATLAB 程序那样使用 GF(8)，而是使用 GenericGF.QR_CODE_FIELD_256，它使用 GF(256) 进行编码。

这是 java 实现，基本上实现了相同的目的。（MsgKeyPair只是一个包含两个字符串的数据结构）

public MsgKeyPair generateKey(String fingerprint){
    EncoderDecoder encoderDecoder = new EncoderDecoder();

    try {
        String message = generateMessage();
        byte[] data = message.getBytes();

        byte[] encodedData = encoderDecoder.encodeData(data, 80);
        byte[] noisyData = encodedData.clone();

        for(int i=0; i<encodedData.length; i++){
            byte one = 0xff & 00000001;
            if(fingerprint.charAt(i) == '1') {
                int oneInt = (int) one;
                int edInt = (int)encodedData[i];
                int xor = oneInt ^ edInt;

                noisyData[i] = (byte)(0xff & xor);
            }
        }

        String keyToSend = new String(Base64.encode(noisyData, Base64.DEFAULT));
        return new MsgKeyPair(message, keyToSend);
    }
    catch (Exception e){
        Log.e(TAG, "generateKey: ", e);
    }

    return new MsgKeyPair("", "");
}

public String KeyDecoder(String fingerprint, String key) throws Exception{
    byte[] noisyData = Base64.decode(key, Base64.DEFAULT);

    byte[] decomCode = noisyData.clone();
    for(int i=0; i<noisyData.length; i++){
        byte one = 0xff & 00000001;
        if(fingerprint.charAt(i) == '1'){
            int oneInt = (int)one;
            int ndInt = (int) noisyData[i];
            int xor = oneInt ^ ndInt;

            decomCode[i] = (byte)(0xff & xor);
        }
    }

    EncoderDecoder encoderDecoder = new EncoderDecoder();

    byte[] decodedData = encoderDecoder.decodeData(decomCode, 80);
    String originalMessage = new String(decodedData);

    return originalMessage;
}

public static String generateMessage(){
    String msg = "";

    for(int i=0; i<80; i++){
        int bit = Math.round(randomWithRange(0,1));
        msg = msg + String.valueOf(bit);
    }

    return msg;
}

private static int randomWithRange(int min, int max)
{
    int range = (max - min) + 1;
    return (int)(Math.random() * range) + min;
}

将 MATLAB 的 Reed Solomon 函数移植到 Java

Porting MATLAB's Reed Solomon function to Java

java

matlab

error-correction

reed-solomon

galois-field