Error correcting codes and minimum distances

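I came across a challenge online (on King's website), and although I understand the general idea behind it, I am still a bit lost. Maybe the wording is slightly off? Here is the problem; below it I explain what I don't understand: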

Error correcting codes are used in a wide variety of applications ranging from satellite communication to music CDs. The idea is to encode a binary string of length k as a binary string of length n>k, called a codeword, such that even if some bit(s) of the encoding are corrupted (if you scratch your CD, for instance), the original k-bit string can still be recovered.

There are three important parameters associated with an error correcting code: the length of codewords (n), the dimension (k), which is the length of the unencoded strings, and finally the minimum distance (d) of the code. Distance between two codewords is measured as Hamming distance, i.e., the number of positions in which the codewords differ: 0010 and 0100 are at distance 2. The minimum distance of the code is the distance between the two different codewords that are closest to each other.

Linear codes are a simple type of error correcting code with several nice properties. One of them is that the minimum distance is the smallest distance any non-zero codeword has to the zero codeword (the codeword consisting of n zeros always belongs to a linear code of length n). Another nice property of linear codes of length n and dimension k is that they can be described by an n×k generator matrix of zeros and ones. Encoding a k-bit string is done by viewing it as a column vector and multiplying it by the generator matrix. The example below shows a generator matrix and how the string 1001 is encoded.

[Figure: graph.png, a generator matrix and the encoding of the string 1001]

Matrix multiplication is done as usual except that addition is done modulo 2 (i.e., 0+1=1+0=1 and 0+0=1+1=0). The set of codewords of this code is then simply all vectors that can be obtained by encoding all k-bit strings in this way.

Write a program to calculate the minimum distance for several linear error correcting codes of length at most 30 and dimension at most 15. Each code will be given as a generator matrix.

Input

You will be given several generator matrices as input. The first line contains an integer T indicating the number of test cases. The first line of each test case gives the parameters n and k, where 1≤n≤30, 1≤k≤15 and n > k, as two integers separated by a single space. The following n lines describe a generator matrix. Each line is a row of the matrix and has k space-separated entries that are 0 or 1.

Output

For each generator matrix, output a single line with the minimum distance of the corresponding linear code.

Sample Input 1

2
7 4
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
0 1 1 1
1 0 1 1
1 1 0 1
3 2
1 1
0 0
1 1

Sample Output 1

3
0

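My assumption is that the problem is asking: "Write a program that can take in the linear code in matrix form and say what the minimum distance is from an all-zero codeword". I just don't understand why the output for the first input is 3 and the output for the second input is 0.

I'm very confused.

Any ideas?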

First example:

Input binary string: 1000
Resulting code: 1000011
Hamming distance to zero codeword 0000000: 3

Second example:

Input binary string: 11
Resulting code: 000
Hamming distance to zero codeword 000: 0
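
As a small illustration of the distance used in both examples, here is a minimal Python sketch. The function name hamming_distance is my own choice for illustration, not something from the problem statement. Note that the distance to the zero codeword is simply the number of ones in the codeword.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


# The pair given in the problem statement.
print(hamming_distance("0010", "0100"))  # 2

# Distance to the zero codeword equals the number of ones.
print(hamming_distance("000", "000"))    # 0
print(hamming_distance("1011", "0000") == "1011".count("1"))  # True
```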

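Your goal is to find, among the codewords that can be produced from some non-zero k-bit input string, the one with the minimum Hamming distance to the zero codeword (in other words, the one with the fewest ones in its binary representation), and return that distance.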

Hope that helps. The problem description is indeed a bit hard to understand.

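Edit: I made a typo in the first example. The actual input should be 1000, not 0001. Also, it may be unclear what exactly the input string is and how the codeword is computed. Let's look at the first example.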

Input binary string: 1000

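This binary string is, in general, not part of the generator matrix. It is just one of all the possible non-zero 4-bit strings. Let's multiply it by the generator matrix: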

(1 0 0 0) * (1 0 0 0) = 1
(0 1 0 0) * (1 0 0 0) = 0
(0 0 1 0) * (1 0 0 0) = 0
(0 0 0 1) * (1 0 0 0) = 0
(0 1 1 1) * (1 0 0 0) = 0
(1 0 1 1) * (1 0 0 0) = 1
(1 1 0 1) * (1 0 0 0) = 1
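
The row-by-row products above can be reproduced with a short Python sketch. The names G and encode are my own, chosen for illustration; the matrix is the first generator matrix from the sample input.

```python
# Generator matrix of the first test case (7 rows, 4 columns).
G = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
]


def encode(generator, bits):
    """Multiply the generator matrix by the input column vector, modulo 2."""
    return [sum(g * b for g, b in zip(row, bits)) % 2 for row in generator]


codeword = encode(G, [1, 0, 0, 0])
print(codeword)       # [1, 0, 0, 0, 0, 1, 1]
print(sum(codeword))  # 3 ones, so the distance to the zero codeword is 3
```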

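One way to find an input that produces the "minimal" codeword is to iterate over all 2^k-1 non-zero k-bit strings and compute the codeword for each of them. This is a feasible solution for k <= 15, since there are at most 32767 inputs and each codeword has at most 30 bits.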
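
The brute-force search just described could look roughly like this in Python. This is only a sketch: the function name min_distance and the choice to pack each matrix row into an integer bitmask are my own, not something required by the problem.

```python
def min_distance(generator):
    """Smallest number of ones in a codeword over all non-zero k-bit inputs."""
    k = len(generator[0])
    # Pack each row into an integer so that a row-times-input product
    # modulo 2 becomes the parity of (row_mask & input_mask).
    row_masks = [int("".join(str(bit) for bit in row), 2) for row in generator]
    best = len(generator)  # a codeword cannot contain more than n ones
    for x in range(1, 1 << k):  # every non-zero k-bit input
        weight = sum(bin(mask & x).count("1") & 1 for mask in row_masks)
        best = min(best, weight)
    return best


first = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
]
second = [[1, 1], [0, 0], [1, 1]]

print(min_distance(first))   # 3
print(min_distance(second))  # 0
```

Both results match the sample output. Reading the T test cases from standard input is left out here to keep the sketch focused on the search itself.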

Another example for the first test case, with input 0011 (there may be several inputs that produce a "minimal" output):

(1 0 0 0) * (0 0 1 1) = 0
(0 1 0 0) * (0 0 1 1) = 0
(0 0 1 0) * (0 0 1 1) = 1
(0 0 0 1) * (0 0 1 1) = 1
(0 1 1 1) * (0 0 1 1) = 2 = 0 (mod 2)
(1 0 1 1) * (0 0 1 1) = 2 = 0 (mod 2)
(1 1 0 1) * (0 0 1 1) = 1

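The resulting codeword 0011001 also has a Hamming distance of 3 to the zero codeword. No non-zero 4-bit string produces a codeword with fewer than 3 ones in its binary representation. That is why the answer for the first test case is 3.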