使用随机二进制密钥解密二进制序列。我的脚本有什么问题？

Question

（更新如下）

尝试 Philip Klein 的线性代数文本 编码矩阵 中的问题。运行针对所有可能的密钥暴力破解密文二进制序列的问题。问题 1.5.1，第 57 页：

An 11-symbol message has been encrypted as follows. Each symbol is represented by a number between 0 and 26 (A maps to 0, B to 25, space to 26.) Each number is represented by a five-bit binary sequence (0 maps to 00000, 26 to 11010). Finally, the resulting sequence of 55 bits is encrypted using a flawed version of the one-time pad: the key is not 55 bits but 11 copies of the same sequence of 5 random bits. The ciphertext is: '10101', '00100', '10101', '01011', '11001', '00011', '01011', '10101', '00100', '11001', '11010'

目标是找出明文。我遇到的问题是：一，我的解码器函数产生几个高于 int 26 的 5 位二进制数。这个函数应该尝试针对每个可能的 5 位二进制序列（密钥）直到 int 26 的密文二进制序列，产生每一个可能的明文序列。第二，我是否应该使用伽罗华域来确保每个二进制序列保持二进制？（1 + 1 = 0 而不是 2）有什么建议么？我正在尝试通过使用 Klein 的有趣文本来学习线性代数（并提高我有限的 python 能力）。比较难... 谢谢！

import string

# The Cipher-text
Y = ['10101', '00100', '10101', '01011', '11001', '00011', '01011', '10101', '00100', '11001', '11010']


def trans(bit): # convert the bits into int
    x = ''.join(map(str, bit))
    return int(x, 2)


def decoder(cipher): # Try the ciphertext against each possible 5 bit sequence (up to 11010)
    pos_seq = ["".join("{0:05b}".format(x)) for x in range(27)]
    attempt = []
    for i in pos_seq:
        for j in cipher: # Add ciphertext binary to every possible binary 'key'.
            temp = [(int(i[0])+int(j[0])),(int(i[1])+int(j[1])),(int(i[2])+int(j[2])),
                    (int(i[3])+int(j[3])), (int(i[4])+int(j[4]))]
            attempt.append(temp)
    for i in range(len(attempt)): # Galois Field, 'two' becomes 'one'
        for k in attempt[i]:
            if k == 2:
                attempt[i][attempt[i].index(k)] = 0
    return attempt


new_list = decoder(Y)

decoded = [trans(i) for i in new_list] # translate each 5 digit sequence into int

alph = list(string.ascii_lowercase) # alphabet plus a space
alph.append(' ')

decoded2 = [alph[x] for x in decoded] # Translate int to letter or space

print(decoded2)

更新

感谢曹大方和jason，我编辑代码如下，找到明文：eve is evil

解码器函数的范围增加到 32
(还是要脑补一下GF(2)和XOR，包括大方对x ^ y & mask的使用)
使用

chunks = [decoded[x:x+11] for x in range(0, len(decoded), 11)]

*因为密文是由11'symbols'

将上面的列表映射到大房使用的lambda函数：

def decode(encoded):
    y = []
    for i in encoded:
        if i < 27:
            y.append(i)
    return ''.join(map(lambda x: alph[x], y))

for i in chunks:
    decoded2 = decode(i)
    print(decoded2)

Answer 1

使用 5 位有 32 个可能的值。知道有效值为 0-26，任何超过 26 的解密值都是一个信号，表明密钥不是加密中使用的密钥。暴力破解的时候，遇到这样的值可以直接跳过key。事实上，我认为这就是你应该如何消除不正确的密钥。

同时，并没有规定密钥不大于26。你应该尝试所有32种组合。

回复。 Galois Field，计算机处理器自然是二进制的。您可以利用 XOR 等按位运算来加快代码速度。实际上XOR就是GF(2)中的加法运算。您可以使用 x ^ y & mask 在 GF(2) 中添加 2 个向量。当不处理 GF(2) 时，您可以在将 "wrap" 值添加到有效范围 x + y % n.

之后使用模运算符

import string

# The Cipher-text
Y = ['10101', '00100', '10101', '01011', '11001', '00011', '01011', '10101', '00100', '11001', '11010']
# Parse to binary value
y = list(map(lambda v: int(v, 2), Y))

max_val = 32
alphabet_size = 26

# decrypt([0b11111, ...]) = [(key=0b1111, decrpyted=[0b11111, ...]), ...]
def decrypt(encrypted):
    possible_keys = range(max_val)
    attempt = []
    for key in possible_keys:
        decrypted = []
        for symbol in encrypted:
            # XOR is equivalent to Add in GF(2)
            # If not GF(2), add then wrap around using modulo
            v = symbol ^ key & (max_val - 1)
            print('v = %d' % v)
            if v > alphabet_size:
                break
            decrypted.append(v)
        if len(decrypted) < len(encrypted):
            print('bad key %d' % key)
            continue
        print('good key %d' % key)
        attempt.append((key, decrypted))
    return attempt

# alphabet plus a space
alph = list(string.ascii_lowercase)
alph.append(' ')
def decode(encoded):
    return ''.join(map(lambda x: alph[x], encoded))

results = decrypt(y)

for (key, encoded) in results:
    decoded = decode(encoded)
    print(decoded)

Answer 2

替代方法：

# First establish the letter -> binary code:
digits = set(range(2))
base = 2
alphabet = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',
'T', 'U', 'V', 'W', 'X', 'Y', 'Z', ' ']
binary_code = {a*base**4 + b*base**3 + c*base**2 + d*base**1 + e*base**0:str(a)+str(b)+str(c)+str(d)+str(e) for a in digits
                          for b in digits
                          for c in digits
                          for d in digits
                          for e in digits if a*base**4 + b*base**3 + c*base**2 + d*base**1 + e*base**0 <= 26}
letters_to_binary = {k:v for v,k in zip(alphabet,binary_code.values())}

# Now because this one-time pad is flawed, we can make a list of all possible keys - there are only 32
# These are all the 5 bit binary numbers
keys = {str(a)+str(b)+str(c)+str(d)+str(e) for a in digits for b in digits
                        for c in digits
                        for d in digits
                        for e in digits}

# We can then apply each key to the cyphertext under 
# Galois Field 2 addition - GF(2), and see which of 
# the resulting messages is most sensical

cyphertext = ['10101', '00100', '10101', '01011', '11001', '00011', '01011', '10101', '00100', '11001', '11010']
results = {}
for key in keys:
    message = ''
    for text in cyphertext:
        decoded = ''
        for num, t in zip(key, text):
            # The following if else implements GF(2) 
            # addition
            if (int(num) == 1 and int(t) == 0) or (int(num) == 0 and int(t) == 1):
                decoded = decoded + '1'
            else:
                decoded = decoded + '0'
        if decoded not in letters_to_binary:
            continue
        else:
            message = message + letters_to_binary[decoded]
    results[key] = message

print(results)
# From printing the results, using a key of 10001 results in a message of 'EVE IS EVIL', no other messages make sense.
# Thus the key is 10001

Answer 3

晚会迟到了，但我认为这个可能很简单：

message = [0b10101, 0b00100, 0b10101, 0b01011, 0b11001, 0b00011, 
0b01011, 0b10101, 0b00100, 0b11001, 0b11010]

keys = list(range(0b00000, 0b100000))

letters = ['a', 'b', 'c', 'd', 'e', 
'f', 'g', 'h', 'i', 'j', 'k', 'l',
'm', 'n', 'o', 'p', 'q', 'r', 's', 
't', 'u', 'v', 'w', 'x', 'y', 'z', ' ']

for k in keys:
    decrypted = ""
    possible = True
    for m in message:
        if(m^k in range(len(letters))):
            decrypted += letters[m^k]
        else:
            possible = False
            break
    if(possible):
        print(str(k) + ": " + decrypted)

Answer 4

看看这个解决方案，正好在代码 17 处。我认为它简短。夏娃是邪恶的！

此致！

message = [0b10101,0b00100,0b10101,
           0b01011,0b11001,0b00011,
           0b01011,0b10101,0b00100,
           0b11001,0b11010]
           
codes = range(0b00000,0b100000)
L = {i-65:chr(i) for i in range(65,91)}
L[26] = ' '

for code in codes:
    print(code, bin(code), ''.join([L[(l^code)%27] for l in message]))

Answer 5

cyph = [21, 4, 21, 11, 25, 3, 11, 21, 4, 25, 26]
alphabet = 'abcdefghijklmnopqrstuvwxyz '
for cc in range(1, 32):
    print(cc, ''.join([(alphabet[(x^cc)]) \
              for x in cyph if (x^cc) < len(alphabet)]))

灵感来自@Gabriel Simões

使用随机二进制密钥解密二进制序列。我的脚本有什么问题？

Decrypt binary sequence with random binary key. What's wrong with my script?

python

encryption

linear-algebra

galois-field