How do I decode this binary string in Python?

So, I have this string: 01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000

I want to decode it using Python, but I get this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 280: invalid start byte

According to this website: https://www.binaryhexconverter.com/binary-to-ascii-text-converter

the output should be S�ellotherehowyoudoingimfineareyoufineP

Here is my code:

def decodeAscii(bin_string):
    binary_int = int(bin_string, 2);
  
    byte_number = binary_int.bit_length() + 7 // 8
    binary_array = binary_int.to_bytes(byte_number, "big")
    ascii_text = binary_array.decode()
    
    print(ascii_text)

How can I fix this?

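Your bytes simply cannot be decoded as utf-8, just as the error message tells you.

utf-8 is the default encoding argument of decode. The best way to supply the correct encoding is to know the encoding; otherwise you are guessing.

Guessing is probably what the website does as well, trying the most common encodings until one of them does not throw an exception: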

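def decodeAscii(bin_string):
    binary_int = int(bin_string, 2)
    # Parentheses are required: without them 7 // 8 is evaluated first and is 0.
    byte_number = (binary_int.bit_length() + 7) // 8
    binary_array = binary_int.to_bytes(byte_number, "big")
    ascii_text = "Bin string cannot be decoded"
    # The original list used 'ansi', an alias that only exists on Windows;
    # 'cp1252' (the Western European ANSI code page) is substituted here.
    for enc in ['utf-8', 'ascii', 'cp1252']:
        try:
            ascii_text = binary_array.decode(encoding=enc)
            break
        except (UnicodeDecodeError, LookupError):
            # This encoding failed or is unknown; try the next candidate.
            pass
    print(ascii_text)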

s = "01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000"
decodeAscii(s)

Output:

S°ellotherehowyoudoingimfineareyoufineP

But there is no guarantee that guessing will find the "correct" encoding.
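As a side note, the "position 280" in the question's traceback points at a separate problem in the question's code: `binary_int.bit_length() + 7 // 8` is parsed as `bit_length() + (7 // 8)`, so the buffer gets one byte per bit and the real data is preceded by a long run of zero bytes. A minimal sketch of the difference, using only the first 16 bits of the question's string (the variable names here are made up for illustration):

```python
bits = "0101001110110000"  # first two bytes of the question's string: 0x53, 0xb0
value = int(bits, 2)

# Operator precedence: 7 // 8 is evaluated first and equals 0.
buggy_len = value.bit_length() + 7 // 8
# Parenthesised, this rounds the bit count up to whole bytes.
fixed_len = (value.bit_length() + 7) // 8

padded = value.to_bytes(buggy_len, "big")  # zero bytes, then the data
exact = value.to_bytes(fixed_len, "big")   # just the data

print(buggy_len, fixed_len)
print(padded.index(0xB0))  # offset of the undecodable byte in the padded buffer
print(exact)
```

Using `len(bits) // 8` as the length is another option; unlike `bit_length()`, it also keeps any leading zero bytes in the input.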

Your binary string is not a valid ascii or utf-8 string. You can tell decode to ignore invalid sequences by writing

ascii_text = binary_array.decode(errors='ignore')
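To make the effect of the error handler concrete, here is a small sketch on a shortened sample (the `data` value is an assumption built from the first bytes of the question's string). `errors='ignore'` silently drops the invalid byte, while `errors='replace'` substitutes U+FFFD, which is presumably where the � in the website's output comes from.

```python
data = bytes([0x53, 0xB0]) + b"ello"  # shortened sample containing the invalid byte 0xb0

ignored = data.decode("utf-8", errors="ignore")    # drops the invalid byte
replaced = data.decode("utf-8", errors="replace")  # substitutes U+FFFD for it

print(repr(ignored))
print(repr(replaced))
```

Either handler loses information about the original byte, so this only suits cases where the exact character does not matter.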

This can be solved in one line.

Try this:

def bin_to_text(bin_str):
    bin_to_str = "".join([chr(int(bin_str[i:i+8],2)) for i in range(0,len(bin_str),8)])

    return bin_to_str

bin_str = '01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000'
bin_to_str = bin_to_text(bin_str)
print(bin_to_str)

Output:

S°ellotherehowyoudoingimfineareyoufineP
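Mapping each 8-bit chunk through `chr` gives the same result as decoding the bytes as latin-1, because latin-1 maps every byte value 0..255 to the code point with the same number. A sketch of that variant follows; `bin_to_text_latin1` is a hypothetical name, and the code assumes the input length is a multiple of 8.

```python
def bin_to_text_latin1(bin_str):
    """Hypothetical variant of bin_to_text that goes through a bytes object."""
    n_bytes = len(bin_str) // 8  # assumes the length is a multiple of 8
    raw = int(bin_str, 2).to_bytes(n_bytes, "big")
    # latin-1 maps each byte 0..255 to the code point of the same value,
    # so decoding cannot fail and matches chr(int(chunk, 2)) for every chunk.
    return raw.decode("latin-1")

sample = "010100111011000001100101"  # the bytes 0x53 ("S"), 0xb0, 0x65 ("e")
print(bin_to_text_latin1(sample))
```

The question's full string ends in a zero byte; if that is only padding, it can be removed afterwards with `.rstrip("\x00")`.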