How do I decode this binary string in Python?
So, I have this string 01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000
and I want to decode it using Python, but I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 280: invalid start byte
According to this website: https://www.binaryhexconverter.com/binary-to-ascii-text-converter
the output should be S�ellotherehowyoudoingimfineareyoufineP
This is my code:
def decodeAscii(bin_string):
    binary_int = int(bin_string, 2)
    byte_number = binary_int.bit_length() + 7 // 8
    binary_array = binary_int.to_bytes(byte_number, "big")
    ascii_text = binary_array.decode()
    print(ascii_text)
How do I fix this?
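Your bytes simply cannot be decoded as utf-8, exactly as the error message tells you.
utf-8 is the default value of decode's encoding parameter. The best way to supply the right encoding is to know the encoding; otherwise you are guessing.
Guessing is probably what the website does too, trying the most common encodings until one of them does not throw an exception: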
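def decodeAscii(bin_string):
    binary_int = int(bin_string, 2)
    # Round up to whole bytes; the parentheses matter, since 7 // 8 alone is 0.
    byte_number = (binary_int.bit_length() + 7) // 8
    binary_array = binary_int.to_bytes(byte_number, "big")
    ascii_text = "Bin string cannot be decoded"
    # 'cp1252' stands in for the original 'ansi', a codec alias that exists only on Windows.
    for enc in ['utf-8', 'ascii', 'cp1252']:
        try:
            ascii_text = binary_array.decode(encoding=enc)
            break
        except UnicodeDecodeError:
            # This encoding does not fit the bytes; try the next one.
            pass
    print(ascii_text)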
s = "01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000"
decodeAscii(s)
Output:
S°ellotherehowyoudoingimfineareyoufineP
But there is no guarantee that you find the "right" encoding by guessing.
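One detail in the question's code explains the odd position 280 in the traceback: `binary_int.bit_length() + 7 // 8` is parsed as `bit_length() + (7 // 8)`, so it yields the number of bits rather than the number of bytes, and `to_bytes` pads the result with leading zero bytes. The sketch below illustrates this on a shortened input (only the first 16 bits of the question's string); `bits_to_bytes` is a hypothetical helper name, not something from the original post.

```python
def bits_to_bytes(bin_string: str) -> bytes:
    """Convert a string of '0'/'1' characters to bytes (illustrative helper)."""
    binary_int = int(bin_string, 2)
    # Add 7 before the floor division so a partial byte rounds up.
    byte_number = (binary_int.bit_length() + 7) // 8
    return binary_int.to_bytes(byte_number, "big")


short_bits = "0101001110110000"  # first two bytes of the question's string
data = bits_to_bytes(short_bits)

# The unparenthesized expression: 7 // 8 is 0, so this is just the bit count.
buggy_length = int(short_bits, 2).bit_length() + 7 // 8
padded = int(short_bits, 2).to_bytes(buggy_length, "big")

try:
    data.decode()  # utf-8 is the default encoding
    error_position = None
except UnicodeDecodeError as exc:
    error_position = exc.start  # index of the offending byte

print(data, len(padded), error_position)
```

With the parentheses in place the bad byte is reported at its real offset instead of after a run of padding bytes. The decode error itself remains, because 0xb0 is still not a valid utf-8 start byte.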
Your binary string is not a valid ascii or utf-8 string. You can tell decode to ignore invalid sequences by saying
ascii_text = binary_array.decode(errors='ignore')
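To make the effect of the `errors` argument concrete, here is a small sketch on a shortened sample (the bytes `S`, 0xb0, `ello`, chosen for illustration rather than taken in full from the question). It compares `errors='ignore'` with `errors='replace'`; the latter inserts U+FFFD, which looks like the � character in the website's output, though that is only a guess about what the site does.

```python
data = b"S\xb0ello"  # shortened sample; 0xb0 on its own is not valid utf-8

# 'ignore' silently drops the undecodable byte.
ignored = data.decode("utf-8", errors="ignore")

# 'replace' substitutes the Unicode replacement character U+FFFD.
replaced = data.decode("utf-8", errors="replace")

print(len(ignored), len(replaced))  # the ignored version is one character shorter
```

Note that both handlers lose information: after decoding, the original byte value 0xb0 can no longer be recovered from the text.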
This can be solved in one line.
Try this:
def bin_to_text(bin_str):
    # Convert each 8-bit chunk to the character with that code point.
    bin_to_str = "".join([chr(int(bin_str[i:i+8], 2)) for i in range(0, len(bin_str), 8)])
    return bin_to_str
bin_str = '01010011101100000110010101101100011011000110111101110100011010000110010101110010011001010110100001101111011101110111100101101111011101010110010001101111011010010110111001100111011010010110110101100110011010010110111001100101011000010111001001100101011110010110111101110101011001100110100101101110011001010101000000000000'
bin_to_str = bin_to_text(bin_str)
print(bin_to_str)
Output:
S°ellotherehowyoudoingimfineareyoufineP
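It may help to see why this approach never raises a decode error: mapping every byte value to `chr(value)` gives the same result as decoding the bytes with the `latin-1` codec, which assigns a character to all 256 byte values. The sketch below checks that equivalence on a shortened sample; the sample bytes and the variable names are illustrative assumptions, not part of the original answer.

```python
def bin_to_text(bin_str: str) -> str:
    # Same technique as above: one character per 8-bit chunk.
    return "".join(chr(int(bin_str[i:i + 8], 2)) for i in range(0, len(bin_str), 8))


sample = b"S\xb0ello"  # shortened sample containing the problematic 0xb0 byte
bits = "".join(f"{byte:08b}" for byte in sample)  # '0101001110110000...'

via_chr = bin_to_text(bits)
via_latin1 = sample.decode("latin-1")

print(via_chr == via_latin1)  # True
```

So the one-liner is effectively a choice of encoding (latin-1) rather than a way of avoiding one. If the data was really produced with a different encoding, the text will still come out wrong, just without an exception.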