Determining the Encoding of a String in Python

Question

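I'm new to this site, so please let me know if I need to change anything about this question! I'm also fairly inexperienced with base 64 in general, so please bear with me!

In Python, I have a small, simple program that decodes a base 64 string: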

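import base64

def decodeBase64(string):
    # Base64 input must be a multiple of 4 characters long,
    # so append only as many '=' as are needed to reach that.
    decodeableString = string + '=' * (-len(string) % 4)

    # b64decode returns bytes, not str.
    return base64.b64decode(decodeableString)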

When I try to decode:

0J3QuNC20LUg0L/RgNC40LLQtdC00LXQvSDQutC+0LQg0LTQvtGB0YLRg9C/0LAg0Log0LfQtNCw0L3QuNGOIFvQo9CU0JDQm9CV0J3Qnl06Ck9WSzhZTFggLyAo0JjQnNCvIC8g0JrQm9Cu0KcpCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09ID09PT09PQrQkdCw0LfQsCAzNg==

as part of a challenge, I ran into Russian characters, which the program doesn't know how to handle, so it just returns:

b'\xd0\x9d\xd0\xb8\xd0\xb6\xd0\xb5 \xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd0\xb4\xd0\xb5\xd0\xbd \xd0\xba\xd0\xbe\xd0\xb4 \xd0\xb4\xd0\xbe\xd1\x81\xd1\x82\xd1\x83\xd0\xbf\xd0\xb0 \xd0\xba \xd0\xb7\xd0\xb4\xd0\xb0\xd0\xbd\xd0\xb8\xd1\x8e [\xd0\xa3\xd0\x94\xd0\x90\xd0\x9b\xd0\x95\xd0\x9d\xd0\x9e]:\nOVK8YLX / (\xd0\x98\xd0\x9c\xd0\xaf / \xd0\x9a\xd0\x9b\xd0\xae\xd0\xa7)\n================================================== ======\n\xd0\x91\xd0\xb0\xd0\xb7\xd0\xb0 36'

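Using various online decoders, I learned that this contains Russian characters. Is there a relatively simple way for my program to check whether the decoded base 64 string contains non-ASCII characters, and then convert it to readable text accordingly?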

Answer 1

In your particular case, the string is UTF-8 encoded.

In Python 3.x you have to decode it from bytes to str. Assuming the decoded bytes are in x:

>>> x.decode('utf-8')
'Ниже приведен код доступа к зданию [УДАЛЕНО]:\nOVK8YLX / (ИМЯ / КЛЮЧ)\n================================================== ======\nБаза 36'
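Putting the two steps together, here is a minimal sketch that pads the input, Base64-decodes it, reports whether any non-ASCII bytes are present, and then decodes the bytes as text. The function name `decode_base64_text` and the default of UTF-8 are assumptions made for illustration; they are not part of the original program.

```python
import base64

def decode_base64_text(encoded, encoding="utf-8"):
    """Return (text, has_non_ascii) for a Base64-encoded string.

    Hypothetical helper: assumes the payload is text in `encoding`.
    """
    # Pad to a multiple of 4 characters, as Base64 requires.
    padded = encoded + "=" * (-len(encoded) % 4)
    raw = base64.b64decode(padded)
    # Any byte above 0x7F means the payload is not pure ASCII.
    has_non_ascii = any(byte > 127 for byte in raw)
    # Raises UnicodeDecodeError if the bytes are not valid in `encoding`.
    return raw.decode(encoding), has_non_ascii

# The last line of the message from the question, with its padding stripped.
text, has_non_ascii = decode_base64_text("0JHQsNC30LAgMzY")
print(text, has_non_ascii)
```

If the bytes are not valid UTF-8, `raw.decode` raises `UnicodeDecodeError`, which is a useful signal that a different encoding is in play.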

In general, however, you can only guess the encoding. See this and related questions.
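One common way to "guess" is simply to try a short list of candidate encodings in order and keep the first one that decodes without error. The sketch below assumes a hypothetical helper `guess_decode` and an arbitrary candidate list (UTF-8, then Windows-1251 for Cyrillic text). A successful decode only shows that the bytes are valid in that encoding, not that it is the one the author actually used.

```python
def guess_decode(raw, candidates=("utf-8", "cp1251")):
    """Try each candidate encoding in order; return (text, encoding_used).

    Hypothetical helper: the candidate list is an assumption, not a standard.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
    # latin-1 maps every byte to a character, so this never fails,
    # but the result may be meaningless.
    return raw.decode("latin-1"), "latin-1"

text, used = guess_decode("База 36".encode("utf-8"))
print(used, text)
```

UTF-8 goes first because it is strict: arbitrary non-UTF-8 bytes rarely decode as valid UTF-8, whereas single-byte encodings such as Windows-1251 accept almost any input.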


python

base64

ascii

non-ascii-characters