Double encoding when retrieving data saved to a BLOB in MySQL

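I have a nested Python list containing data, like the longlist shown below. I have code that turns this list into a JSON-like string, which I then encode so I can feed it to base64.b64encode, and the result is saved as a blob, along with other data, in my MySQL database. When I try to retrieve the data and convert it back into a list, I run into what I can only describe as double encoding.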

longlist = [
    [1, 2,'thisisdhdh'],
    [1,2,'thlsldfsdf'],
    [2,0,'sdlfjldksjflksdj']
] 

string1 = json.dumps(longlist)

encod1 = string1.encode('ascii')

encodedlist = base64.b64encode(encod1)

cur.execute('INSERT INTO conversations (convoblob) VALUES ("%s")' % (encodedlist))

The code above saves the encoded list to the database.

The code below retrieves the data:

cur.execute('SELECT convoblob FROM conversations WHERE id IN (327) ')
item = cur.fetchone()

x = item[0]

Output (type: class 'bytes'):
b"b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'"

x = item[0].decode('ascii')

Output (type: class 'str'):
b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'

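As you can see, decoding once turns it into a string that I can't run the base64 decoder on.

If I don't save the data to the database, it decodes just fine. For some reason, once the data has gone through the database, it comes back as a string that starts with (b") and ends with (").

My database's default character set is utf8mb4.

I did find a workaround: decode the data pulled from the database, cut off the characters at the start and end of the string that mark it as a bytes-like string, encode the string again, and then decode it with base64.b64decode.

Decoding: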
cur.execute('SELECT convoblob FROM conversations WHERE id IN (327) ')
item = cur.fetchone()
decoded = item[0].decode('ascii')
y = len(decoded) - 1
newString = decoded[2:y]
this = newString.encode('ascii')
man = base64.b64decode(this).decode('ascii')
ohyeah = json.loads(man)
print(ohyeah[1])

The output is b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'

[1, 2, 'thlsldfsdf']

Here's what is happening:

import sqlite3
import json
import base64

longlist = [
    [1, 2,'thisisdhdh'],
    [1,2,'thlsldfsdf'],
    [2,0,'sdlfjldksjflksdj']
] 

# you create a string
# '[[1, 2, "thisisdhdh"], [1, 2, "thlsldfsdf"], [2, 0, "sdlfjldksjflksdj"]]'
string1 = json.dumps(longlist)

# you create a bytes object from the string, happens to be encoded as ascii
# b'[[1, 2, "thisisdhdh"], [1, 2, "thlsldfsdf"], [2, 0, "sdlfjldksjflksdj"]]'
encod1 = string1.encode('ascii')

# you create a base-64 encoding of that object, still bytes
# b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'
encodedlist = base64.b64encode(encod1)

conn = sqlite3.connect('test.db')
conn.execute('DROP TABLE IF EXISTS conversations')
conn.execute('CREATE TABLE conversations (convoblob BLOB)')
# you format that bytes object into a string (casting it to a string) <<< your problem
conn.execute('INSERT INTO conversations VALUES ("%s")' % (encodedlist))

cur = conn.execute('SELECT convoblob FROM conversations')
item = cur.fetchone()

# no surprise about the mangled string here, since it was cast back into a string first
# "b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'"
print(item[0])

# Let's try again
conn.close()
conn = sqlite3.connect('test.db')
conn.execute('DROP TABLE IF EXISTS conversations')
conn.execute('CREATE TABLE conversations (convoblob BLOB)')
# now passing the bytes directly to the database, as bytes, into a blob
conn.execute('INSERT INTO conversations VALUES (?)', [encodedlist])

cur = conn.execute('SELECT convoblob FROM conversations')
item = cur.fetchone()

# This works:
my_list = json.loads(base64.b64decode(item[0]))
print(my_list)

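I know you're using MySQL, but this works the same way in SQLite (and shows the same problem), so I'm confident this is your issue. sqlite3 ships with any standard Python installation, which makes the example easy to run.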
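For MySQL itself, the same fix might look like the sketch below. It assumes a DB-API driver that uses %s as its placeholder (mysql-connector-python, PyMySQL and mysqlclient all do). The function names save_conversation and load_conversation are hypothetical; the table and column names are taken from the question.

```python
import base64
import json


def save_conversation(cur, conversation):
    """Store a nested list as base64-encoded JSON in the convoblob column."""
    payload = base64.b64encode(json.dumps(conversation).encode('ascii'))
    # %s here is the driver's placeholder, not Python string formatting:
    # the bytes travel as a separate parameter and never become "b'...'".
    cur.execute('INSERT INTO conversations (convoblob) VALUES (%s)', (payload,))


def load_conversation(cur, row_id):
    """Fetch one row by id and decode it back into a Python list."""
    cur.execute('SELECT convoblob FROM conversations WHERE id = %s', (row_id,))
    row = cur.fetchone()
    # b64decode accepts any bytes-like object, so bytes or bytearray both work.
    return json.loads(base64.b64decode(row[0]))
```

Note that there are no quotes around the placeholder and that the parameters are passed as a tuple in the second argument. With MySQL you would typically also need to call commit() on the connection after the insert.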

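So, in short: by formatting the bytes object into a string with % formatting, you created the "b'data'" value that is causing you trouble. You can avoid this by passing the bytes themselves to the SQL engine and letting it insert them into the query through a query parameter, which is the better way to do it anyway because it also protects against SQL injection.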
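To see the effect in isolation, without any database, the sketch below formats a bytes object with % and then undoes the damage for values that were already stored that way. The repair helper is hypothetical, and ast.literal_eval is just one option for parsing the b'...' text back into bytes; the slicing workaround from the question achieves the same thing.

```python
import ast
import base64
import json

encoded = base64.b64encode(json.dumps([[1, 2, 'abc']]).encode('ascii'))

# %-formatting calls str() on the bytes object, so the b'...' wrapper
# becomes part of the text that ends up inside the SQL statement.
formatted = '%s' % encoded
print(formatted == repr(encoded))  # True


def repair(stored):
    """Hypothetical helper: strip the b'...' wrapper from a stored value."""
    text = stored.decode('ascii')
    if text.startswith("b'") and text.endswith("'"):
        # Safely parse the bytes literal back into a real bytes object.
        return ast.literal_eval(text)
    # Already plain base64, nothing to undo.
    return stored


recovered = json.loads(base64.b64decode(repair(formatted.encode('ascii'))))
print(recovered)  # [[1, 2, 'abc']]
```

A helper like this is only useful for cleaning up rows that were written before the insert was fixed; new rows written with a query parameter come back as plain base64 and need no repair.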