Double encoding when retrieving data saved to a BLOB in MySQL
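I have a nested Python list containing data, like the longlist shown below.
I have code that turns this list into a JSON string, which I then encode so it can be passed to base64.b64encode, and the result is saved to my MySQL database as a BLOB alongside other data. When I try to retrieve the data and convert it back into a list, I run into what I can only describe as double encoding.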
longlist = [
[1, 2,'thisisdhdh'],
[1,2,'thlsldfsdf'],
[2,0,'sdlfjldksjflksdj']
]
string1 = json.dumps(longlist)
encod1 = string1.encode('ascii')
encodedlist = base64.b64encode(encod1)
cur.execute('INSERT INTO conversations (convoblob) VALUES ("%s")' % (encodedlist))
The code above saves the encoded list to the DB.
The code below retrieves the data:
cur.execute('SELECT convoblob FROM conversations WHERE id IN (327) ')
item = cur.fetchone()
x = item[0]
Output (type: class 'bytes'):
b"b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'"
x = item[0].decode('ascii')
Output (type: class 'str'):
b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'
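As you can see, decoding once turns it into a string that I can't run through the base64 decoder.
Without saving the data to the database it decodes just fine. For some reason, once the data goes into the database, it comes back as a string that starts with (b") and ends with (").
My database's default character set is utf8mb4.
I did find a workaround: decode the data pulled from the database, slice off the characters at the start and end of the string that mark it as a bytes literal, encode the string again, and then decode it with base64.b64decode: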
cur.execute('SELECT convoblob FROM conversations WHERE id IN (327) ')
item = cur.fetchone()
decoded = item[0].decode('ascii')
y = len(decoded) - 1
newString = decoded[2:y]
this = newString.encode('ascii')
man = base64.b64decode(this).decode('ascii')
ohyeah = json.loads(man)
print(ohyeah[1])
The output is b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'
[1, 2, 'thlsldfsdf']
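For rows that were already written in this mangled form, the slicing workaround above could perhaps be expressed a little more robustly with ast.literal_eval, which parses the text b'...' back into a real bytes object instead of relying on fixed offsets. This is only a sketch: recover_mangled is a hypothetical helper name, and the mangled value is rebuilt locally rather than read from a real table.

```python
import ast
import base64
import json

longlist = [
    [1, 2, "thisisdhdh"],
    [1, 2, "thlsldfsdf"],
    [2, 0, "sdlfjldksjflksdj"],
]

def recover_mangled(raw):
    """Hypothetical helper: turn a stored b'<base64>' literal back into a list."""
    # The driver may hand back bytes or str depending on the column type.
    text = raw.decode("ascii") if isinstance(raw, bytes) else raw
    # literal_eval safely parses the Python literal b'...' into a bytes object.
    inner = ast.literal_eval(text)
    return json.loads(base64.b64decode(inner))

# Rebuild the mangled value the same way the %-formatted INSERT produced it.
encoded = base64.b64encode(json.dumps(longlist).encode("ascii"))
mangled = ("%s" % encoded).encode("ascii")

print(recover_mangled(mangled)[1])  # [1, 2, 'thlsldfsdf']
```

This only helps with reading old rows; it does not change how new rows are written.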
Here is what's happening:
import sqlite3
import json
import base64
longlist = [
[1, 2,'thisisdhdh'],
[1,2,'thlsldfsdf'],
[2,0,'sdlfjldksjflksdj']
]
# you create a string
# '[[1, 2, "thisisdhdh"], [1, 2, "thlsldfsdf"], [2, 0, "sdlfjldksjflksdj"]]'
string1 = json.dumps(longlist)
# you create a bytes object from the string, happens to be encoded as ascii
# b'[[1, 2, "thisisdhdh"], [1, 2, "thlsldfsdf"], [2, 0, "sdlfjldksjflksdj"]]'
encod1 = string1.encode('ascii')
# you create a base-64 encoding of that object, still bytes
# b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'
encodedlist = base64.b64encode(encod1)
conn = sqlite3.connect('test.db')
conn.execute('DROP TABLE IF EXISTS conversations')
conn.execute('CREATE TABLE conversations (convoblob BLOB)')
# you format that bytes object into a string (casting it to a string) <<< your problem
conn.execute('INSERT INTO conversations VALUES ("%s")' % (encodedlist))
cur = conn.execute('SELECT convoblob FROM conversations')
item = cur.fetchone()
# no surprise about the mangled string here, since it was cast back into a string first
# "b'W1sxLCAyLCAidGhpc2lzZGhkaCJdLCBbMSwgMiwgInRobHNsZGZzZGYiXSwgWzIsIDAsICJzZGxmamxka3NqZmxrc2RqIl1d'"
print(item[0])
# Let's try again
conn.close()
conn = sqlite3.connect('test.db')
conn.execute('DROP TABLE IF EXISTS conversations')
conn.execute('CREATE TABLE conversations (convoblob BLOB)')
# now passing the bytes directly to the database, as bytes, into a blob
conn.execute('INSERT INTO conversations VALUES (?)', [encodedlist])
cur = conn.execute('SELECT convoblob FROM conversations')
item = cur.fetchone()
# This works:
my_list = json.loads(base64.b64decode(item[0]))
print(my_list)
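I know you're using MySQL, but this works the same way with SQLite (and shows the same problem), so I'm fairly sure this is your issue. sqlite3 ships with any standard Python installation.
So, in short: by using % formatting to format the bytes object into a string, you created the "b'data'" string that is giving you trouble. This can be avoided by passing the bytes object itself to the SQL engine and letting it substitute the value into the query through a query parameter, which is the better way to do it anyway because it avoids SQL injection.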
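To make the summary concrete, here is a minimal sketch that first shows what % formatting does to a bytes object and then round-trips the same value through a query parameter. It uses an in-memory SQLite database so it runs anywhere; save_blob and load_blob are hypothetical helper names, and the table layout is assumed from the question.

```python
import base64
import json
import sqlite3

longlist = [
    [1, 2, "thisisdhdh"],
    [1, 2, "thlsldfsdf"],
    [2, 0, "sdlfjldksjflksdj"],
]
payload = base64.b64encode(json.dumps(longlist).encode("ascii"))

# %-formatting bytes into a str embeds its repr, including the b'' wrapper.
formatted = "%s" % payload
print(formatted == repr(payload))  # True

def save_blob(conn, data):
    """Hypothetical helper: store bytes via a query parameter, not string formatting."""
    conn.execute("INSERT INTO conversations (convoblob) VALUES (?)", (data,))

def load_blob(conn):
    """Hypothetical helper: read the first blob back and decode it into a list."""
    row = conn.execute("SELECT convoblob FROM conversations").fetchone()
    return json.loads(base64.b64decode(row[0]))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversations (id INTEGER PRIMARY KEY, convoblob BLOB)")
save_blob(conn, payload)
restored = load_blob(conn)
conn.close()

print(restored[1])  # [1, 2, 'thlsldfsdf']
```

With MySQL drivers such as PyMySQL or mysql-connector-python the placeholder is %s rather than ?, and the value is still passed as a separate argument, for example cur.execute('INSERT INTO conversations (convoblob) VALUES (%s)', (encodedlist,)). Note that there are no quotes around the placeholder and no % operator applied to the query string.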