Serialize base64-encoded data in JSON
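I'm writing a script to automate data generation for a demo, and I need to serialize some data to JSON. Part of this data is an image, so I encoded it in base64, but when I try to run my script I get: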
Traceback (most recent call last):
  File "lazyAutomationScript.py", line 113, in <module>
    json.dump(out_dict, outfile)
  File "/usr/lib/python3.4/json/__init__.py", line 178, in dump
    for chunk in iterable:
  File "/usr/lib/python3.4/json/encoder.py", line 422, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.4/json/encoder.py", line 396, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.4/json/encoder.py", line 396, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.4/json/encoder.py", line 429, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.4/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: b'iVBORw0KGgoAAAANSUhEUgAADWcAABRACAYAAABf7ZytAAAABGdB...
...
BF2jhLaJNmRwAAAAAElFTkSuQmCC' is not JSON serializable
As far as I know, anything base64-encoded (a PNG image, in this case) is just a string, so it should pose no problem for serialization. What am I missing?
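You have to be careful about the data types.
If you read a binary image, you get bytes.
If you encode these bytes in base64, you get bytes again! (See the documentation on b64encode.)
json can't handle raw bytes, which is why you get the error.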
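The type issue can be checked directly. The following is a minimal sketch, assuming a short stand-in payload (`b"\x89PNG"` is only a placeholder for real image data, not something taken from the question):

```python
import json
from base64 import b64encode

# Stand-in for image bytes (an assumption for illustration);
# any bytes object behaves the same way.
payload = b"\x89PNG"

encoded = b64encode(payload)
print(type(encoded))  # <class 'bytes'>

try:
    json.dumps({"image": encoded})
    raised = False
except TypeError:
    # json refuses bytes values, which matches the traceback in the question
    raised = True

# Decoding the base64 bytes to str makes the value serializable
as_text = encoded.decode("ascii")
serialized = json.dumps({"image": as_text})
```

Base64 output only contains ASCII characters, so decoding with "ascii" or "utf-8" should give the same string.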
I have just written an example, with comments. I hope it helps:
from base64 import b64encode
from json import dumps
ENCODING = 'utf-8'
IMAGE_NAME = 'spam.jpg'
JSON_NAME = 'output.json'
# first: reading the binary stuff
# note the 'rb' flag
# result: bytes
with open(IMAGE_NAME, 'rb') as open_file:
    byte_content = open_file.read()
# second: base64 encode read data
# result: bytes (again)
base64_bytes = b64encode(byte_content)
# third: decode these bytes to text
# result: string (in utf-8)
base64_string = base64_bytes.decode(ENCODING)
# optional: doing stuff with the data
# result here: some dict
raw_data = {IMAGE_NAME: base64_string}
# now: encoding the data to json
# result: string
json_data = dumps(raw_data, indent=2)
# finally: writing the json string to disk
# note the 'w' flag, no 'b' needed as we deal with text here
with open(JSON_NAME, 'w') as another_open_file:
    another_open_file.write(json_data)
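To check that nothing is lost along the way, the JSON can be read back and the base64 string decoded to the original bytes. This is only a sketch: the byte string and the temporary directory below are hypothetical stand-ins so the snippet runs without a real spam.jpg on disk.

```python
import json
import os
import tempfile
from base64 import b64decode, b64encode

# Hypothetical stand-in for the image file's bytes
original_bytes = b"\x89PNG\r\n\x1a\n\x00\xff"

# Same steps as above: bytes -> base64 bytes -> str -> JSON text
json_text = json.dumps({"spam.jpg": b64encode(original_bytes).decode("utf-8")})

# Write the JSON text to a temporary file and load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "output.json")
    with open(json_path, "w") as out_file:
        out_file.write(json_text)
    with open(json_path) as in_file:
        loaded = json.load(in_file)

# b64decode accepts the ASCII str and returns bytes
restored_bytes = b64decode(loaded["spam.jpg"])
```

If the round trip worked, restored_bytes compares equal to original_bytes and could be written back out with the 'wb' flag.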
An alternative solution is to encode things on the fly with a custom encoder:
import json
from base64 import b64encode
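class Base64Encoder(json.JSONEncoder):
    # pylint: disable=method-hidden
    def default(self, o):
        # bytes are not handled by json, so convert them to a base64 string
        if isinstance(o, bytes):
            return b64encode(o).decode()
        # anything else falls through to the standard TypeError
        return json.JSONEncoder.default(self, o)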
With that defined, you can do:
m = {'key': b'\x9c\x13\xff\x00'}
json.dumps(m, cls=Base64Encoder)
Which will produce:
'{"key": "nBP/AA=="}'
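The same encoder should also work with json.dump, the call used in the question, and with bytes nested deeper in the structure, since default is called for every object json cannot serialize on its own. A sketch under stated assumptions: the nested out_dict layout is hypothetical, and io.StringIO stands in for the real output file.

```python
import io
import json
from base64 import b64decode, b64encode


class Base64Encoder(json.JSONEncoder):
    def default(self, o):
        # Convert bytes to a base64 str; defer everything else to the base class
        if isinstance(o, bytes):
            return b64encode(o).decode("ascii")
        return super().default(o)


# Hypothetical nested structure containing raw bytes
out_dict = {"demo": {"images": [b"\x9c\x13\xff\x00"]}}

# io.StringIO stands in for an open text file
outfile = io.StringIO()
json.dump(out_dict, outfile, cls=Base64Encoder)

# Loading gives back the base64 str, not bytes; decoding is a manual step
loaded = json.loads(outfile.getvalue())
restored = b64decode(loaded["demo"]["images"][0])
```

Note that the conversion is one-way: json has no way to know that a given string was originally bytes, so the reader has to call b64decode on the relevant fields explicitly.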
What am I missing?
The error says that binary data (a bytes object) is not JSON serializable.
from base64 import b64encode
# *binary representation* of the base64 string
assert b64encode(b"binary content") == b'YmluYXJ5IGNvbnRlbnQ='
# base64 string
assert b64encode(b"binary content").decode('utf-8') == 'YmluYXJ5IGNvbnRlbnQ='
The latter is definitely "JSON serializable", since it is the base64 string representation of the binary b"binary content".