Why is the base64-encoded output different for the same image?

Question

The function below reads an image file and converts it to a base64 image string:

import base64

def getBase64Image(filePath):
    # Read the file's bytes exactly as stored on disk (PNG/JPEG container included)
    with open(filePath, "rb") as img_file:
        my_string = base64.b64encode(img_file.read())
        my_string = my_string.decode('utf-8')
    return my_string

However, the function below takes the image as an array (loaded with OpenCV) and converts it to a base64 image string:

def convertToBase64(image):
    image.tobytes()  # return value is discarded, so this line has no effect
    my_string = base64.b64encode(image)  # encodes the array's raw pixel buffer
    my_string = my_string.decode('utf-8')
    return my_string

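The string returned by the first function differs from the string produced by the second. Why is that?

Ideally, I want the second function to generate the same base64 string as the first.

Could someone please show me how to achieve this?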

Answer 1

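You are encoding two completely different structures. The first method reads the bytes of a compressed image file format such as JPEG or PNG. Even for an uncompressed bitmap image, the file stores a lot of extra data (headers, for example) that is not present in the raw array data.

The second method takes the h x w x 3 array of pixel data, converts it to a byte string, and base64-encodes that. You can see the difference by comparing the byte string of a black-and-white data array with a saved bitmap image of the same pixels.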
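The comparison suggested above can be sketched with the standard library alone. This is only an illustration, not the asker's code: `raw_to_bmp` is a hypothetical helper that wraps raw 24-bit pixel bytes in a minimal uncompressed BMP file, and the 2 x 2 black-and-white image is made up for the example. The point is that the file carries a header and row padding that the raw array does not, so the two base64 strings cannot match.

```python
import base64
import struct

def raw_to_bmp(raw: bytes, width: int, height: int) -> bytes:
    """Hypothetical helper: wrap top-down 24-bit pixel rows in a minimal BMP file."""
    row_size = width * 3
    padding = (4 - row_size % 4) % 4  # each BMP row is padded to a multiple of 4 bytes
    rows = [raw[i * row_size:(i + 1) * row_size] + b"\x00" * padding
            for i in range(height)]
    pixel_data = b"".join(reversed(rows))  # BMP stores rows bottom-up
    # 14-byte file header: magic, file size, two reserved fields, pixel data offset
    file_header = struct.pack("<2sIHHI", b"BM", 54 + len(pixel_data), 0, 0, 54)
    # 40-byte info header: size, width, height, planes, bits per pixel,
    # compression, image size, x/y resolution, palette fields
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24, 0,
                              len(pixel_data), 2835, 2835, 0, 0)
    return file_header + info_header + pixel_data

# A 2 x 2 image: black, white on the first row and white, black on the second
raw = bytes([0, 0, 0, 255, 255, 255,
             255, 255, 255, 0, 0, 0])
bmp = raw_to_bmp(raw, 2, 2)

raw_b64 = base64.b64encode(raw).decode("utf-8")
bmp_b64 = base64.b64encode(bmp).decode("utf-8")

print(len(raw), len(bmp))   # 12 70
print(raw_b64 == bmp_b64)   # False
```

The 58 extra bytes are the 54-byte header plus 2 bytes of padding per row. Compressed formats such as PNG or JPEG differ from the raw array even more, because the pixel data itself is transformed.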

Answer 2

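Your first function takes the PNG/JPG file data "as-is" and encodes it.

Your second function takes the RAW bytes of the image's RGB or grayscale representation and encodes those. If you want to convert RAW RGB data into an image file format, you can use cv2.imencode(), which will output PNG, JPG, or whatever format you like.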

import base64
import cv2

def convertToBase64(image):
    # image.tobytes() is not needed here
    # Encode the pixel array as a PNG in memory (this is where the magic happens)
    ok, converted = cv2.imencode('.png', image)
    if not ok:
        raise ValueError("cv2.imencode failed to encode the image")
    return base64.b64encode(converted).decode('utf-8')

And yes, in case it wasn't clear: you don't have to save the encoded image anywhere. It all happens in memory.
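One caveat worth keeping in mind: re-encoding the pixels produces a valid image file, but it is not guaranteed to be byte-for-byte identical to the original file, because encoder settings and metadata can differ. The sketch below is a rough stand-in rather than OpenCV code. PNG compresses its pixel data with DEFLATE, so it uses `zlib` directly on a made-up byte pattern (`pixels` is a hypothetical placeholder for real pixel data) to show that the same data compressed with two different settings yields different bytes, and therefore different base64 strings, while still decoding to the same content.

```python
import base64
import zlib

# Hypothetical stand-in for pixel data: a short repeating byte pattern
pixels = bytes(range(16)) * 4

# Same input, two different compression levels
fast = zlib.compress(pixels, level=1)
small = zlib.compress(pixels, level=9)

fast_b64 = base64.b64encode(fast).decode("utf-8")
small_b64 = base64.b64encode(small).decode("utf-8")

# The encoded strings differ even though the underlying data is the same
print(fast_b64 == small_b64)

# Both streams decompress back to the identical original bytes
print(zlib.decompress(fast) == zlib.decompress(small) == pixels)
```

If an identical base64 string is a hard requirement, the safest route is to keep the original file bytes and encode those. If only the decoded image needs to match, a lossless format such as PNG is sufficient.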

Tags: python, base64, opencv, tobase64string