Base64 Decode: Specific String Incorrect Padding (with correct padding)

Question

I am trying to Base64 decode a string (convert it to bytes) using Python's base64.b64decode(str) method:

46oWrWpy2gTEGwNnN6Ayy

I made sure it was padded with '=' to a multiple of 4 characters and, out of frustration, also tried other amounts of padding:

46oWrWpy2gTEGwNnN6Ayy=

46oWrWpy2gTEGwNnN6Ayy==

46oWrWpy2gTEGwNnN6Ayy===

46oWrWpy2gTEGwNnN6Ayy==================================================

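But I still get "Incorrect padding" on Python v3.6.1. Other strings decode fine.

I showed it to a colleague, who tried it in Python 2 and observed the same behavior.

I noticed that removing the first '4' is enough to make the Base64 decode work.

I have skimmed Python's docs (noting casefold doesn't apply for Base64) and haven't yet ventured further into RFC3548, but I wondered whether anyone has run into something similar before. Does anyone have a clue :)? Surely this isn't a bug in Python's Base64 decoder?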
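
For reference, the failure can be reproduced with a short script. This is only a minimal sketch: the helper name `try_decode` is made up for illustration, and the exact error text differs between Python versions (3.6.1 reports "Incorrect padding").

```python
import base64
import binascii

def try_decode(text):
    """Return the decoded bytes, or None if b64decode rejects the input."""
    try:
        return base64.b64decode(text)
    except binascii.Error as exc:
        print(repr(text), '->', exc)
        return None

base = '46oWrWpy2gTEGwNnN6Ayy'

# No amount of '=' padding makes the 21-character string decodable.
results = [try_decode(base + '=' * n) for n in range(4)]

# Dropping the leading '4' leaves 20 characters, which decode cleanly.
shorter = try_decode(base[1:])
```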

Answer 1

The problem seems to be with your data, not with Python:

$ echo 46oWrWpy2gTEGwNnN6Ayy | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy= | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy== | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy=== | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy==== | base64 -d
ãªjrÚÄg7 2base64: invalid input

I managed to decode it this way (with the last 'y' removed):

$ echo 46oWrWpy2gTEGwNnN6Ay | base64 -d
ãªjrÚÄg7 2
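
The same diagnosis can be made in Python by counting characters. The sketch below assumes the standard Base64 alphabet; `diagnose_length` is a hypothetical helper written for this example, not part of the `base64` module.

```python
import base64

def diagnose_length(text):
    """Describe whether the un-padded length can be valid Base64."""
    data = text.rstrip('=')
    remainder = len(data) % 4
    if remainder == 1:
        # One leftover character carries only 6 bits, less than one byte.
        return 'invalid: one dangling character'
    return 'ok'

original = '46oWrWpy2gTEGwNnN6Ayy'
trimmed = original[:-1]  # drop the final 'y', as in the shell session above

print(len(original), diagnose_length(original))
print(len(trimmed), diagnose_length(trimmed))

decoded = base64.b64decode(trimmed)
print(len(decoded), 'bytes decoded')
```

A remainder of 2 or 3 can always be fixed with padding; a remainder of 1 cannot, which is why adding '=' never helped here.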

Answer 2

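Solved.

Each character of Base64 text carries 6 bits of the raw 8-bit bytes. If the text ends on a character that falls in the middle of a raw byte, the remaining bits of that byte are missing. The Wikipedia article (and many online answers) seem to treat padding as interchangeable with a zero byte, but that is not the case: in the Base64 alphabet, zero bits are encoded as 'A'.

Padding is not interchangeable with missing data.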
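
The difference between '=' and 'A' can be checked directly. The following is a small sketch using only the standard `base64` module; the variable names are illustrative.

```python
import base64

# A single zero byte encodes to 'AA==': 'A' is the symbol for six zero bits,
# while '=' only marks that the final group is short.
assert base64.b64encode(b'\x00') == b'AA=='

# Swapping padding for 'A' changes the decoded length,
# so the two are not equivalent.
one_byte = base64.b64decode('AA==')
two_bytes = base64.b64decode('AAA=')
three_bytes = base64.b64decode('AAAA')
print(len(one_byte), len(two_bytes), len(three_bytes))

# n characters carry 6 * n bits; only whole bytes can be recovered.
for n in range(1, 5):
    print(n, 'chars ->', (6 * n) // 8, 'whole bytes,', (6 * n) % 8, 'bits left over')
```

With a single trailing character there are 6 bits and no whole byte, so no amount of '=' can complete the group.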

#!/usr/bin/env python3

# hexlify is used for debugging output.
import binascii

# The Base64 library.
import base64

# Base64 works on groups of 4 characters.
# Sometimes we get 3, 2 or 1 trailing characters, and the last one might be
# midway through a byte.
def relaxed_decode_base64(data):

    # If there is already padding, strip it; we calculate the padding ourselves.
    if '=' in data:
        data = data[:data.index('=')]

    # Number of characters left over after the last full group of 4.
    leftover = len(data) % 4

    # A single leftover character stops midway through a byte, so complete it
    # with 'A' (six zero bits) before padding.
    if leftover == 1:
        data += 'A=='
    # Otherwise just add the correct length of padding.
    elif leftover == 2:
        data += '=='
    elif leftover == 3:
        data += '='

    # Actually perform the Base64 decode.
    return base64.b64decode(data)

# Debugging
print(str(relaxed_decode_base64('46oWrWpy2gTEGwNnN6Ayy')) + '\n')

testString = ''

for count in range(0, 1024):
    testString += '/'
    print(str(len(testString)) + ' - ' + testString)
    print(binascii.hexlify(relaxed_decode_base64(testString)))
    # Wait for Enter before trying the next length.
    input()
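
Appending 'A' is one way to handle a dangling character; discarding it is another, and the two give different results. The sketch below contrasts them on the string from the question. Both helper names are hypothetical, and which behavior is appropriate depends on whether the input was truncated or corrupted, which the decoder cannot know.

```python
import base64

def decode_appending_a(text):
    """Complete a dangling character with 'A' (six zero bits), then pad."""
    data = text.split('=', 1)[0]
    if len(data) % 4 == 1:
        data += 'A'
    data += '=' * (-len(data) % 4)
    return base64.b64decode(data)

def decode_dropping_dangling(text):
    """Discard a dangling character instead of inventing bits for it."""
    data = text.split('=', 1)[0]
    if len(data) % 4 == 1:
        data = data[:-1]
    data += '=' * (-len(data) % 4)
    return base64.b64decode(data)

sample = '46oWrWpy2gTEGwNnN6Ayy'
appended = decode_appending_a(sample)
dropped = decode_dropping_dangling(sample)

# The extra byte is built from the 6 bits of 'y' plus 2 zero bits from 'A'.
print(len(appended), len(dropped))
print(appended[len(dropped):])
```

The two results agree on every byte that was fully present in the input; only the final, partly reconstructed byte differs.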

python

base64

python-3.6