What makes repeated base64 and base32 encoding so slow, and how can it be made faster?

I'm experimenting with different encodings, and when I try to repeatedly encode some text in base64/base32 (which of the two is used for each layer depends on a semi-random list of booleans), I noticed it was ridiculously slow. I don't understand why, since I thought both encodings were particularly fast. I'd be grateful for any help figuring out what makes it so slow.

Here is the relevant part of the code:

from base64 import b64encode, b32encode
from random import random as rn

big_number = int(input("The number of encoding layers : "))
# Pick base64 or base32 for each layer with equal probability
bool_list = [rn() < 0.5 for _ in range(big_number)]
sample_text = bytes("lorem ipsum", "utf8")
for curr_bool in bool_list:
    sample_text = b64encode(sample_text) if curr_bool else b32encode(sample_text)

This is a memory- and time-expensive operation. The answer rests on one key fact:

If you encode bytes with base64, the result is longer than the input. If you take this result and encode it again in a loop, you have exponential growth.
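To see where the growth comes from: base64 emits 4 output characters for every 3 input bytes (ratio 4/3 ≈ 1.33), and base32 emits 8 for every 5 (ratio 8/5 = 1.6), with the last partial block rounded up and padded with '='. A minimal check of those block formulas:

from base64 import b64encode, b32encode

data = bytes("lorem ipsum", "utf8")  # 11 bytes
# ceil(11 / 3) * 4 = 16 base64 characters (including '=' padding)
print(len(b64encode(data)), 4 * -(-len(data) // 3))  # 16 16
# ceil(11 / 5) * 8 = 24 base32 characters (including '=' padding)
print(len(b32encode(data)), 8 * -(-len(data) // 5))  # 24 24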

The following modified script shows the growth ratio of each base64/base32 encoding step:

from base64 import b64encode, b32encode
from random import random as rn

big_number = int(input("The number of encoding layers : "))
bool_list = [rn() < 0.5 for _ in range(big_number)]
sample_text = bytes("lorem ipsum", "utf8")
sample_len = len(sample_text)
current_len = sample_len
for ii, curr_bool in enumerate(bool_list):
    sample_text = b64encode(sample_text) if curr_bool else b32encode(sample_text)
    # layer index, encoding used (True = base64), new length, growth ratio
    print(ii, curr_bool, len(sample_text), len(sample_text) / current_len)
    current_len = len(sample_text)

**Example output** (truncated) from running `python .\SO009943.py`:

The number of encoding layers : 30
…
24 True 172320 1.3333333333333333
25 False 275712 1.6
26 True 367616 1.3333333333333333
27 False 588192 1.600017409470752
28 True 784256 1.3333333333333333
29 False 1254816 1.6000081606006202
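The runtime mirrors this growth: every layer re-reads and re-writes the whole current string, so the total number of bytes processed is a geometric series dominated by the last few layers. A rough worst-case estimate (all 30 layers base32, starting from the 11-byte sample above):

sample_len = 11
total = sum(round(sample_len * 1.6 ** i) for i in range(1, 31))
print(total)  # roughly 39 million bytes written for just 30 layers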

So, with a growth factor of 4/3 per base64 layer and 1.6 per base32 layer, the resulting length grows like compound interest and lies roughly between these two bounds:
big_number = 30
round(sample_len * (4/3) ** big_number)
# 61596
round(sample_len * 1.6 ** big_number)
# 14621508
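Since bool_list flips a fair coin, a typical run uses each encoding for about half of the layers, so the length lands near the geometric mean of the two bounds, i.e. growth of about sqrt(4/3 * 1.6) ≈ 1.46 per layer. A back-of-the-envelope check against the run above:

from math import sqrt

sample_len = 11
big_number = 30
print(round(sample_len * sqrt((4/3) * 1.6) ** big_number))
# ~949000, the same order of magnitude as the observed 1254816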

And for larger numbers of layers:

big_number = 50
round(sample_len * (4/3) ** big_number)
# 19423591
round(sample_len * 1.6 ** big_number)
# 176763184868

big_number = 99
round(sample_len * (4/3) ** big_number)
# 25723354884215
round(sample_len * 1.6 ** big_number)
# 1775296791184759324672
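The blow-up also means the final length is fully predictable from bool_list alone, using the per-block formulas (4 characters per 3 bytes for base64, 8 per 5 for base32, rounded up for padding), so you can estimate the cost before running a single encode. A minimal sketch, with encoded_len as a hypothetical helper:

def encoded_len(n, use_b64):
    # Length after one layer: ceil(n/3)*4 for base64, ceil(n/5)*8 for base32
    return 4 * -(-n // 3) if use_b64 else 8 * -(-n // 5)

n = len(b"lorem ipsum")
for curr_bool in bool_list:
    n = encoded_len(n, curr_bool)
print(n)  # matches len(sample_text) produced by the encoding loop above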