What makes repeated base64 and base32 encoding so slow, and how can it be made faster?
I'm experimenting with different encodings, and when I tried repeatedly encoding some text in base64/base32 (which one is used depends on a semi-random list of booleans), I noticed it was absurdly slow, which I don't understand, since I thought these encodings were particularly fast. I really can't see why it is so slow; any help would be appreciated.
Here is the relevant part of the code:
from base64 import b64encode, b32encode
from random import random as rn

big_number = int(input("The number of encoding layers : "))
bool_list = [True if rn() < 0.5 else False for _ in range(big_number)]
sample_text = bytes("lorem ipsum", "utf8")
for curr_bool in bool_list:
    temp = b64encode(sample_text) if curr_bool else b32encode(sample_text)
    sample_text = temp
This is a memory- and time-expensive operation. The answer rests on one key fact:
If you encode bytes with base64, the result is longer than the input. If you take this result and encode it again in a loop, you get exponential growth.
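The per-layer growth can be seen on a single encoding step; here is a minimal check (on short inputs the exact ratios are slightly larger than 4/3 and 1.6 because of padding):

```python
from base64 import b64encode, b32encode

data = b"lorem ipsum"        # 11 bytes
print(len(b64encode(data)))  # 16: base64 emits 4 bytes per 3-byte group, rounded up
print(len(b32encode(data)))  # 24: base32 emits 8 bytes per 5-byte group, rounded up
```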
The following modified script shows the growth ratio of each base64 and base32 encoding layer:
from base64 import b64encode, b32encode
from random import random as rn

ii = 0
big_number = int(input("The number of encoding layers : "))
bool_list = [True if rn() < 0.5 else False for _ in range(big_number)]
sample_text = bytes("lorem ipsum", "utf8")
sample_len = len(sample_text)
current_len = sample_len
for curr_bool in bool_list:
    sample_text = b64encode(sample_text) if curr_bool else b32encode(sample_text)
    print(ii, curr_bool, len(sample_text), len(sample_text) / current_len)
    current_len = len(sample_text)
    ii += 1
**Sample output** (truncated), from running `python .\SO009943.py`:
The number of encoding layers : 30
…
24 True 172320 1.3333333333333333
25 False 275712 1.6
26 True 367616 1.3333333333333333
27 False 588192 1.600017409470752
28 True 784256 1.3333333333333333
29 False 1254816 1.6000081606006202
So, with a ratio of 4/3 per base64 layer and 1.6 per base32 layer, the result length compounds like interest, and after big_number layers it lies roughly between these two bounds:
big_number = 30
round(sample_len * (4/3) ** big_number)
# 61596
round(sample_len * 1.6 ** big_number)
# 14621508
For a larger number of layers:
big_number = 50
round(sample_len * (4/3) ** big_number)
# 19423591
round(sample_len * 1.6 ** big_number)
# 176763184868
And:
big_number = 99
round(sample_len * (4/3) ** big_number)
# 25723354884215
round(sample_len * 1.6 ** big_number)
# 1775296791184759324672
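The 4/3 and 1.6 ratios are asymptotic: because of padding, each layer's exact output length is 4*ceil(n/3) for base64 and 8*ceil(n/5) for base32. A minimal sketch (using a hypothetical fixed bool_list instead of a random one) that predicts the exact final length without doing any encoding:

```python
from base64 import b64encode, b32encode

sample_text = b"lorem ipsum"
bool_list = [True, False, True, False, False]  # hypothetical fixed layer choices

# Apply the layers as in the original loop.
data = sample_text
for curr_bool in bool_list:
    data = b64encode(data) if curr_bool else b32encode(data)

# Predict the length arithmetically:
# base64 emits 4 bytes per 3-byte input group (rounded up),
# base32 emits 8 bytes per 5-byte input group (rounded up).
predicted = len(sample_text)
for curr_bool in bool_list:
    if curr_bool:
        predicted = 4 * ((predicted + 2) // 3)
    else:
        predicted = 8 * ((predicted + 4) // 5)

print(predicted, len(data))  # both 120
assert predicted == len(data)
```

This makes it easy to estimate up front whether a given bool_list will produce an output too large to be practical.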