Python中的替换密码优化
Substitution cipher code optimization in Python
这是我的代码,用于测试我的三个不同函数,这些函数对 2000 个长度不超过 500 的随机字符串和 2000 个随机密钥执行简单的替换加密。
输出显示最好的函数是 encrypt3
,然后是 encrypt1
,最慢的是 encrypt2
。
还有哪些执行替换的方法比 encrypt3
更快?
将大写字母“A”替换为“Z”,不允许使用其他字符,也不需要测试输入字符串是否仅包含这些字符。
代码末尾测试所有函数是否产生相同的输出。
from random import randrange, seed, sample
from time import perf_counter
alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encrypt1(t,key):
v=dict(zip(alphabet,key))
return ''.join(v.get(n) for n in t)
def encrypt2(t,key):
return ''.join(key[alphabet.index(n)] for n in t)
def encrypt3(t,key):
return t.translate(str.maketrans(alphabet,key))
d=2000 # number of strings and keys to test
length=500 # biggest length of strings
strings=[''.join(chr(randrange(26)+65) for n in range(1,randrange(1,length))) for n in range(d)]
keys=[''.join(chr(n+65) for n in sample(range(26), 26)) for n in range(d)]
a=perf_counter()
en1=[encrypt1(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt1 time:',b-a)
a=perf_counter()
en2=[encrypt2(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt2 time:',b-a)
a=perf_counter()
en3=[encrypt3(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt3 time:',b-a)
print("All encryptions outputs are same:",en1==en2==en3)
输出:
# encrypt1 time: 0.09787979999999999
# encrypt2 time: 0.16948359999999996
# encrypt3 time: 0.029016399999999998
# All encryptions outputs are same: True
简单地比较一个没有任何操作的join和translante/maketrans
所花费的时间,你很快就会发现不可能有一个使用join并且比translante/maketrans
实现更快的解决方案(具体实现见文末代码)
Encryption join only took: 0.006335399999999991s
Encryption translation function took: 0.004516500000000034s
并且知道字符串在 Python 中是不可变的,并且连接是最快(如果不是最快的话)纯 python 连接字符的方法之一,似乎很难找到更好的方法python 实施。
但是,正如 frank-yellin 所提到的,可以完成 C 实现以使代码 运行 更快。 C 代码 运行s 比 python 快于这种情况下的低级操作(替换字符串中的字符)。
要尝试编写 C 版本,您可以使用 cython,这比您自己编写扩展的所有样板文件要容易得多。
示例:
您将需要安装 cython:pip install cython
并通过在包含以下 3 个文件
的文件夹中运行 python setup.py build_ext --inplace
来编译 cython 代码
# file cencrypt.pyx
# distutils: language = c++
from libcpp.string cimport string
cdef char char_A = 'A'
def encrypt(t, key):
cdef string key_str = key.encode('UTF-8')
cdef string result = t.encode('UTF-8')
for i in range(len(result)):
result[i] = key_str[result[i]-char_A]
return result.decode('UTF-8')
# file main.py
from random import randrange, seed, sample
from time import perf_counter
from cencrypt import encrypt as encrypt_c
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encrypt_join_only(t, key):
return ''.join(t)
def encrypt_dict(t, key):
v = dict(zip(alphabet, key))
return ''.join(v.get(n) for n in t)
def encrypt_array(t, key):
ord_a = ord("A")
return ''.join(key[n - ord_a] for n in map(ord, t))
def encrypt_translation(t, key):
return t.translate(str.maketrans(alphabet, key))
d = 2000 # number of strings and keys to test
length = 500 # biggest length of strings
strings = [''.join(chr(randrange(26) + 65) for n in range(1, randrange(1, length))) for n in range(d)]
keys = [''.join(chr(n + 65) for n in sample(range(26), 26)) for n in range(d)]
def measure_perf(function, name):
start = perf_counter()
result = [function(strings[n], keys[n]) for n in range(d)]
end = perf_counter()
print(f'Encryption {name} took: {end - start}s', )
return result
measure_perf(encrypt_join_only, "join only")
equal = (
measure_perf(encrypt_dict, "dict lookup") ==
measure_perf(encrypt_array, "array lookup") ==
measure_perf(encrypt_translation, "translation function") ==
measure_perf(encrypt_c, "cython implementation")
)
print("All encryptions outputs are same:", equal)
#file setup.py
from setuptools import setup
from Cython.Build import cythonize
setup(
ext_modules=cythonize("cencrypt.pyx")
)
结果 I7:
Encryption join only took: 0.006335399999999991s
Encryption dict lookup took: 0.044010700000000014s
Encryption array lookup took: 0.0479598s
Encryption translation function took: 0.004516500000000034s
Encryption cython implementation took: 0.002248700000000048s
All encryptions outputs are same: True
这是我的代码,用于测试我的三个不同函数,这些函数对 2000 个长度不超过 500 的随机字符串和 2000 个随机密钥执行简单的替换加密。
输出显示最好的函数是 encrypt3
,然后是 encrypt1
,最慢的是 encrypt2
。
还有哪些执行替换的方法比 encrypt3
更快?
将大写字母“A”替换为“Z”,不允许使用其他字符,也不需要测试输入字符串是否仅包含这些字符。
代码末尾测试所有函数是否产生相同的输出。
from random import randrange, seed, sample
from time import perf_counter
alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encrypt1(t,key):
v=dict(zip(alphabet,key))
return ''.join(v.get(n) for n in t)
def encrypt2(t,key):
return ''.join(key[alphabet.index(n)] for n in t)
def encrypt3(t,key):
return t.translate(str.maketrans(alphabet,key))
d=2000 # number of strings and keys to test
length=500 # biggest length of strings
strings=[''.join(chr(randrange(26)+65) for n in range(1,randrange(1,length))) for n in range(d)]
keys=[''.join(chr(n+65) for n in sample(range(26), 26)) for n in range(d)]
a=perf_counter()
en1=[encrypt1(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt1 time:',b-a)
a=perf_counter()
en2=[encrypt2(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt2 time:',b-a)
a=perf_counter()
en3=[encrypt3(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt3 time:',b-a)
print("All encryptions outputs are same:",en1==en2==en3)
输出:
# encrypt1 time: 0.09787979999999999
# encrypt2 time: 0.16948359999999996
# encrypt3 time: 0.029016399999999998
# All encryptions outputs are same: True
简单地比较一个没有任何操作的join和translante/maketrans
所花费的时间,你很快就会发现不可能有一个使用join并且比translante/maketrans
实现更快的解决方案(具体实现见文末代码)
Encryption join only took: 0.006335399999999991s
Encryption translation function took: 0.004516500000000034s
并且知道字符串在 Python 中是不可变的,并且连接是最快(如果不是最快的话)纯 python 连接字符的方法之一,似乎很难找到更好的方法python 实施。
但是,正如 frank-yellin 所提到的,可以完成 C 实现以使代码 运行 更快。 C 代码 运行s 比 python 快于这种情况下的低级操作(替换字符串中的字符)。
要尝试编写 C 版本,您可以使用 cython,这比您自己编写扩展的所有样板文件要容易得多。
示例:
您将需要安装 cython:pip install cython
并通过在包含以下 3 个文件
python setup.py build_ext --inplace
来编译 cython 代码
# file cencrypt.pyx
# distutils: language = c++
from libcpp.string cimport string
cdef char char_A = 'A'
def encrypt(t, key):
cdef string key_str = key.encode('UTF-8')
cdef string result = t.encode('UTF-8')
for i in range(len(result)):
result[i] = key_str[result[i]-char_A]
return result.decode('UTF-8')
# file main.py
from random import randrange, seed, sample
from time import perf_counter
from cencrypt import encrypt as encrypt_c
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encrypt_join_only(t, key):
return ''.join(t)
def encrypt_dict(t, key):
v = dict(zip(alphabet, key))
return ''.join(v.get(n) for n in t)
def encrypt_array(t, key):
ord_a = ord("A")
return ''.join(key[n - ord_a] for n in map(ord, t))
def encrypt_translation(t, key):
return t.translate(str.maketrans(alphabet, key))
d = 2000 # number of strings and keys to test
length = 500 # biggest length of strings
strings = [''.join(chr(randrange(26) + 65) for n in range(1, randrange(1, length))) for n in range(d)]
keys = [''.join(chr(n + 65) for n in sample(range(26), 26)) for n in range(d)]
def measure_perf(function, name):
start = perf_counter()
result = [function(strings[n], keys[n]) for n in range(d)]
end = perf_counter()
print(f'Encryption {name} took: {end - start}s', )
return result
measure_perf(encrypt_join_only, "join only")
equal = (
measure_perf(encrypt_dict, "dict lookup") ==
measure_perf(encrypt_array, "array lookup") ==
measure_perf(encrypt_translation, "translation function") ==
measure_perf(encrypt_c, "cython implementation")
)
print("All encryptions outputs are same:", equal)
#file setup.py
from setuptools import setup
from Cython.Build import cythonize
setup(
ext_modules=cythonize("cencrypt.pyx")
)
结果 I7:
Encryption join only took: 0.006335399999999991s
Encryption dict lookup took: 0.044010700000000014s
Encryption array lookup took: 0.0479598s
Encryption translation function took: 0.004516500000000034s
Encryption cython implementation took: 0.002248700000000048s
All encryptions outputs are same: True