Python中的替换密码优化

Substitution cipher code optimization in Python

这是我的代码,用于测试我的三个不同函数,这些函数对 2000 个长度不超过 500 的随机字符串和 2000 个随机密钥执行简单的替换加密。

输出显示最好的函数是 encrypt3,然后是 encrypt1,最慢的是 encrypt2

还有哪些执行替换的方法比 encrypt3 更快?

将大写字母“A”替换为“Z”,不允许使用其他字符,也不需要测试输入字符串是否仅包含这些字符。

代码末尾测试所有函数是否产生相同的输出。

from random import randrange, seed, sample
from time import perf_counter

alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encrypt1(t,key):
    v=dict(zip(alphabet,key))
    return ''.join(v.get(n) for n in t)

def encrypt2(t,key):
    return ''.join(key[alphabet.index(n)] for n in t)

def encrypt3(t,key):
    return t.translate(str.maketrans(alphabet,key))
    
d=2000 # number of strings and keys to test
length=500 # biggest length of strings

strings=[''.join(chr(randrange(26)+65) for n in range(1,randrange(1,length))) for n in range(d)]

keys=[''.join(chr(n+65) for n in sample(range(26), 26)) for n in range(d)]

a=perf_counter()
en1=[encrypt1(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt1 time:',b-a)

a=perf_counter()
en2=[encrypt2(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt2 time:',b-a)

a=perf_counter()
en3=[encrypt3(strings[n],keys[n]) for n in range(d)]
b=perf_counter()
print('encrypt3 time:',b-a)

print("All encryptions outputs are same:",en1==en2==en3)

输出:

# encrypt1 time: 0.09787979999999999
# encrypt2 time: 0.16948359999999996
# encrypt3 time: 0.029016399999999998
# All encryptions outputs are same: True

简单地比较一个没有任何操作的join和translante/maketrans所花费的时间,你很快就会发现不可能有一个使用join并且比translante/maketrans实现更快的解决方案(具体实现见文末代码)

Encryption join only took: 0.006335399999999991s
Encryption translation function took: 0.004516500000000034s

并且知道字符串在 Python 中是不可变的,并且连接是最快(如果不是最快的话)纯 python 连接字符的方法之一,似乎很难找到更好的方法python 实施。

但是,正如 frank-yellin 所提到的,可以完成 C 实现以使代码 运行 更快。 C 代码 运行s 比 python 快于这种情况下的低级操作(替换字符串中的字符)。

要尝试编写 C 版本,您可以使用 cython,这比您自己编写扩展的所有样板文件要容易得多。

示例: 您将需要安装 cython:pip install cython 并通过在包含以下 3 个文件

的文件夹中运行 python setup.py build_ext --inplace 来编译 cython 代码
# file cencrypt.pyx

# distutils: language = c++

from libcpp.string cimport string

cdef char char_A = 'A'

def encrypt(t, key):
    cdef string key_str = key.encode('UTF-8')
    cdef string result = t.encode('UTF-8')

    for i in range(len(result)):
        result[i] = key_str[result[i]-char_A]

    return result.decode('UTF-8')
# file main.py

from random import randrange, seed, sample
from time import perf_counter

from cencrypt import encrypt as encrypt_c

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encrypt_join_only(t, key):
    return ''.join(t)


def encrypt_dict(t, key):
    v = dict(zip(alphabet, key))
    return ''.join(v.get(n) for n in t)


def encrypt_array(t, key):
    ord_a = ord("A")
    return ''.join(key[n - ord_a] for n in map(ord, t))


def encrypt_translation(t, key):
    return t.translate(str.maketrans(alphabet, key))


d = 2000  # number of strings and keys to test
length = 500  # biggest length of strings

strings = [''.join(chr(randrange(26) + 65) for n in range(1, randrange(1, length))) for n in range(d)]
keys = [''.join(chr(n + 65) for n in sample(range(26), 26)) for n in range(d)]


def measure_perf(function, name):
    start = perf_counter()
    result = [function(strings[n], keys[n]) for n in range(d)]
    end = perf_counter()
    print(f'Encryption {name} took: {end - start}s', )
    return result


measure_perf(encrypt_join_only, "join only")
equal = (
    measure_perf(encrypt_dict, "dict lookup") ==
    measure_perf(encrypt_array, "array lookup") ==
    measure_perf(encrypt_translation, "translation function") ==
    measure_perf(encrypt_c, "cython implementation")
)

print("All encryptions outputs are same:", equal)
#file setup.py

from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("cencrypt.pyx")
)

结果 I7:

Encryption join only took: 0.006335399999999991s
Encryption dict lookup took: 0.044010700000000014s
Encryption array lookup took: 0.0479598s
Encryption translation function took: 0.004516500000000034s
Encryption cython implementation took: 0.002248700000000048s
All encryptions outputs are same: True