为 Ising-/Potts-Model Monte-Carlo 加速 Python/Numpy 代码

Question

对于一个小的副项目，我在 Python/Numpy 中编写了一个二维 Ising-/Potts-Model 蒙特卡洛模拟，下面是（简化的）代码。

基本上代码执行以下操作：

在 [1,orientations]

orientations

对于每个时间步长 (MCS)，数组的每个像素都以伪随机顺序访问一次（因此最大素数在 () 和 index = (a*i + rand) % (N**2) x = index % N y = index // N )
检查 8 个相邻的数组条目（周期性边界条件）并将条目更改为相邻值之一
如果新配置的能量变低，则接受更改，否则拒绝除非满足条件

我尽量加快速度，但对于大型数组 (N,M > 500)，代码并不是很快。由于我需要大约 10e5 MCS 的阵列才能看到明显的趋势，因此实现了

1 loop, best of 3: 276 ms per loop

100x100 阵列上的 1 个 MCS 是不够的。不幸的是，由于我缺乏经验，我不知道如何提高性能。

我认为 Neighbors() 和 calc_dE() 函数是瓶颈，尤其是嵌套循环，但我找不到加快速度的方法。我的 cython 尝试并没有真正成功，因为我之前没有用 cython 做过任何事情，所以我愿意接受任何建议。

代码：

（pyplot命令只是为了可视化，通常有注释。）

import math
import numpy as np
import matplotlib.pyplot as plt

def largest_primes_under(N):
    n = N - 1
    while n >= 2:
        if all(n % d for d in range(2, int(n ** 0.5 + 1))):
            return n
        n -= 1

def Neighbors(Lattice,i,j,n=1):
    ''' Returns an flat array of all neighboring sites in the n-th coordination sphere including the center'''
    N, M = Lattice.shape
    rows = [(i-1) % N, i, (i+1) % N]
    cols = [(j-1) % N, j, (j+1) % M]
    return Lattice[rows][:, cols].flatten()

def calc_dE(Lattice, x, y, z):
    N, M = Lattice.shape
    old_energy = 0
    new_energy = 0
    for i in [0,1,-1]:
        for j in [0,1,-1]:
            if i == 0 and j == 0: 
                continue
            if Lattice[x%N,y%M] == Lattice[(x+i)%N,(y+j)%M]:
                old_energy += 1
            elif z == Lattice[(x+i)%N,(y+j)%M]: 
                new_energy += 1 
    return old_energy-new_energy

N, M = 100,100
orientations = N*M
MCS = int(100)

a = largest_primes_under(N*M)
L = np.random.randint(1,orientations+1,size=(N,M))

mat = plt.matshow(L,cmap = plt.get_cmap('plasma', orientations+1), vmin = -0.5, vmax = orientations+0.5, interpolation='kaiser')
plt.axis('off')

for t in range(1,MCS+1):
    rand = np.random.random_integers(N*M)
    for i in range(0,N**2):
        index = (a*i + rand) % (N**2)
        x = index % N
        y = index // N
        n = Neighbors(L,x,y)
        if len(n)-1 == 0: 
            continue
        else: 
            z = np.random.choice(n)
        dE = calc_dE(L,x,y,z)
        if  (dE < 0): 
            L[x%N,y%N] = z      
        elif np.random.sample() < math.exp(-dE*2.5): 
            L[x%N,y%N] = z

    mat.set_data(L)
    plt.draw()
    plt.pause(0.0001)

Answer 1

不确定您在依赖项方面是否有任何限制，但我肯定会调查 Numba。它提供了一组装饰器（特别是 njit），可以将您的代码编译为机器代码，并使其可能更快、更快，前提是您使用的是兼容类型（例如 numpy 数组，但不是 pandas数据帧）。

此外，不确定您正在查看的比例是多少，但我敢肯定，您可以找到比手动实现的 for 循环优化得多的素数搜索算法示例。

否则你总是可以回退到 Cython，但它需要重新编写你的代码。

编辑：好的，我用 numba 试了一下。

一些注意事项：

将整个for循环移动到一个函数中，这样你就可以用njit
在 Neighbors 中，我不得不将 rows 和 cols 从 lists 更改为 np.arrays 因为 numba 不接受通过列表进行索引
我将 np.random.random_integers 替换为 np.random.randint，因为前者已被弃用
我将 math.exp 替换为 np.exp，这应该会带来轻微的性能提升（除了为您节省导入）

import numpy as np
from numba import njit

def largest_primes_under(N):
    n = N - 1
    while n >= 2:
        if all(n % d for d in range(2, int(n ** 0.5 + 1))):
            return n
        n -= 1

@njit
def Neighbors(Lattice,i,j,n=1):
    ''' Returns an flat array of all neighboring sites in the n-th coordination sphere including the center'''
    N, M = Lattice.shape
    rows = np.array([(i-1) % N, i, (i+1) % N])
    cols = np.array([(j-1) % N, j, (j+1) % M])
    return Lattice[rows,:][:,cols].flatten()

@njit
def calc_dE(Lattice, x, y, z):
    N, M = Lattice.shape
    old_energy = 0
    new_energy = 0
    for i in [0,1,-1]:
        for j in [0,1,-1]:
            if i == 0 and j == 0: 
                continue
            if Lattice[x%N,y%M] == Lattice[(x+i)%N,(y+j)%M]:
                old_energy += 1
            elif z == Lattice[(x+i)%N,(y+j)%M]: 
                new_energy += 1 
    return old_energy-new_energy

@njit
def fun(L, MCS, a):
    N, M = L.shape

    for t in range(1,MCS+1):
        rand = np.random.randint(N*M)
        for i in range(0,N**2):
            index = (a*i + rand) % (N**2)
            x = index % N
            y = index // N
            n = Neighbors(L,x,y)
            if len(n)-1 == 0: continue
            else: z = np.random.choice(n)
            dE = calc_dE(L,x,y,z)
            if  (dE < 0): L[x%N,y%N] = z      
            elif np.random.sample() < np.exp(-dE*2.5): L[x%N,y%N] = z    
    return L

运行同例

N, M = 100,100
orientations = N*M
MCS = 1

L = np.random.randint(1,orientations+1,size=(N,M))
a = largest_primes_under(N*M)

到 %timeit fun(L, MCS, a)（在 Jupyter 中）给我

16.9 ms ± 853 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

这比您目前拥有的快约 15 倍。您可能可以进行更多优化，关于 numba 的好处是我获得了 x15 的速度提升，而无需深入研究或显着改变代码的实现方式。

几个一般性观察：

在Neighbors中，argument/parameter n没有被使用，所以为了清楚起见你应该删除它（或更新代码）
在Python中，您通常希望对函数名和变量使用小写字母。大写通常保留给类（不是对象）和 "global" 变量
你上面关于 largest_primes_under 只被调用一次的评论绝对是正确的，我应该仔细看看代码。

premature optimization is the root of all evil

Answer 2

感谢您提供的代码，我发现它对于支持我正在开发的 Monte Carlo 课程非常有用（以及 Giorgio 的优化）。但是，我相信您上面的原始代码中可能存在严重缺陷，这会导致在临界点以上或附近进行模拟时出现问题。我相信这条线

z = np.random.choice(n)

应该是

z = np.random.choice(orientations)+1

否则代码将无法从所有相邻自旋都具有相同方向的配置中逃脱。在查看临界点（kT/J = 2.27，这在您的代码中意味着将 beta 因子从 2.5 更改为 0.441）之上的伊辛模型（方向 = 2）的极限情况时，这一点尤为重要。希望这对您有所帮助。

为 Ising-/Potts-Model Monte-Carlo 加速 Python/Numpy 代码

Speed up Python/Numpy Code for Ising-/Potts-Model Monte-Carlo

python

arrays

numpy

montecarlo