3 个嵌套循环：优化简单模拟以提高速度

Question

背景

我遇到了一个难题。在这里：

One day, an alien comes to Earth. Every day, each alien does one of four things, each with equal probability to:

Kill himself

Do nothing

Split himself into two aliens (while killing himself)

split himself into three aliens (while killing himself)

What is the probability that the alien species eventually dies out entirely?

Link to the source and the solution, problem #10

很遗憾，我没能从理论上解决问题。然后我继续用基本的马尔可夫链和 Monte Carlo 模拟来模拟它。

面试时没有问我这个问题。我从朋友那里得知了这个问题，然后在搜索数学解决方案时找到了上面的link。

重新解释问题

我们从外星人的数量开始n = 1。 n有机会不变，减1，加1，减2，每次%25。如果 n 递增，即外星人倍增，我们再次重复此过程 n 次。这对应于每个外星人都会做一遍它的事情。不过，我必须设置一个上限，以便我们停止模拟并避免崩溃。 n 可能会增加，我们会一次又一次地循环 n 次。

如果外星人以某种方式灭绝，我们将再次停止模拟，因为没有什么可以模拟了。

在n达到零或上限后，我们还记录人口（它将是零或某个数字>= max_pop）。

我重复了很多次并记录了每一个结果。最后，零的数量除以结果总数应该给我一个近似值。

密码

from random import randint
import numpy as np

pop_max = 100
iter_max = 100000

results = np.zeros(iter_max, dtype=int)

for i in range(iter_max):
    n = 1
    while n > 0 and n < pop_max:
        for j in range(n):
            x = randint(1, 4)
            if x == 1:
                n = n - 1
            elif x == 2:
                continue
            elif x == 3:
                n = n + 1
            elif x == 4:
                n = n + 2
    results[i] = n

print( np.bincount(results)[0] / iter_max )

iter_max和pop_max确实可以改，但是我觉得如果有100个外星人，他们灭绝的概率就可以忽略不计了。不过这只是一个猜测，我还没有做任何事情来计算（更多）适当的人口上限。

这段代码给出了有希望的结果，非常接近大约 %41.4 的真实答案。

一些输出

> python aliens.py
0.41393
> python aliens.py
0.41808
> python aliens.py
0.41574
> python aliens.py
0.4149
> python aliens.py
0.41505
> python aliens.py
0.41277
> python aliens.py
0.41428
> python aliens.py
0.41407
> python aliens.py
0.41676

后果

我对结果没意见，但这段代码所用的时间我不能说相同。大约需要 16-17 秒:)

如何提高速度？如何优化循环（尤其是 while 循环）？也许有更好的方法或更好的模型？

Answer 1

您可以通过使用 numpy（更快）一次生成 n 个随机整数来向量化您的内部循环，并使用算术而不是布尔逻辑摆脱所有 if 语句。

while...: 
    #population changes by (-1, 0, +1, +2) for each alien
    n += np.random.randint(-1,3, size=n).sum()

对其他所有内容使用您的确切代码（您可能会在其他地方找到其他优化）我使用这一更改从 21.2 秒减少到 4.3 秒。

在不更改算法（即使用 monte carlo 以外的方法求解）的情况下，在您开始编译为机器代码（幸运的是如果你安装了 numba 就很容易了。

我不会提供有关 numba 执行的即时编译的完整教程，相反，我只会分享我的代码并记下我所做的更改：

from time import time
import numpy as np
from numpy.random import randint
from numba import njit, int32, prange

@njit('i4(i4)')
def simulate(pop_max): #move simulation of one population to a function for parallelization
    n = 1
    while 0 < n < pop_max:
        n += np.sum(randint(-1,3,n))
    return n

@njit('i4[:](i4,i4)', parallel=True)
def solve(pop_max, iter_max):
    #this could be easily simplified to just return the raio of populations that die off vs survive to pop_max
    # which would save you some ram (though the speed is about the same)
    results = np.zeros(iter_max, dtype=int32) #numba needs int32 here rather than python int
    for i in prange(iter_max): #prange specifies that this loop can be parallelized
        results[i] = simulate(pop_max)
    return results

pop_max = 100
iter_max = 100000

t = time()
print( np.bincount(solve(pop_max, iter_max))[0] / iter_max )
print('time elapsed: ', time()-t)

在我的系统上，并行化编译使计算速度降低到大约 0.15 秒。

Answer 2

没有 numpy 解决方案，100k 次模拟大约需要 5 秒：

from random import choices

def simulate_em():
    def spwn(aliens):
        return choices(range(-1,3), k=aliens)

    aliens = {1:1}
    i = 1
    while aliens[i] > 0 and aliens[i] < 100:    
        i += 1
        num = aliens[i-1]
        aliens[i] = num + sum(spwn(num))

    # commented for speed
    # print(f"Round {i:<5} had {aliens[i]:>20} alien alive.")
    return (i,aliens[i])

正在测试（在 pyfiddle.io 上大约 5 秒）：

from datetime import datetime

t = datetime.now()    
d = {}
wins = 0
test = 100000
for k in range(test):
    d[k] = simulate_em()
    wins += d[k][1]>=100

print(1-wins/test)         # 0.41532
print(datetime.now()-t)    # 0:00:04.840127

因此 10 万次测试运行大约需要 5 秒...

输出（2 次运行）：

Round 1     had                    1 alien alive.
Round 2     had                    3 alien alive.
Round 3     had                    6 alien alive.
Round 4     had                    9 alien alive.
Round 5     had                    7 alien alive.
Round 6     had                   13 alien alive.
Round 7     had                   23 alien alive.
Round 8     had                   20 alien alive.
Round 9     had                   37 alien alive.
Round 10    had                   54 alien alive.
Round 11    had                   77 alien alive.
Round 12    had                  118 alien alive.

Round 1     had                    1 alien alive.
Round 2     had                    0 alien alive.

通过使用 amount_of_aliens + sum 而不是 choices(range(-1,3),k=amount_of_aliens)，您可以更轻松地求和并更快地填写字典？如果您的外星人数量低于 0，它们就会灭绝。

3 个嵌套循环：优化简单模拟以提高速度

3 nested loops: Optimizing a simple simulation for speed

python

markov-chains

montecarlo

背景

重新解释问题

密码

一些输出

后果