有没有办法提高这种分形计算算法的性能？

Question

昨天我看到了关于牛顿分形的新 3Blue1Brown 视频，他对分形的现场表现让我着迷。（这里有视频 link 供任何感兴趣的人观看，位于 13:40：https://www.youtube.com/watch?v=-RdOwhmqP5s）

我想自己尝试一下，并尝试在 python 中编写代码（我认为他也使用 python）。

我花了几个小时试图改进我的天真实现，但到了我不知道如何才能让它更快的地步。

代码如下所示：

import os
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
from time import time


def print_fractal(state):
    fig = plt.figure(figsize=(8, 8))
    gs = GridSpec(1, 1)
    axs = [fig.add_subplot(gs[0, 0])]
    fig.tight_layout(pad=5)
    axs[0].matshow(state)
    axs[0].set_xticks([])
    axs[0].set_yticks([])
    plt.show()
    plt.close()


def get_function_value(z):
    return z**5 + z**2 - z + 1


def get_function_derivative_value(z):
    return 5*z**4 + 2*z - 1


def check_distance(state, roots):
    roots2 = np.zeros((roots.shape[0], state.shape[0], state.shape[1]), dtype=complex)
    for r in range(roots.shape[0]):
        roots2[r] = np.full((state.shape[0], state.shape[1]), roots[r])
    dist_2 = np.abs((roots2 - state))
    original_state = np.argmin(dist_2, axis=0) + 1
    return original_state


def static():
    time_start = time()
    s = 4
    c = [0, 0]
    n = 800
    polynomial = [1, 0, 0, 1, -1, 1]
    roots = np.roots(polynomial)
    state = np.transpose((np.linspace(c[0] - s/2, c[0] + s/2, n)[:, None] + 1j*np.linspace(c[1] - s/2, c[1] + s/2, n)))
    n_steps = 15
    time_setup = time()
    for _ in range(n_steps):
        state -= (get_function_value(state) / get_function_derivative_value(state))
    time_evolution = time()
    original_state = check_distance(state, roots)
    time_check = time()
    print_fractal(original_state)
    print("{0:<40}".format("Time to setup the initial configuration:"), "{:20.3f}".format(time_setup - time_start))
    print("{0:<40}".format("Time to evolve the state:"), "{:20.3f}".format(time_evolution - time_setup))
    print("{0:<40}".format("Time to check the closest roots:"), "{:20.3f}".format(time_check - time_evolution))

平均输出如下所示：

设置初始配置的时间：0.004

状态进化时间：0.796

检查最近根的时间：0.094

很明显，进化部分是该过程的瓶颈。它并不“慢”，但我认为它不足以像视频中那样渲染一些活生生的东西。我已经通过使用 numpy 向量和避免循环来尽我所能，但我想这还不够。这里还可以应用哪些其他技巧？

注意：我试过使用numpy.polynomials.Polynomial class来评估函数，但是比这个版本慢。

Answer 1

for _ in range(n_steps):
    state -= (get_function_value(state) / get_function_derivative_value(state))

如果你有足够的内存，你可以尝试向量化这个循环并用矩阵计算存储每个迭代步骤。

Answer 2

通过使用单一复数 (np.complex64) 精度，我得到了改进（快了 40%）。

(...)
state = np.transpose((np.linspace(c[0] - s/2, c[0] + s/2, n)[:, None] 
                      + 1j*np.linspace(c[1] - s/2, c[1] + s/2, n)))
state = state.astype(np.complex64)
(...)

3Blue1Brown 在描述中添加了这个 link：https://codepen.io/mherreshoff/pen/RwZPazd 你可以看看那里是怎么做的（旁注：这支笔的作者也使用了单精度）

Answer 3

试试 pypy3、numba 或 cython。

Pypy3 是 cpython 的快速替代品。对于纯 python 代码，这会快得多。

Numba 的 nopython 模式可以显着加快 cpython 中的数学运算速度。不过，它并没有比数学做得更多。

Cython 是一种 python 转译为 c 的方言，并允许您混合使用 cpython 和 c 类型。您使用的 c 类型越多越好（通常）。查看 yhe cython -a opti9n。

有没有办法提高这种分形计算算法的性能？

Is there a way to improve the performance of this fractal calculation algorithm?

python

optimization

performance

numpy

fractals