Python numpy 数组试图相互广播,但我需要它们在新轴上运行

Python numpy arrays are trying to broadcast into eachother, but I need them to operate on a new axis instead

我正在为 class 编写此代码,并且是 python 的新手,所以如果我做错了很多事情,我不会感到惊讶。我正在尝试计算通过孔径的光的衍射图案强度,这是由贝塞尔函数给出的。我已经使用 numpy 数组相对有效地计算了 Bessel 函数本身,但是当我尝试在网格顶部扩展 Bessel 函数以计算强度时,我 运行 遇到了一些麻烦。

这是我的代码:

import matplotlib.pyplot as plt
numDivs = 100

rmax = 1e-6
num = 400
lmda = 500e-9
k = 2 * np.pi / lmda

#This function is designed to compute the integral using
# Simpson's Rule. The value N passed in is the number of
# polynomial curve fits to be created. The curve is sliced into
# 2N sections. Designed to work with numpy.
def integrateSimpson(func, lower, upper, N):
    delta = (upper - lower) / (2 * N)
    THETA = np.linspace(lower, upper, 2 * N + 1)
    Y = func(THETA)
    Y = Y * (delta / 3)
    
    return Y[0] + Y[2*N] + 4 * np.sum(Y[1:2 * N:2]) + 2 * np.sum(Y[2:2 * N:2])

#this is the formula for the bessel function
J = lambda m, x: integrateSimpson(lambda theta: np.cos(m * theta - x * np.sin(theta)), 0, np.pi, numDivs)

#this is the formula for the intensity using the above bessel function definition
Intensity = lambda r: (J(1, k * r) / (k * r))**2

#setting up the grids for computation.
xSpace = np.linspace(-rmax, rmax, num)
ySpace = np.linspace(-rmax, rmax, num)

xx, yy = np.meshgrid(xSpace, ySpace)
grid2 = np.zeros((num, num))

#This is where my problem begins
grid2 = Intensity(np.hypot(xx, yy))

#Set up the image and show it
plt.figure(figsize=(14, 14), dpi=80)
plt.set_cmap('plasma')
plt.imshow(grid2, extent = [-rmax , rmax, -rmax, rmax])
plt.colorbar()
plt.show()

这是我得到的错误代码:

    ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-f3614ffc9470> in <module>
     21 grid2 = np.zeros((num, num))
     22 
---> 23 grid2 = Intensity(np.hypot(xx, yy))
     24 
     25 #Set up the image and show it

<ipython-input-42-f3614ffc9470> in <lambda>(r)
     11 
     12 #this is the formula for the intensity using the above bessel function definition
---> 13 Intensity = lambda r: (J(1, k * r) / (k * r))**2
     14 
     15 #setting up the grid for computation.

<ipython-input-42-f3614ffc9470> in <lambda>(m, x)
      8 
      9 #this is the formula for the bessel function
---> 10 J = lambda m, x: integrateSimpson(lambda theta: np.cos(m * theta - x * np.sin(theta)), 0, np.pi, numDivs)
     11 
     12 #this is the formula for the intensity using the above bessel function definition

<ipython-input-8-1c73b24d5dd3> in integrateSimpson(func, lower, upper, N)
     19     delta = (upper - lower) / (2 * N)
     20     THETA = np.linspace(lower, upper, 2 * N + 1)
---> 21     Y = func(THETA)
     22     Y = Y * (delta / 3)
     23 

<ipython-input-42-f3614ffc9470> in <lambda>(theta)
      8 
      9 #this is the formula for the bessel function
---> 10 J = lambda m, x: integrateSimpson(lambda theta: np.cos(m * theta - x * np.sin(theta)), 0, np.pi, numDivs)
     11 
     12 #this is the formula for the intensity using the above bessel function definition

ValueError: operands could not be broadcast together with shapes (400,400) (201,) 

看起来错误是 theta 数组和 x 数组试图相互广播,但我真的不希望它们这样做。有没有一种方法可以避免循环但仍能完成任务?我现在通过退出 numpy 数组操作并仅使用几个 for 循环解决了这个问题。

完成我正在尝试做的工作的工作代码如下:

#Type your code here
import matplotlib.pyplot as plt

numDivs = 100

rmax = 1e-6
num = 400
lmda = 500e-9
k = 2 * np.pi / lmda

#This function is designed to compute the integral using
# Simpson's Rule. The value N passed in is the number of
# polynomial curve fits to be created. The curve is sliced into
# 2N sections. Designed to work with numpy.
def integrateSimpson(func, lower, upper, N):
    delta = (upper - lower) / (2 * N)
    THETA = np.linspace(lower, upper, 2 * N + 1)
    Y = func(THETA)
    Y = Y * (delta / 3)
    
    return Y[0] + Y[2 * N] + 4 * np.sum(Y[1:2 * N:2]) + 2 * np.sum(Y[2:2 * N:2])

#this is the formula for the bessel function
J = lambda m, x: integrateSimpson(lambda theta: np.cos(m * theta - x * np.sin(theta)), 0, np.pi, numDivs)

#this is the formula for the intensity using the above bessel function definition
Intensity = lambda r: (J(1, k * r) / (k * r))**2

#setting up the grid for computation.
xSpace = np.linspace(-rmax, rmax, num)
ySpace = np.linspace(-rmax, rmax, num)
grid2 = np.zeros((num, num))

#calculate the diffraction pattern. This is an inefficient way to do it for several reasons
# First I'm not utilizing the efficiency of numpy arrays. I ran into a problem where the
# numpy arrays are trying to broadcast into eachother down the line. Not sure how to fix it so
# I reverted to loops. There's also a level of symmetry where I should only have to compute
# one line along r and then rotate it around, instead of computing the entire grid. 
# computing the grid takes a few seconds at this density.
for i in range(num):
    for j in range(num):
        grid2[i, j] = Intensity(np.hypot(xSpace[i], ySpace[j]))

#Set up the image and show it
plt.figure(figsize=(14, 14), dpi=80)
plt.set_cmap('plasma')
plt.imshow(grid2, extent = [-rmax , rmax, -rmax , rmax])
plt.colorbar()
plt.show()

为工作代码生成的图像:

广播实际上是你的朋友,你可以通过使用它来摆脱很多临时数组。例如,您可以这样设置网格:

x = np.linspace(-rmax, rmax, num)
y = np.linspace(-rmax, rmax, num)
r = np.hypot(x, y[:, None])
image = intensity(r)

为了清楚起见并避免重新计算不必要的数量,我建议不要在此代码中使用 lambda。 Lambda 有时间和地点,但为了几行代码而牺牲易读性是不值得的。最后,除了没有有意义的 __name__ 属性外,lambda 在各个方面都和普通函数一样重要:

def intensity(r):
    r_scaled = k * r
    b = bessel(1, r_scaled)
    m = (r_scaled == 0)
    r_scaled[m] = b[m]
    return (b / r_scaled)**2

你想在 r == 0 时获得 1 的强度来实现这一点,我屏蔽了除数中的零并将它们替换为被除数的值。

现在来解决您的实际问题。您有一个数组 r,或者在本例中为 r_scaled,其形状为 (400, 400)。您想要为每个元素集成一个贝塞尔函数。您正在使用 np.sinnp.cos 生成要集成的元素。但问题是求和是一种归约运算。你在下降什么维度?答案是您需要扩展数组的维度。它应该成为一个 (400, 400, 201) 数组,然后您将其汇总为 (400, 400)。这是一个使用嵌套函数的示例,它比您拥有的混乱的 lambda 更容易阅读。

def bessel(m, x):  # M = 1 (scalar), X = r (400, 400)
    def func(theta):
        return np.cos(m * theta - x[..., None] * np.sin(theta))
    return integrate_simpson(func, 0, np.pi, num_divs)

def integrate_simpson(func, lower, upper, n):
    n *= 2
    delta = (upper - lower) / n
    theta = np.linspace(lower, upper, n + 1)
    y = func(theta)
    y *= delta / 3

    return y[..., 0] + y[..., -1] + 4 * y[..., 1:-1:2].sum(axis=-1) + 2 * y[..., 2:-1:2].sum(axis=-1)

主要变化是 bessel 现在用 x[..., None] 进行数学运算(即 r_scaled[..., None])。 ...的意思是“取所有已有的维度,不管有多少”,None的意思是np.newaxis:增加一个单位维度。结果是一个 (400, 400, 1) 视图(没有数据被复制),然后你可以用你的 201 元素 theta 广播它以获得一个 (400, 400, 201) 数组。

integrate_simpson 内,您使用广播形状沿最后一个轴求和。 y[..., 0] 表示“拍摄第一个图像平面”。 ... 指的是前两个索引,0 是每个像素的第一个元素,这正是您想要的。 y[..., -1] 做同样的事情。 Numpy 和正常的 python 一样,支持负索引。如果您想要最后一个元素,只需使用 -1。它为你节省了一些计算,并且对于一个python-savvyreader更直观,以后就是你了。希望此时,您可以将这些概念外推到总和中的其余两项。 y[..., 1:-1:2].sum(axis=-1) 沿着最后一个轴求和:201 您使用广播制作的那个。在这里使用 -1... 的好处是你甚至不需要知道你有多少个轴。这将完美地工作,下周一不需要修改,当你决定你真的想要整合超过 5 维图像时。

几个小问题:

  1. 我会让 num = 401 看到更好的步骤,或者设置 x = np.linspace(-rmax, rmax, num, endpoint=False)(与 y 相同)以获得具有相同漂亮步骤的略微不对称的图像。
  2. Python 命名约定是常量大写,类 驼峰命名,snake_case 几乎所有其他内容,包括变量、函数和方法。

对不起,我实在忍不住了。假设您确实想利用问题的明显径向对称性。这意味着您只需为一组可能的半径生成贝塞尔函数,然后将这些值分配给输出图像。首先找到你的最大半径:

r_max_xy = np.hypot(rmax, rmax)

现在只为图像中的几百个点生成积分。事实上,既然你不是为了 400 * 400 点而做的,让我们挥霍一下并为 1000 点做吧:

r_radial = np.linspace(0, r_max_xy, 1000)
image_radial = intensity(r_radial)

由于其他答案中的广播方式,intensity(r_radial) 将开箱即用。现在您可以在整个图像中插入值。对于 first-order 近似值,我们只进行 nearest-neighbor 插值。

目标是计算 image_radial 中的索引以从中获取图像值。由于 r_radial 是均匀分布的,您可以使用简单的舍入算法:

r_index = (r / (r_radial.ptp() / (r_radial.size - 1)) + 0.5).astype(int)
image = image_radial[r_index]

为了 smoother/more 准确的插值,(a) 使用更多的点,and/or (b) 添加使用更好的插值器。这是后者的一个例子:

from scipy.interpolate import Akima1DInterpolator

interp = Akima1DInterpolator(r_radial, image_radial)
image = interp(r)

另一种完全不同的方法是混合插值和蛮力,是对图像的一个八分圆进行采样,它将包含整个图像的所有可能半径。这不是插值,而是将计算积分的数量减少到最低限度。您可以将它与插值结合使用,以避免调用插值器 7/8 次。

您可以取四分之一的田地,然后对其应用 np.triu_indices。最好的部分是您不需要生成图像来获取半径。您需要以稍微不同的方式处理偶数和奇数 num 以获得稳健的解决方案。

让我们从 num 偶数开始:

xy = np.linspace(-rmax, -rmax / (num - 1), num // 2)
row, col = np.triu_indices(xy.size)
r_octant = np.hypot(xy[row], xy[col])

剩下的过程很简单。诀窍是将数据恢复为完整图像:

octant = intensity(r_octant)
image = np.empty((num, num), float)
# Reflect octant about diagonal
image[row, col] = image[col, row] = octant
# Reflect quadrant about vertical
image[:xy.size, xy.size:] = image[:xy.size, xy.size - 1::-1]
# Reflect half about horizontal
image[xy.size:] = image[xy.size - 1::-1]

处理奇数尺寸略有不同,因为有一些 off-by-one 调整:

xy = np.linspace(-rmax, 0, (num + 1) // 2)
...
# Reflect quadrant about vertical
image[:xy.size, xy.size:] = image[:xy.size, xy.size - 2::-1]
# Reflect half about horizontal
image[xy.size:] = image[xy.size - 2::-1]

基准测试

考虑到我已经发布的数量,新解决方案的边际价值并不高,所以让我们来看看一些基准。不仅仅是速度,您还需要比较速度和准确性之间的权衡。

以下是三种解决方案:

def full_vectorized(rmax, num):
    x = np.linspace(-rmax, rmax, num)
    y = np.linspace(-rmax, rmax, num)
    r = np.hypot(x, y[:, None])
    return intensity(r)

def octant_vectorized(rmax, num):
    if num % 2:
        n = (num + 1) // 2
        k = n - 2
        s = 0
    else:
        n = num // 2
        k = n - 1
        s = -rmax / (num - 1)
    xy = np.linspace(-rmax, s, n)
    row, col = np.triu_indices(xy.size)
    r_octant = np.hypot(xy[row], xy[col])
    octant = intensity(r_octant)
    image = np.empty((num, num), float)
    image[row, col] = image[col, row] = octant
    image[:xy.size, xy.size:] = image[:xy.size, k::-1]
    # Reflect half about horizontal
    image[xy.size:] = image[k::-1]
    return image

def nn_interpolated(rmax, num, n=100):
    x = np.linspace(-rmax, rmax, num)
    y = np.linspace(-rmax, rmax, num)
    r = np.hypot(x, y[:, None])
    r_max_xy = r[0, 0]
    r_radial = np.linspace(0, r_max_xy, n)
    image_radial = intensity(r_radial)
    r_index = ((r - r_radial[0]) / (r_radial.ptp() / (r_radial.size - 1)) + 0.5).astype(int)
    return image_radial[r_index]

def akima_interpolated(rmax, num, n=100):
    r_max_xy = np.hypot(rmax, rmax)
    r_radial = np.linspace(0.5 * rmax / num, r_max_xy, n)
    image_radial = intensity(r_radial)
    x = np.linspace(-rmax, rmax, num)
    y = np.linspace(-rmax, rmax, num)
    r = np.hypot(x, y[:, None])
    return Akima1DInterpolator(r_radial, image_radial)(r)

首先快速检查所有方法 return 相似的结果:

>>> img_fv = full_vectorized(rmax, num)
>>> img_ov = octant_vectorized(rmax, num)
>>> np.abs(img_fv - img_ov).max()
4.218847493575595e-15   ## This is OK

如果您 plt.imshow(img_fv - img_ov),您会看到舍入伪影,因为 np.linspace 在范围的正半部分创建的值与在负半部分创建的值略有不同。忽略这些错误是完全安全的(见下图,图表(a))。

img_fv为金标准,我们可以计算其他方法的RMS并绘制结果(下图(b))。

>>> from haggis.math import rms
>>> img_ni_100 = nn_interpolated(rmax, num, n=100)
>>> img_ni_1000 = nn_interpolated(rmax, num, n=1000)
>>> img_ai_100 = akima_interpolated(rmax, num, n=100)
>>> img_ai_1000 = akima_interpolated(rmax, num, n=1000)
>>> print('NN 100', rms(img_fv - img_ni_100))
>>> print('NN 1000', rms(img_fv - img_ni_1000))
>>> print('AI 100', rms(img_fv - img_ai_100))
>>> print('AI 1000', rms(img_fv - img_ai_1000))
NN 100 0.009473741009264758
NN 1000 0.0009418991442352894
AI 100 2.9400011847429186e-05
AI 1000 2.479026973749244e-08

不出所料,更好的插值方法和更多的点会产生更好的答案。

现在进行基准测试:

%timeit full_vectorized(rmax, num)
343 ms ± 13.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit octant_vectorized(rmax, num)
40.4 ms ± 1.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit nn_interpolated(rmax, num, 100)
1.63 ms ± 14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit nn_interpolated(rmax, num, 1000)
2.38 ms ± 100 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit akima_interpolated(rmax, num, 100)
3.07 ms ± 11.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit akima_interpolated(rmax, num, 1000)
5.75 ms ± 191 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

如您所见,任何插值解都比八分圆解好约 10 倍,比完整解好约 100 倍。鉴于 1000 点 Akima 插值器给出的 RMS 误差在舍入误差的数量级上,您可以凭良心接受该解决方案。

事实上,您可以让它更快:

def octant_akima_interpolated(rmax, num, n=100):
    if num % 2:
        m = (num + 1) // 2
        k = m - 2
        s = 0
    else:
        m = num // 2
        k = m - 1
        s = -rmax / (num - 1)
    xy = np.linspace(-rmax, s, m)
    row, col = np.triu_indices(xy.size)
    r_octant = np.hypot(xy[row], xy[col])
    r_max_xy = np.hypot(rmax, rmax)
    r_radial = np.linspace(0.5 * rmax / num, r_max_xy, n)
    image_radial = intensity(r_radial)
    octant = Akima1DInterpolator(r_radial, image_radial)(r_octant)
    image = np.empty((num, num), float)
    image[row, col] = image[col, row] = octant
    image[:xy.size, xy.size:] = image[:xy.size, k::-1]
    # Reflect half about horizontal
    image[xy.size:] = image[k::-1]
    return image

>>> img_oi_100 = octant_akima_interpolated(rmax, num, n=100)
>>> img_oi_1000 = octant_akima_interpolated(rmax, num, n=1000)
>>> print('OI 100', rms(img_fv - img_oi_100))
>>> print('OI 1000', rms(img_fv - img_oi_1000))
OI 100 2.940001184744199e-05
OI 1000 2.4790269736396652e-08

%timeit octant_akima_interpolated(rmax, num, 100)
936 µs ± 22.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit octant_akima_interpolated(rmax, num, 1000)
1.87 ms ± 55.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

所以你开始了:整个阵列中最快的线性插值器的速度,几乎没有有意义的精度损失(如果 2.5e-8 困扰你,你总是可以增加到 10000 点)。正如所宣传的那样,您可以将接受答案的时间缩短约 200 倍。