How can I simplify the following code so it runs faster?
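I have a 3D array containing many 2D images (frames). I want to remove the background by applying a threshold to each pixel value, and copy the new elements into a new 3D array. I wrote the code below, but it is too expensive to run. How can I speed it up?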
ss = stack #3D array (571, 1040, 1392)
T,ni,nj = ss.shape
Background_intensity = np.ones([T,ni,nj])
Intensity = np.zeros([T,ni,nj])
DeltaF_F_max = np.zeros([T,ni,nj])
for t in range(T):
    for i in range(ni):
        for j in range(nj):
            if ss[t,i,j]<12:
                Background_intensity[t,i,j] = ss[t,i,j]
                if Background_intensity[t,i,j] == 0 :
                    Background_intensity[t,i,j] = 1
            else:
                Intensity[t,i,j] = ss[t,i,j]
            DeltaF_F_max[t,i,j]=(((Intensity[t,i,j] - Background_intensity[t,i,j])))/(Background_intensity[t,i,j])
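I had a go at this with NumPy. I am not sure what timings you were getting, but this takes around 20 seconds on my Mac. It is still a memory hog even after I cut the size of every array by a factor of 8, since you don't need an int64 to store 1, or 12, or any of the other small values that fit in a uint8.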
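To put rough numbers on the memory point, here is a small sketch that works out how much one array of the question's shape occupies for a few dtypes. It only does arithmetic on the shape, so nothing large is allocated. The helper name array_gigabytes is made up for illustration.

```python
import numpy as np

T, ni, nj = 571, 1040, 1392   # shape quoted in the question

def array_gigabytes(shape, dtype):
    """Size in GB (10**9 bytes) of one array with the given shape and dtype."""
    n_elements = int(np.prod(shape, dtype=np.int64))
    return n_elements * np.dtype(dtype).itemsize / 1e9

# Compare the default float64 with two smaller alternatives
for dtype in (np.float64, np.float32, np.uint8):
    size = array_gigabytes((T, ni, nj), dtype)
    print(f"{np.dtype(dtype).name:8s} {size:.2f} GB per array")
```

Since the loop version keeps three such arrays alive next to the input stack, the per-array figure should be multiplied accordingly.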
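I wonder whether you really need to process all 571 images in one go, or whether you could process them "on the fly" as you acquire them, rather than gathering them all into one enormous lump.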
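As a rough sketch of the "on the fly" idea, the generator below applies the question's per-pixel rule to one 2D frame at a time, so only a single frame and its result are in memory at once. The function names and the fake frame source are hypothetical stand-ins for whatever actually delivers your images, and float32 output is my assumption, not something the question asks for.

```python
import numpy as np

def delta_f_over_f(frame, threshold=12):
    """Apply the question's per-pixel rule to a single 2-D frame."""
    frame = np.asarray(frame)
    below = frame < threshold
    # Background: the pixel itself where below threshold (0 replaced by 1), else 1
    background = np.where(below & (frame != 0), frame, 1).astype(np.float32)
    # Intensity: the pixel itself where at or above threshold, else 0
    intensity = np.where(below, 0, frame).astype(np.float32)
    return (intensity - background) / background

def process_stream(frames, threshold=12):
    """Yield one result per incoming frame instead of building a 3-D stack."""
    for frame in frames:
        yield delta_f_over_f(frame, threshold)

# Hypothetical stand-in for a camera or file reader: 5 small random frames
rng = np.random.default_rng(0)
fake_frames = (rng.integers(0, 36, (4, 6), dtype=np.uint8) for _ in range(5))

for k, result in enumerate(process_stream(fake_frames)):
    print(k, result.shape, result.dtype)
```

Whether this helps depends on what happens to the results afterwards. If every frame's output has to be kept, the memory saving is limited to the intermediate arrays.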
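You could also consider using Numba for this, since it is very good at optimising for loops. Try putting [numba] in the search box above, or look at this example, which uses prange to parallelise the loops across your CPU cores.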
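For orientation, a Numba version might look roughly like the sketch below. It is not the linked example, just my own minimal illustration of njit with parallel=True and prange applied to the question's loop. The function name is hypothetical, and the try/except fallback is only there so the snippet still runs (slowly, in plain Python) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch still runs without Numba, just without any speed-up
    prange = range
    def njit(*args, **kwargs):
        return lambda func: func

@njit(parallel=True)
def delta_f_over_f_numba(ss, threshold):
    T, ni, nj = ss.shape
    out = np.empty((T, ni, nj), dtype=np.float32)
    for t in prange(T):              # frames are shared out across CPU cores
        for i in range(ni):
            for j in range(nj):
                value = float(ss[t, i, j])
                if value < threshold:
                    # Background pixel: use its own value, but never divide by 0
                    background = value if value != 0.0 else 1.0
                    intensity = 0.0
                else:
                    background = 1.0
                    intensity = value
                out[t, i, j] = (intensity - background) / background
    return out

# Small random test stack; the first call includes Numba's compilation time
rng = np.random.default_rng(0)
small = rng.integers(0, 36, (3, 8, 10), dtype=np.uint8)
result = delta_f_over_f_numba(small, 12)
print(result.shape, result.dtype)
```

This writes straight into a single output array, so the separate Background_intensity and Intensity arrays are not needed at all.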
Anyway, here is my code:
#!/usr/bin/env python3
#
import numpy as np
T, ni, nj = 571, 1040, 1392
# Create representative input data, such that around 1/3 of it is < 12 for testing
ss = np.random.randint(0,36,(T,ni,nj), np.uint8)
# Ravel into 1-D representation for simpler indexing
ss_r = ss.ravel()
# Create extra arrays but using 800MB rather than 6.3GB each, also ravelled
Background_intensity = np.ones(T*ni*nj, np.uint8)
Intensity = np.zeros(T*ni*nj, np.uint8)
# Make Boolean (True/False) mask of elements below threshold
mask = ss_r < 12
# Quick check here - print(np.count_nonzero(mask)/np.size(ss)) and check it is 0.333
# Set Background_intensity to "ss" according to mask
Background_intensity[mask] = ss_r[mask]
# Make sure no zeroes present
Background_intensity[Background_intensity==0] = 1
# This corresponds to the "else" of your original "if" statement
Intensity[~mask] = ss_r[~mask]
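# Final calculation: convert to float first, because uint8 subtraction wraps around
# (0 - 5 would become 251 rather than -5), then reshape back to the original shape
DeltaF_F_max = (Intensity.astype(np.float32) - Background_intensity) / Background_intensity
DeltaF_F_max = DeltaF_F_max.reshape((T,ni,nj))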
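One observation that may simplify things further, assuming the loop in the question is really the intended formula: for a pixel below the threshold the result is (0 - b)/b, which is always -1, and for every other pixel it is (value - 1)/1. The sketch below writes that as a single np.where call and checks it against the original triple loop on a small random array. Both function names are made up for illustration.

```python
import numpy as np

def delta_f_loop(ss, threshold=12):
    """Reference: the question's triple loop, run on a small array."""
    T, ni, nj = ss.shape
    background = np.ones((T, ni, nj))
    intensity = np.zeros((T, ni, nj))
    out = np.zeros((T, ni, nj))
    for t in range(T):
        for i in range(ni):
            for j in range(nj):
                if ss[t, i, j] < threshold:
                    background[t, i, j] = ss[t, i, j]
                    if background[t, i, j] == 0:
                        background[t, i, j] = 1
                else:
                    intensity[t, i, j] = ss[t, i, j]
                out[t, i, j] = (intensity[t, i, j] - background[t, i, j]) / background[t, i, j]
    return out

def delta_f_where(ss, threshold=12):
    """Collapsed form: -1 below the threshold, value - 1 everywhere else."""
    return np.where(ss < threshold, np.float32(-1), ss.astype(np.float32) - 1)

rng = np.random.default_rng(1)
small = rng.integers(0, 36, (3, 5, 7), dtype=np.uint8)
print(np.allclose(delta_f_loop(small), delta_f_where(small)))
```

If the -1 for background pixels is not what you want, that points at the formula rather than at the speed-up, and it would be worth checking before optimising further.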