Single precision rfft
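I'm looking for a single-precision rfft to speed up computation. scipy.fftpack.rfft does this, but it returns a real array with the real and imaginary parts packed along the same axis, so a post-processing step is needed. I implemented that step below to get the standard complex array, but NumPy's rfft ends up faster for 2D inputs (though slower for 1D inputs). Memory is also a problem: float64 runs out of memory (OOM).
Does scipy or another library have a single-precision rfft implementation that returns the standard complex array? (Otherwise, can the code below be made faster?)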
import numpy as np
from numpy.fft import rfft
from scipy.fftpack import rfft as srfft

def rfft_sp(x):  # assumes len(x) is even
    xf = np.zeros((len(x)//2 + 1, x.shape[1]), dtype='complex64')
    h = srfft(x, axis=0)        # packed real output: [y0, Re y1, Im y1, ..., Re y(n/2)]
    xf[0] = h[0]                # DC term
    xf[1:] = h[1::2]            # real parts
    xf[:1].imag = 0             # DC term has no imaginary part
    xf[-1:].imag = 0            # Nyquist term has no imaginary part
    xf[1:-1].imag = h[2::2]     # imaginary parts
    return xf

x = np.random.randn(500, 100000).astype('float32')

%timeit rfft_sp(x)
%timeit rfft(x, axis=0)
>>> 565 ms ± 15.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> 517 ms ± 22.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
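To make the packed layout concrete, here is a minimal 1D sketch on a tiny input. The names packed_rfft, real_part, imag_part and unpacked are chosen only for this illustration, and the input length is assumed to be even, as in rfft_sp:

```python
import numpy as np
from scipy.fftpack import rfft as packed_rfft

x = np.arange(8, dtype='float32')
h = packed_rfft(x)        # real array of length 8
ref = np.fft.rfft(x)      # standard complex array of length 5

# Packed layout for even n: [y0, Re y1, Im y1, ..., Re y(n/2)]
real_part = np.concatenate(([h[0]], h[1::2]))        # y0, then every real part
imag_part = np.concatenate(([0.0], h[2::2], [0.0]))  # DC and Nyquist have zero imaginary part
unpacked = (real_part + 1j * imag_part).astype('complex64')

print(np.allclose(unpacked, ref, atol=1e-3))
```

The slicing here mirrors what rfft_sp does along axis 0 for the 2D case.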
On the machine I tested on, using scipy.fft.rfft and casting to complex64 is faster than your implementation:
import numpy as np
from numpy.fft import rfft
from scipy.fft import rfft as srfft
from scipy.fftpack import rfft as srfft2

def rfft_sp(x):  # assumes len(x) is even
    xf = np.zeros((len(x)//2 + 1, x.shape[1]), dtype='complex64')
    h = srfft2(x, axis=0)       # packed real output from scipy.fftpack
    xf[0] = h[0]
    xf[1:] = h[1::2]
    xf[:1].imag = 0
    xf[-1:].imag = 0
    xf[1:-1].imag = h[2::2]
    return xf

def rfft_cast(x):
    h = srfft(x, axis=0)        # scipy.fft returns a standard complex array
    return h.astype('complex64')

x = np.random.randn(500, 100000).astype('float32')

%timeit rfft(x, axis=0)
%timeit rfft_sp(x)
%timeit rfft_cast(x)
This gives:
1.81 s ± 144 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.89 s ± 7.58 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.24 s ± 9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
scipy.fft works in single precision.
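As a quick check of that dtype behaviour, the sketch below uses a much smaller array than the benchmark (the shape, the seed and the names np_rfft, sp_rfft and ref are arbitrary choices for illustration). It should show that scipy.fft.rfft returns complex64 for float32 input and agrees with a float64 NumPy reference to within single-precision tolerance:

```python
import numpy as np
from numpy.fft import rfft as np_rfft
from scipy.fft import rfft as sp_rfft

rng = np.random.default_rng(0)
x = rng.standard_normal((500, 64)).astype('float32')

xf = sp_rfft(x, axis=0)                      # single-precision transform
ref = np_rfft(x.astype('float64'), axis=0)   # double-precision reference

print(xf.dtype)                              # dtype of the scipy.fft result
print(xf.shape)                              # (len(x)//2 + 1, number of columns)
print(np.max(np.abs(xf - ref)))              # worst-case deviation from the reference
```

If the result is already complex64, the astype call in rfft_cast only adds a copy, so it may be worth timing scipy.fft.rfft on its own as well.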