Dramatic `fft` slowdown upon multiplying by a `scipy.signal` window

import numpy as np
import scipy.signal as sig
from scipy.fft import fft
from timeit import default_timer as dtime

dtype = 'float32'
n_fft = 598
A = np.random.randn(n_fft, 160000).astype(dtype)
v0 = sig.windows.dpss(n_fft, 4).astype(dtype)
v1 = sig.windows.dpss(n_fft, n_fft // 8).astype(dtype)
v = v1

#%%###############################################################
t0 = dtime()
fft(A)
print(dtime() - t0)

A *= v.reshape(-1, 1)
#%%###############################################################
t0 = dtime()
fft(A)
print(dtime() - t0)
>>> 1.3161122000001342
>>> 4.751361799999813

The two times are equal if using `v = v0` or `dtype = 'float64'`. Why is this? (more times)

Note: a workaround is `v = v1 + 1; v -= 1`, but this shouldn't be necessary... I've opened an Issue.

Win 10 x64, numpy 1.18.5, scipy 1.6.1, Python 3.7.9.

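This is caused by denormal (subnormal) numbers, i.e. tiny nonzero values below the smallest normal float, which make some CPU instructions run much slower; details. The workaround is to zero them manually, as with the +1/-1 trick, or 'safely' via e.g. ftz (applied after typecasting):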

from ftz import ftz

ftz(v)
A *= v.reshape(-1, 1)

t0 = dtime()
fft(A)
print(dtime() - t0)
>>> 1.4638332999998056
>>> 1.4597183999999288
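
If you'd rather check for denormals yourself, here is a minimal NumPy-only sketch. The helper names `count_subnormals` and `flush_subnormals` are my own (hypothetical, not part of `ftz` or SciPy), and the sample values are made up for illustration rather than taken from the actual `dpss` window. It shows that a value which is perfectly normal in float64 can turn subnormal after casting to float32, which would be consistent with the slowdown disappearing for `dtype = 'float64'`.

```python
import numpy as np

def count_subnormals(x):
    """Count nonzero entries whose magnitude is below the smallest normal value."""
    tiny = np.finfo(x.dtype).tiny  # smallest positive *normal* number for this dtype
    mag = np.abs(x)
    return int(np.count_nonzero((mag > 0) & (mag < tiny)))

def flush_subnormals(x):
    """Set subnormal entries to zero in place and return the array."""
    tiny = np.finfo(x.dtype).tiny
    x[np.abs(x) < tiny] = 0
    return x

# Illustrative values only: 1e-40 and -1e-42 are normal in float64,
# but fall below float32's smallest normal (about 1.18e-38) after the cast.
v64 = np.array([0.5, 1e-40, 0.0, -1e-42], dtype=np.float64)
v32 = v64.astype(np.float32)

n64 = count_subnormals(v64)  # no subnormals in float64
n32 = count_subnormals(v32)  # two subnormals in float32

# The +1/-1 trick: adding 1 rounds the tiny values away, subtracting 1 leaves 0.
w = v32 + 1
w -= 1
n_trick = count_subnormals(w)

# Explicit flush: only values below the normal range are touched.
flushed = flush_subnormals(v32.copy())
n_flushed = count_subnormals(flushed)

print(n64, n32, n_trick, n_flushed)
```

One caveat on the +1/-1 trick: in float32 it also rounds away small values that are still normal (anything well below roughly 1e-7 vanishes), whereas an explicit flush only zeroes entries below the normal range. That is presumably why flushing is the 'safe' option.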