如何使用 Python 查找 FFT 图的峰值?
How to find peaks of FFT graph using Python?
我正在使用 Python 对某些数据执行快速傅里叶变换。然后我需要以 x 值的形式提取变换中峰值的位置。现在我正在使用 Scipy 的 fft 工具来执行转换,这似乎有效。但是,当我使用 Scipy 的 find_peaks 时,我只得到 y 值,而不是我需要的 x 位置。我也收到警告:
ComplexWarning: Casting complex values to real discards the imaginary part
我有更好的方法吗?这是我目前的代码:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.fft import fft
from scipy.signal import find_peaks
headers = ["X","Y"]
original_data = pd.read_csv("testdata.csv",names=headers)
x = original_data["X"]
y = original_data["Y"]
a = fft(y)
peaks = find_peaks(a)
print(peaks)
plt.plot(x,a)
plt.title("Fast Fourier transform")
plt.xlabel("Frequency")
plt.ylabel("Amplitude")
plt.show()
这里好像有两点容易混淆:
- find_peaks 是什么return。
- 如何解释 FFT returning 的复数值。
我会分开回答
第 1 点
find_peaks returns “a”中对应于峰值的索引,因此我相信它们是您寻求的值,但是您必须以不同的方式绘制它们。您可以从文档 here 中查看第一个示例。但基本上“peaks”是索引或 x 值,而 a[peaks] 将是 y 值。因此,要绘制所有频率并标记您可以做的峰值:
plt.plot(a)
plt.plot(peaks, a[peaks])
第 2 点
至于第二点,您可能应该阅读更多有关 FFT 输出的内容,this post 是一个简短的摘要,但您可能需要更多背景知识才能理解它。
但基本上,FFT 将 return 一组复数,其中包含相位和幅度信息。您目前正在做的是隐式地 仅 查看解决方案的实部(因此警告虚部被丢弃),您 可能 想要取而代之的是你的 "a" 数组的大小,但没有关于你的应用程序的更多信息,这是不可能的。
我尽量写得详细一点:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.fft import fft, fftfreq
from scipy.signal import find_peaks
# First: Let's generate a dummy dataframe with X,Y
# The signal consists in 3 cosine signals with noise added. We terminate by creating
# a pandas dataframe.
import numpy as np
X=np.arange(start=0,stop=20,step=0.01) # 20 seconds long signal sampled every 0.01[s]
# Signal components given by [frequency, phase shift, Amplitude]
GeneratedSignal=np.array([[5.50, 1.60, 1.0], [10.2, 0.25, 0.5], [18.3, 0.70, 0.2]])
Y=np.zeros(len(X))
# Let's add the components one by one
for P in GeneratedSignal:
Y+=np.cos(2*np.pi*P[0]*X-P[1])*P[2]
# Let's add some gaussian random noise (mu=0, sigma=noise):
noise=0.5
Y+=np.random.randn(len(X))*noise
# Let's build the dataframe:
dummy_data=pd.DataFrame({'X':X,'Y':Y})
print('Dummy dataframe: ')
print(dummy_data.head())
# Figure-1: The dummy data
plt.plot(X,Y)
plt.title('Dummy data')
plt.xlabel('time [s]')
plt.ylabel('Amplitude')
plt.show()
# ----------------------------------------------------
# Processing:
headers = ["X","Y"]
#original_data = pd.read_csv("testdata.csv",names=headers)
# Let's take our dummy data:
original_data = dummy_data
x = np.array(original_data["X"])
y = np.array(original_data["Y"])
# Assuming the time step is constant:
# (otherwise you'll need to resample the data at a constant rate).
dt=x[1]-x[0] # time step of the data
# The fourier transform of y:
yf=fft(y, norm='forward')
# Note: see help(fft) --> norm. I chose 'forward' because it gives the amplitudes we put in.
# Otherwise, by default, yf will be scaled by a factor of n: the number of points
# The frequency scale
n = x.size # The number of points in the data
freq = fftfreq(n, d=dt)
# Let's find the peaks with height_threshold >=0.05
# Note: We use the magnitude (i.e the absolute value) of the Fourier transform
height_threshold=0.05 # We need a threshold.
# peaks_index contains the indices in x that correspond to peaks:
peaks_index, properties = find_peaks(np.abs(yf), height=height_threshold)
# Notes:
# 1) peaks_index does not contain the frequency values but indices
# 2) In this case, properties will contain only one property: 'peak_heights'
# for each element in peaks_index (See help(find_peaks) )
# Let's first output the result to the terminal window:
print('Positions and magnitude of frequency peaks:')
[print("%4.4f \t %3.4f" %(freq[peaks_index[i]], properties['peak_heights'][i])) for i in range(len(peaks_index))]
# Figure-2: The frequencies
plt.plot(freq, np.abs(yf),'-', freq[peaks_index],properties['peak_heights'],'x')
plt.xlabel("Frequency")
plt.ylabel("Amplitude")
plt.show()
终端输出:
Dummy dataframe:
X Y
0 0.00 0.611829
1 0.01 0.723775
2 0.02 0.768813
3 0.03 0.798328
Positions and magnitude of frequency peaks:
5.5000 0.4980
10.2000 0.2575
18.3000 0.0999
-18.3000 0.0999
-10.2000 0.2575
-5.5000 0.4980
注意:由于信号是实数值的,每个频率分量都会有一个负的“double”(这是傅里叶变换的 属性)。这也解释了为什么振幅是我们一开始给出的一半。但是,如果对于特定频率,我们将负分量和正分量的幅度相加,我们将得到实值信号的原始幅度。
进一步探索:您可以将信号的长度更改为 1 [s](在脚本开头):
X=np.arange(start=0,stop=1,step=0.01) # 1 seconds long signal sampled every 0.01[s]
由于信号的长度现在减少了,所以频率的定义不太明确(峰值现在有一个宽度)
因此,添加: width=0 到包含 find_peaks 指令的行:
peaks_index, properties = find_peaks(np.abs(yf), height=height_threshold, width=0)
然后看看properties里面包含的内容:
print(properties)
您会发现 find_peaks 为您提供的信息远不止这些
山峰位置。有关内部属性的更多信息:
帮助(find_peaks)
数字:
我正在使用 Python 对某些数据执行快速傅里叶变换。然后我需要以 x 值的形式提取变换中峰值的位置。现在我正在使用 Scipy 的 fft 工具来执行转换,这似乎有效。但是,当我使用 Scipy 的 find_peaks 时,我只得到 y 值,而不是我需要的 x 位置。我也收到警告:
ComplexWarning: Casting complex values to real discards the imaginary part
我有更好的方法吗?这是我目前的代码:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.fft import fft
from scipy.signal import find_peaks
headers = ["X","Y"]
original_data = pd.read_csv("testdata.csv",names=headers)
x = original_data["X"]
y = original_data["Y"]
a = fft(y)
peaks = find_peaks(a)
print(peaks)
plt.plot(x,a)
plt.title("Fast Fourier transform")
plt.xlabel("Frequency")
plt.ylabel("Amplitude")
plt.show()
这里好像有两点容易混淆:
- find_peaks 是什么return。
- 如何解释 FFT returning 的复数值。
我会分开回答
第 1 点
find_peaks returns “a”中对应于峰值的索引,因此我相信它们是您寻求的值,但是您必须以不同的方式绘制它们。您可以从文档 here 中查看第一个示例。但基本上“peaks”是索引或 x 值,而 a[peaks] 将是 y 值。因此,要绘制所有频率并标记您可以做的峰值:
plt.plot(a)
plt.plot(peaks, a[peaks])
第 2 点
至于第二点,您可能应该阅读更多有关 FFT 输出的内容,this post 是一个简短的摘要,但您可能需要更多背景知识才能理解它。
但基本上,FFT 将 return 一组复数,其中包含相位和幅度信息。您目前正在做的是隐式地 仅 查看解决方案的实部(因此警告虚部被丢弃),您 可能 想要取而代之的是你的 "a" 数组的大小,但没有关于你的应用程序的更多信息,这是不可能的。
我尽量写得详细一点:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.fft import fft, fftfreq
from scipy.signal import find_peaks
# First: Let's generate a dummy dataframe with X,Y
# The signal consists in 3 cosine signals with noise added. We terminate by creating
# a pandas dataframe.
import numpy as np
X=np.arange(start=0,stop=20,step=0.01) # 20 seconds long signal sampled every 0.01[s]
# Signal components given by [frequency, phase shift, Amplitude]
GeneratedSignal=np.array([[5.50, 1.60, 1.0], [10.2, 0.25, 0.5], [18.3, 0.70, 0.2]])
Y=np.zeros(len(X))
# Let's add the components one by one
for P in GeneratedSignal:
Y+=np.cos(2*np.pi*P[0]*X-P[1])*P[2]
# Let's add some gaussian random noise (mu=0, sigma=noise):
noise=0.5
Y+=np.random.randn(len(X))*noise
# Let's build the dataframe:
dummy_data=pd.DataFrame({'X':X,'Y':Y})
print('Dummy dataframe: ')
print(dummy_data.head())
# Figure-1: The dummy data
plt.plot(X,Y)
plt.title('Dummy data')
plt.xlabel('time [s]')
plt.ylabel('Amplitude')
plt.show()
# ----------------------------------------------------
# Processing:
headers = ["X","Y"]
#original_data = pd.read_csv("testdata.csv",names=headers)
# Let's take our dummy data:
original_data = dummy_data
x = np.array(original_data["X"])
y = np.array(original_data["Y"])
# Assuming the time step is constant:
# (otherwise you'll need to resample the data at a constant rate).
dt=x[1]-x[0] # time step of the data
# The fourier transform of y:
yf=fft(y, norm='forward')
# Note: see help(fft) --> norm. I chose 'forward' because it gives the amplitudes we put in.
# Otherwise, by default, yf will be scaled by a factor of n: the number of points
# The frequency scale
n = x.size # The number of points in the data
freq = fftfreq(n, d=dt)
# Let's find the peaks with height_threshold >=0.05
# Note: We use the magnitude (i.e the absolute value) of the Fourier transform
height_threshold=0.05 # We need a threshold.
# peaks_index contains the indices in x that correspond to peaks:
peaks_index, properties = find_peaks(np.abs(yf), height=height_threshold)
# Notes:
# 1) peaks_index does not contain the frequency values but indices
# 2) In this case, properties will contain only one property: 'peak_heights'
# for each element in peaks_index (See help(find_peaks) )
# Let's first output the result to the terminal window:
print('Positions and magnitude of frequency peaks:')
[print("%4.4f \t %3.4f" %(freq[peaks_index[i]], properties['peak_heights'][i])) for i in range(len(peaks_index))]
# Figure-2: The frequencies
plt.plot(freq, np.abs(yf),'-', freq[peaks_index],properties['peak_heights'],'x')
plt.xlabel("Frequency")
plt.ylabel("Amplitude")
plt.show()
终端输出:
Dummy dataframe:
X Y
0 0.00 0.611829
1 0.01 0.723775
2 0.02 0.768813
3 0.03 0.798328
Positions and magnitude of frequency peaks:
5.5000 0.4980
10.2000 0.2575
18.3000 0.0999
-18.3000 0.0999
-10.2000 0.2575
-5.5000 0.4980
注意:由于信号是实数值的,每个频率分量都会有一个负的“double”(这是傅里叶变换的 属性)。这也解释了为什么振幅是我们一开始给出的一半。但是,如果对于特定频率,我们将负分量和正分量的幅度相加,我们将得到实值信号的原始幅度。
进一步探索:您可以将信号的长度更改为 1 [s](在脚本开头):
X=np.arange(start=0,stop=1,step=0.01) # 1 seconds long signal sampled every 0.01[s]
由于信号的长度现在减少了,所以频率的定义不太明确(峰值现在有一个宽度) 因此,添加: width=0 到包含 find_peaks 指令的行:
peaks_index, properties = find_peaks(np.abs(yf), height=height_threshold, width=0)
然后看看properties里面包含的内容:
print(properties)
您会发现 find_peaks 为您提供的信息远不止这些 山峰位置。有关内部属性的更多信息: 帮助(find_peaks)
数字: