How to get high and low envelope of a signal

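My data is quite noisy, and I am trying to work out a high and low envelope of the signal. It is somewhat like this example in MATLAB:

http://uk.mathworks.com/help/signal/examples/signal-smoothing.html

in the section "Extracting Peak Envelope". Is there a similar function in Python that can do that? My entire project is written in Python; in the worst case I can extract my numpy array, throw it into MATLAB and use that example. But I prefer the look of matplotlib, and I would really rather not do all of that I/O between MATLAB and Python.

Thanks,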

Is there a similar function in Python that can do that?

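As far as I know, there is no such function in Numpy / Scipy / Python. However, it is not difficult to create one. The general idea is as follows:

Given a vector of values (s):

  1. Find the locations of the peaks of (s). Let's call them (u).
  2. Find the locations of the troughs of (s). Let's call them (l).
  3. Fit a model to the (u) location-value pairs. Let's call it (u_p).
  4. Fit a model to the (l) location-value pairs. Let's call it (l_p).
  5. Evaluate (u_p) over the domain of (s) to get the interpolated values of the upper envelope. (Let's call them (q_u).)
  6. Evaluate (l_p) over the domain of (s) to get the interpolated values of the lower envelope. (Let's call them (q_l).)

As you can see, it is a sequence of three steps (find locations, fit model, evaluate model) applied twice, once for the upper part of the envelope and once for the lower.

To collect the "peaks" of (s) you need to locate the points where the slope of (s) changes from positive to negative, and to collect the "troughs" of (s) you need to locate the points where the slope changes from negative to positive.

A peak example: s = [4,5,4]; 5-4 is positive, 4-5 is negative.

A trough example: s = [5,4,5]; 4-5 is negative, 5-4 is positive.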
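The slope test on these two three-point examples can be sketched with `np.diff` and `np.sign`. The helper name `slope_signs` is made up for illustration and is not part of any library.

```python
import numpy as np

def slope_signs(s):
    """Return the sign of each consecutive difference in s (illustrative helper)."""
    return np.sign(np.diff(s))

peak = np.array([4, 5, 4])
trough = np.array([5, 4, 5])

# A peak: the slope is positive (+1) and then negative (-1).
print(slope_signs(peak))
# A trough: the slope is negative (-1) and then positive (+1).
print(slope_signs(trough))
```

A sample is a peak when the sign sequence goes from +1 to -1 around it, and a trough when it goes from -1 to +1.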

Here is an example script to get you started, with plenty of inline comments:

from numpy import arange, array, sign
from scipy.interpolate import interp1d
from matplotlib.pyplot import plot, show, grid

s = array([1,4,3,5,3,2,4,3,4,5,4,3,2,5,6,7,8,7,8]) #This is your noisy vector of values.

#Prepend the first value of (s) to the interpolating values. This forces the model to use the same starting point for both the upper and lower envelope models.

u_x = [0,]
u_y = [s[0],]

l_x = [0,]
l_y = [s[0],]

#Detect peaks and troughs and mark their location in u_x,u_y,l_x,l_y respectively.

for k in range(1,len(s)-1):
    if (sign(s[k]-s[k-1])==1) and (sign(s[k]-s[k+1])==1):
        u_x.append(k)
        u_y.append(s[k])

    if (sign(s[k]-s[k-1])==-1) and ((sign(s[k]-s[k+1]))==-1):
        l_x.append(k)
        l_y.append(s[k])

#Append the last value of (s) to the interpolating values. This forces the model to use the same ending point for both the upper and lower envelope models.

u_x.append(len(s)-1)
u_y.append(s[-1])

l_x.append(len(s)-1)
l_y.append(s[-1])

#Fit suitable models to the data. Here I am using cubic splines, similarly to the MATLAB example given in the question.

u_p = interp1d(u_x,u_y, kind = 'cubic',bounds_error = False, fill_value=0.0)
l_p = interp1d(l_x,l_y,kind = 'cubic',bounds_error = False, fill_value=0.0)

#Evaluate each model over the domain of (s)
k_all = arange(len(s))
q_u = u_p(k_all)
q_l = l_p(k_all)

#Plot everything. hold() no longer exists in matplotlib; successive plot() calls draw on the same axes by default.
plot(s);plot(q_u,'r');plot(q_l,'g');grid(True);show()

This produces a plot of the signal with the upper envelope in red and the lower envelope in green.

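Points for further improvement:

  1. The above code does not filter peaks or troughs that may occur closer together than some threshold "distance" (Tl) (e.g. time). This is similar to the second parameter of envelope. It is easy to add, though, by examining the differences between consecutive values of u_x, u_y.

  2. However, a quick improvement over the previous point is to lowpass filter your data with a moving average filter BEFORE interpolating the upper and lower envelope functions. You can do this easily by convolving your (s) with a suitable moving average filter. Without going into great detail here (can do if required), to produce a moving average filter that operates over N consecutive samples, you would do something like this: s_filtered = numpy.convolve(s, numpy.ones(N)/float(N)). The higher the (N), the smoother your data will appear. Please note, however, that this will shift your (s) values about (N/2) samples to the right (in s_filtered) due to something called the group delay of the smoothing filter. For more information about the moving average, please see this link.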
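A minimal sketch of the moving-average smoothing step, assuming an odd window length. The helper name `moving_average`, the test signal and the window size are illustrative choices, not part of the answer. It passes `mode="same"` to `numpy.convolve`, so the output is the same length as the input and stays centred on it. The default mode (`"full"`) gives the longer, right-shifted output.

```python
import numpy as np

def moving_average(s, n):
    """Smooth s with an n-sample moving average (illustrative helper).

    mode="same" keeps the output the same length as s and, for odd n,
    centred on it. Samples near both ends are averaged against implicit zeros.
    """
    kernel = np.ones(n) / float(n)
    return np.convolve(s, kernel, mode="same")

# Illustrative noisy signal (assumed, not from the question).
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 500)
s = np.sin(t) + 0.3 * rng.standard_normal(t.size)

s_filtered = moving_average(s, n=11)
```

The smoothed `s_filtered` can then be fed to the peak and trough detection in place of `s`. The first and last few samples are biased toward zero, so treat the edges with care.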

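Hope this helps.

(Happy to amend the response if more information about the original application is provided. Perhaps the data can be pre-processed in a more suitable way (?))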

Building on @A_A's answer, replace the sign check with a min/max test to make it more robust.

import numpy as np
import scipy.interpolate
import matplotlib.pyplot as pt
# In a Jupyter notebook, also run: %matplotlib inline

t = np.multiply(list(range(1000)), .1)
s = 10*np.sin(t)*t**.5

u_x = [0]
u_y = [s[0]]

l_x = [0]
l_y = [s[0]]

#Detect peaks and troughs and mark their location in u_x,u_y,l_x,l_y respectively.
for k in range(2,len(s)-1):
    if s[k] >= max(s[:k-1]):
        u_x.append(t[k])
        u_y.append(s[k])

for k in range(2,len(s)-1):
    if s[k] <= min(s[:k-1]):
        l_x.append(t[k])
        l_y.append(s[k])

u_p = scipy.interpolate.interp1d(u_x, u_y, kind = 'cubic', bounds_error = False, fill_value=0.0)
l_p = scipy.interpolate.interp1d(l_x, l_y, kind = 'cubic', bounds_error = False, fill_value=0.0)

q_u = np.zeros(s.shape)
q_l = np.zeros(s.shape)
for k in range(0,len(s)):
    q_u[k] = u_p(t[k])
    q_l[k] = l_p(t[k])

pt.plot(t,s)
pt.plot(t, q_u, 'r')
pt.plot(t, q_l, 'g')
pt.show()

If you expect the function to be increasing, try the following for the lower envelope:

for k in range(1,len(s)-2):
    if s[k] <= min(s[k+1:]):
        l_x.append(t[k])
        l_y.append(s[k])
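The running-maximum test `s[k] >= max(s[:k-1])` recomputes the maximum of a growing slice at every step, which gets slow on long signals. As a sketch, the same indices can be obtained in one pass with `np.maximum.accumulate`. The variable names `cummax` and `upper_idx` are illustrative.

```python
import numpy as np

t = np.arange(1000) * 0.1
s = 10 * np.sin(t) * t**0.5

# cummax[j] is the maximum of s[0..j], so max(s[:k-1]) equals cummax[k-2].
cummax = np.maximum.accumulate(s)

# Same index range as the loop: k = 2 .. len(s)-2.
k = np.arange(2, len(s) - 1)
upper_idx = k[s[k] >= cummax[k - 2]]

# Points to hand to the interpolator, as in the loop version.
u_x = np.concatenate(([0.0], t[upper_idx]))
u_y = np.concatenate(([s[0]], s[upper_idx]))
```

The lower envelope follows the same pattern with `np.minimum.accumulate` and `<=`.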

A first attempt was made to use the scipy Hilbert transform to determine the amplitude envelope, but this didn't work as expected in many cases, mainly because, citing from this digital signal processing answer:

Hilbert envelope, also called Energy-Time Curve (ETC), only works well for narrow-band fluctuations. Producing an analytic signal, of which you later take the absolute value, is a linear operation, so it treats all frequencies of your signal equally. If you give it a pure sine wave, it will indeed return to you a straight line. When you give it white noise however, you will likely get noise back.

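From that point, since the other answers use cubic spline interpolation, which tends to become cumbersome, a bit unstable (spurious oscillations) and time consuming for very long and noisy data arrays, I will contribute here a simple, numpy-efficient version that seems to work pretty well: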

import numpy as np
from matplotlib import pyplot as plt

def hl_envelopes_idx(s, dmin=1, dmax=1, split=False):
    """
    Input :
    s: 1d-array, data signal from which to extract high and low envelopes
    dmin, dmax: int, optional, size of chunks, use this if the size of the input signal is too big
    split: bool, optional, if True, split the signal in half along its mean, might help to generate the envelope in some cases
    Output :
    lmin,lmax : low/high envelope idx of input signal s
    """

    # local minima
    lmin = (np.diff(np.sign(np.diff(s))) > 0).nonzero()[0] + 1
    # local maxima
    lmax = (np.diff(np.sign(np.diff(s))) < 0).nonzero()[0] + 1
    

    if split:
        # s_mid is zero if s centered around x-axis or more generally mean of signal
        s_mid = np.mean(s) 
        # pre-sorting of locals min based on relative position with respect to s_mid 
        lmin = lmin[s[lmin]<s_mid]
        # pre-sorting of local max based on relative position with respect to s_mid 
        lmax = lmax[s[lmax]>s_mid]


    # global min of each dmin-sized chunk of local minima
    lmin = lmin[[i+np.argmin(s[lmin[i:i+dmin]]) for i in range(0,len(lmin),dmin)]]
    # global max of each dmax-sized chunk of local maxima
    lmax = lmax[[i+np.argmax(s[lmax[i:i+dmax]]) for i in range(0,len(lmax),dmax)]]
    
    return lmin,lmax

Example 1: quasi-periodic vibration

t = np.linspace(0,8*np.pi,5000)
s = 0.8*np.cos(t)**3 + 0.5*np.sin(np.exp(1)*t)
# hl_envelopes_idx returns (lmin, lmax): low-envelope indices first, then high
low_idx, high_idx = hl_envelopes_idx(s)

# plot
plt.plot(t,s,label='signal')
plt.plot(t[low_idx], s[low_idx], 'r', label='low')
plt.plot(t[high_idx], s[high_idx], 'g', label='high')
plt.legend()
plt.show()

Example 2: noisy decaying signal

t = np.linspace(0,2*np.pi,5000)
s = 5*np.cos(5*t)*np.exp(-t) + np.random.rand(len(t))

# hl_envelopes_idx returns (lmin, lmax): low-envelope indices first, then high
low_idx, high_idx = hl_envelopes_idx(s,dmin=15,dmax=15)

# plot
plt.plot(t,s,label='signal')
plt.plot(t[low_idx], s[low_idx], 'r', label='low')
plt.plot(t[high_idx], s[high_idx], 'g', label='high')
plt.legend()
plt.show()
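The index arrays only mark where the envelope touches the signal. If an envelope value is needed at every sample, one option is to interpolate linearly between those points with `np.interp`. This is a sketch of that extra step and not part of the answer's function. It recomputes the local extrema inline so that it runs on its own.

```python
import numpy as np

t = np.linspace(0, 8 * np.pi, 5000)
s = 0.8 * np.cos(t) ** 3 + 0.5 * np.sin(np.e * t)

# Local extrema from sign changes of the first difference (same test as above).
turn = np.diff(np.sign(np.diff(s)))
lmin = (turn > 0).nonzero()[0] + 1
lmax = (turn < 0).nonzero()[0] + 1

# Piecewise-linear envelopes evaluated at every sample of t.
# Before the first and after the last extremum, np.interp holds the end value.
upper = np.interp(t, t[lmax], s[lmax])
lower = np.interp(t, t[lmin], s[lmin])
```

Linear interpolation avoids the spurious oscillations mentioned for cubic splines. The trade-off is that the chord between two neighbouring maxima can dip slightly below the signal.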

Example 3: nonsymmetric modulated chirp

A much more complex signal of 18867925 samples (not included here).

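You might want to look at the Hilbert transform, which is likely the actual code behind the envelope function in MATLAB. The signal submodule of scipy has a built-in Hilbert transform, and there is a nice example in the documentation where the envelope of an oscillatory signal is extracted: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.hilbert.html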
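A minimal sketch of the Hilbert approach on a narrow-band test signal, the case where it is expected to work well. The carrier frequency, modulation and sample count are assumed values chosen for illustration.

```python
import numpy as np
from scipy.signal import hilbert

n = 2000                                   # number of samples over one second (assumed)
t = np.arange(n) / n
modulation = 1.0 + 0.5 * np.sin(2 * np.pi * 3 * t)   # slow amplitude envelope
s = modulation * np.cos(2 * np.pi * 100 * t)          # 100 Hz carrier

# scipy.signal.hilbert returns the analytic signal; its magnitude is the envelope.
analytic = hilbert(s)
envelope = np.abs(analytic)
```

For a signal that oscillates around zero like this one, `envelope` traces the upper envelope and `-envelope` the lower one. On broadband or very noisy data the result tends to be noisy as well.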

I found that using a combination of scipy functions performed better than the alternatives:

import numpy as np
import scipy.signal
import scipy.interpolate
import matplotlib.pyplot as plt

def envelope(sig, distance):
    # split signal into negative and positive parts
    u_x = np.where(sig > 0)[0]
    l_x = np.where(sig < 0)[0]
    u_y = sig.copy()
    u_y[l_x] = 0
    l_y = -sig.copy()
    l_y[u_x] = 0
    
    # find upper and lower peaks
    u_peaks, _ = scipy.signal.find_peaks(u_y, distance=distance)
    l_peaks, _ = scipy.signal.find_peaks(l_y, distance=distance)
    
    # use peaks and peak values to make envelope
    u_x = u_peaks
    u_y = sig[u_peaks]
    l_x = l_peaks
    l_y = sig[l_peaks]
    
    # add start and end of signal to allow proper indexing
    end = len(sig)
    u_x = np.concatenate((u_x, [0, end]))
    u_y = np.concatenate((u_y, [0, 0]))
    l_x = np.concatenate((l_x, [0, end]))
    l_y = np.concatenate((l_y, [0, 0]))
    
    # create envelope functions
    u = scipy.interpolate.interp1d(u_x, u_y)
    l = scipy.interpolate.interp1d(l_x, l_y)
    return u, l

def test():
    x = np.arange(200)
    sig = np.sin(x)
    u, l = envelope(sig, 1)
    
    plt.figure(figsize=(25,5))
    plt.plot(x, u(x))
    plt.plot(x, l(x))
    plt.plot(x, sig*0.9)
    plt.show()
    
test()

Or you can use pandas. Here I need just two lines of code:

import pandas as pd
import numpy as np


x=np.linspace(0,5*np.pi,1000)
y=np.sin(x)+0.4*np.cos(x/4)*np.sin(x*20)

df=pd.DataFrame(data={"y":y},index=x)

windowsize = 20
df["y_upperEnv"]=df["y"].rolling(window=windowsize).max().shift(int(-windowsize/2))
df["y_lowerEnv"]=df["y"].rolling(window=windowsize).min().shift(int(-windowsize/2))

df.plot(figsize=(20,10))

Output: a plot of y together with its upper and lower envelopes.
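As a variation on the rolling-window idea, the window can be centred directly with `center=True` rather than shifted afterwards. Adding `min_periods=1` avoids NaN values at the edges. This is a sketch of an alternative, not the code above. For an even window size, the centred window may sit one sample away from where `shift(-windowsize/2)` puts it.

```python
import numpy as np
import pandas as pd

x = np.linspace(0, 5 * np.pi, 1000)
y = np.sin(x) + 0.4 * np.cos(x / 4) * np.sin(x * 20)
df = pd.DataFrame({"y": y}, index=x)

windowsize = 20
# center=True labels each window at its middle sample instead of its last one;
# min_periods=1 lets the shorter windows at both ends still produce a value.
rolling = df["y"].rolling(window=windowsize, center=True, min_periods=1)
df["y_upperEnv"] = rolling.max()
df["y_lowerEnv"] = rolling.min()
```

Each window contains the sample it is labelled with, so the upper envelope never falls below the signal and the lower envelope never rises above it.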