Know the period of a stationary series

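I am working with time series in Python, and I need to know the period of the stationary part (the decomposition is below; I am interested in the seasonal component).

What I do now is take a value (any value) and count the steps until I find it again, which gives me the period. This feels very old-fashioned to me.

Do you know of a function that computes the period of a series? Or do you know of any set of Pandas instructions that avoids loops and conditionals? How should I perform this task?

PS: The data I get looks like the following. If you count, the data repeats every twelve steps.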

import pandas as pd
import matplotlib.pyplot as plt

seasonal = [-0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, 
-0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726]
indice = pd.date_range("2019-07-31 23:55:00", periods=len(seasonal), freq="min")  # "T" is a deprecated alias for minutes
seasonal = pd.Series(data=seasonal, index=indice)

periodo = 0                                 ### 
valor = seasonal.iloc[0]                      #    All this part ...  
                                              # can it be changed
for item in seasonal:                         # for a better structured function,
  if periodo != 0 and item == valor:          # which looks for the period
    break                                     # of a group of data?
                                              # 
  periodo += 1                              ###    Thanks

print("Periodo: {}".format(periodo))
seasonal.plot()
plt.show()

The answer below is mostly adapted from elsewhere. Use autocorrelation to solve your problem.

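import numpy as np

def find_period(signal):
    # Autocorrelation for non-negative lags (lag 0 first)
    acf = np.correlate(signal, signal, 'full')[-len(signal):]
    # A negative second difference of the sign marks a local maximum
    inflection = np.diff(np.sign(np.diff(acf)))
    peaks = (inflection < 0).nonzero()[0] + 1
    # The lag of the highest autocorrelation peak is the period
    return peaks[acf[peaks].argmax()]
>>> find_period(seasonal)
12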

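Keep in mind that this is easy because your signal is replicated exactly ten times. If the signal contains noise, you will have to preprocess the data first.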
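As a rough illustration of what such preprocessing could look like, here is a minimal sketch that removes the mean and applies a short rolling average before running the same autocorrelation peak search. The helper name `find_period_noisy`, the `window` parameter, and the synthetic sine test signal are all assumptions made for this example; they are not part of the question's data, and a real noisy series may need different smoothing or detrending.

```python
import numpy as np
import pandas as pd


def find_period_noisy(signal, window=3):
    """Estimate the period of a noisy, roughly periodic signal.

    Hypothetical helper: the name and the `window` default are illustrative.
    """
    s = pd.Series(np.asarray(signal, dtype=float))
    # Remove the mean so a constant offset does not dominate the autocorrelation.
    s = s - s.mean()
    # Light smoothing with a centered rolling mean to damp high-frequency noise.
    s = s.rolling(window, center=True, min_periods=1).mean()
    x = s.to_numpy()
    # Autocorrelation for lags 0 .. len(x) - 1.
    acf = np.correlate(x, x, "full")[-len(x):]
    # Local maxima: points where the slope changes from positive to negative.
    inflection = np.diff(np.sign(np.diff(acf)))
    peaks = (inflection < 0).nonzero()[0] + 1
    # The lag of the highest peak is taken as the period.
    return int(peaks[acf[peaks].argmax()])


# Synthetic data (an assumption for this sketch): a 12-step sine pattern
# repeated 10 times, with small Gaussian noise added.
rng = np.random.default_rng(0)
pattern = np.sin(2 * np.pi * np.arange(12) / 12)
noisy = np.tile(pattern, 10) + rng.normal(scale=0.1, size=120)
print(find_period_noisy(noisy))
```

The smoothing window should stay well below the expected period, otherwise the rolling mean flattens the very pattern you are trying to detect.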