了解平稳序列的周期
Know the period of a stationary series
我在研究时间序列,我正在使用Python,我需要知道平稳部分的周期(分解如下,我对季节性部分感兴趣)。
我所做的是取一个数字(任意)并计算步数(直到我再次找到它,所以我找到了句点)。这是非常过时的(在我看来)。
你知道计算序列周期的函数吗?或者也许 .. 你知道 Pandas 中的任何指令集,以避免使用循环和条件吗?我该如何执行此任务?
PS: 我得到的数据是类似这样的:
如果计数完成,数据每十二步重复一次。
import pandas as pd
import matplotlib.pyplot as plt
seasonal = [-0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726]
indice = pd.date_range("2019-07-31 23:55:00", periods=len(seasonal), freq="T")
seasonal = pd.Series(data=seasonal, index=indice)
periodo = 0 ###
valor = seasonal.iloc[0] # All this part ...
# can it be changed
for item in seasonal: # for a better structured function,
if periodo != 0 and item == valor: # which looks for the period
break # of a group of data?
#
periodo += 1 ### Thanks
print("Periodo: {}".format(periodo))
seasonal.plot()
plt.show()
提供的答案主要来自。
使用自相关来解决你的问题。
def find_period(signal):
acf = np.correlate(signal, signal, 'full')[-len(signal):]
inflection = np.diff(np.sign(np.diff(acf)))
peaks = (inflection < 0).nonzero()[0] + 1
return peaks[acf[peaks].argmax()]
>>> find_period(seasonal)
12
请记住,这很容易,因为您的信号被复制了十次。如果信号中有噪声,则必须预处理数据。
我在研究时间序列,我正在使用Python,我需要知道平稳部分的周期(分解如下,我对季节性部分感兴趣)。
我所做的是取一个数字(任意)并计算步数(直到我再次找到它,所以我找到了句点)。这是非常过时的(在我看来)。
你知道计算序列周期的函数吗?或者也许 .. 你知道 Pandas 中的任何指令集,以避免使用循环和条件吗?我该如何执行此任务?
PS: 我得到的数据是类似这样的: 如果计数完成,数据每十二步重复一次。
import pandas as pd
import matplotlib.pyplot as plt
seasonal = [-0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726, -0.0012477419628991032, -0.0042654910887713745, -0.006234490214646844, 0.0007106773261453963, 1.1533604530851796e-08, -0.004141904258934777, 0.0006148978972421542, 0.0017480068715999646, 0.0011169491792932956, 0.002641724820318341, 0.005415250461344693, 0.003642109435703726]
indice = pd.date_range("2019-07-31 23:55:00", periods=len(seasonal), freq="T")
seasonal = pd.Series(data=seasonal, index=indice)
periodo = 0 ###
valor = seasonal.iloc[0] # All this part ...
# can it be changed
for item in seasonal: # for a better structured function,
if periodo != 0 and item == valor: # which looks for the period
break # of a group of data?
#
periodo += 1 ### Thanks
print("Periodo: {}".format(periodo))
seasonal.plot()
plt.show()
提供的答案主要来自
def find_period(signal):
acf = np.correlate(signal, signal, 'full')[-len(signal):]
inflection = np.diff(np.sign(np.diff(acf)))
peaks = (inflection < 0).nonzero()[0] + 1
return peaks[acf[peaks].argmax()]
>>> find_period(seasonal)
12
请记住,这很容易,因为您的信号被复制了十次。如果信号中有噪声,则必须预处理数据。