Pandas: get the max delta in a time series for a specified period

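Given a DataFrame indexed by an irregular time series, I want to find the maximum delta between values within a 10-second period. Here is some code that does this: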

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
xs = np.cumsum(np.random.rand(200))
# This function creates a general case where the max is not always at the end or the beginning
ys = xs**1.2 + 10 * np.sin(xs)

plt.plot(xs, ys, '+-')

threshold = 10
xs_thresh_ind = np.zeros_like(xs, dtype=int)
deltas = np.zeros_like(ys)

for i, x in enumerate(xs):
  # Index of the first sample beyond x + threshold (np.argmax returns 0 if there is none)
  period_end_ind = np.argmax(xs > x + threshold)

  # Only operate when the window is wide enough (this can be treated differently)
  if period_end_ind > 0:
    xs_thresh_ind[i] = period_end_ind

    # Find extrema in the period
    period_min = np.min(ys[i:period_end_ind + 1])
    period_max = np.max(ys[i:period_end_ind + 1])
    deltas[i] = period_max - period_min

max_ind_low = np.argmax(deltas)
max_ind_high = xs_thresh_ind[max_ind_low]
max_delta = deltas[max_ind_low]

print(
    'Max delta {:.2f} is in period x[{}]={:.2f},{:.2f} and x[{}]={:.2f},{:.2f}'
    .format(max_delta, max_ind_low, xs[max_ind_low], ys[max_ind_low],
            max_ind_high, xs[max_ind_high], ys[max_ind_high]))

df = pd.DataFrame(ys, index=xs)

OUTPUT:
Max delta 48.76 is in period x[167]=86.10,200.32 and x[189]=96.14,249.09

Is there an efficient, pandas-idiomatic way to achieve something similar?

Create a Series from the ys values, indexed by xs, but convert xs to actual timedelta elements rather than the equivalent floats.

ts = pd.Series(ys, index=pd.to_timedelta(xs, unit="s"))
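The conversion matters because offset-based windows such as rolling("10s") need a datetime-like index; as far as I know they raise an error on a plain float index. A minimal sketch with made-up sample values (not the data from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical irregular sample times in seconds, and matching values.
xs = np.array([0.5, 1.25, 3.0])
ys = np.array([10.0, 12.0, 11.0])

# Float seconds become Timedelta labels, so time-based windows can be used.
ts = pd.Series(ys, index=pd.to_timedelta(xs, unit="s"))
print(ts.index)
```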

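We want to apply a leading 10-second window in which we compute the difference between the maximum and the minimum. Because we want the window to be leading, we sort the series in descending order and apply a trailing window.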

deltas = ts.sort_index(ascending=False).rolling("10s").agg(lambda s: s.max() - s.min())

Find the maximum delta with deltas[deltas == deltas.max()], which gives

0 days 00:01:26.104797298    48.354851

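meaning that a delta of 48.35 was found in the interval [86.1, 96.1). This is slightly smaller than the 48.76 printed by the loop in the question, apparently because that loop's slice ys[i:period_end_ind + 1] also includes the first sample beyond the 10-second window (x[189]=96.14).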
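
As a cross-check on the leading-window idea, here is a minimal sketch of a variant that avoids sorting in descending order. It negates the time axis, so the ordinary trailing window of rolling("10s") covers [x, x + 10 s) in the original time, and it uses the built-in rolling max and min instead of a Python lambda. The function name forward_window_deltas and the small sample data are hypothetical, chosen for illustration only; this is not the code from the answer above.

```python
import numpy as np
import pandas as pd


def forward_window_deltas(xs, ys, window="10s"):
    """Max minus min of ys over the leading window [x, x + window), per x in seconds."""
    # Negate the times so a trailing window in negated time looks forward in real time.
    neg_idx = pd.to_timedelta(-np.asarray(xs, dtype=float), unit="s")
    ts = pd.Series(np.asarray(ys, dtype=float), index=neg_idx).sort_index()

    # Time-based windows are closed on the right by default: (t - window, t].
    roll = ts.rolling(window)
    deltas = roll.max() - roll.min()

    # Map the labels back to the original start times in seconds.
    deltas.index = -deltas.index.total_seconds().to_numpy()
    return deltas.sort_index()


# Hypothetical sample: irregular times (seconds) and values.
xs = [1, 4, 8, 11, 13, 26]
ys = [1, 0, 2, 100, 4, 6]

deltas = forward_window_deltas(xs, ys)
start = deltas.idxmax()        # start of the window with the largest delta
max_delta = deltas.loc[start]
print(f"Max delta {max_delta:.2f} in window [{start:.1f}, {start + 10:.1f})")
```

Here idxmax returns the label of the first maximum directly, which is a shorter alternative to the boolean mask deltas[deltas == deltas.max()] when only one window is needed. On long series the built-in reductions are likely to be faster than a lambda, since the lambda is called once per window from Python.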