有什么好的方法可以对时间序列数据块进行计算和赋值吗?

Is there any good way to perform calculation and assign values on time series data chunks?

我想对时间序列数据执行一些计算,并复制特定行的输出,我尝试了几种方法,例如对数据进行重采样来获取值。 以下是我试图获得的代码和输出。获得一些专家提示以有效地完成工作将非常有帮助。

问题:1. 我想从时间范围为 15 分钟的数据集中获取 'High' 'Low' 和“收盘价”。 2. 然后计算“(High+Low+Close)/3”并将此值复制到第二天的 15 分钟时间范围内。 等等。

以下为实际数据

............................................................
Datetime                 |   Open  | High |   Low   |Close
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
2021-01-04 09:15:00+05:30   2575.05 2591.95 2570.0  2571.5 ---------------Day 1
2021-01-04 09:30:00+05:30   2571.35 2575.05 2561.05 2569.35
2021-01-04 09:45:00+05:30   2568.6  2579.8  2568.6  2576.25
2021-01-04 10:00:00+05:30   2576.25 2578.0  2567.0  2567.8
2021-01-04 10:15:00+05:30   2568.25 2570.0  2564.25 2565.0
2021-01-04 10:30:00+05:30   2565.0  2565.15 2556.5  2557.45
2021-01-04 10:45:00+05:30   2558.35 2559.0  2543.85 2553.85
2021-01-04 11:00:00+05:30   2553.85 2555.55 2533.15 2553.55
2021-01-04 11:15:00+05:30   2553.5  2555.9  2547.25 2551.85
2021-01-04 11:30:00+05:30   2551.85 2576.7  2550.15 2571.8
2021-01-04 11:45:00+05:30   2571.45 2571.8  2565.1  2565.7
2021-01-04 12:00:00+05:30   2566.25 2569.9  2561.55 2565.45
2021-01-04 12:15:00+05:30   2564.65 2566.1  2560.25 2562.5
2021-01-04 12:30:00+05:30   2562.5  2568.15 2562.15 2566.0
2021-01-04 12:45:00+05:30   2566.45 2569.9  2565.1  2567.8
2021-01-04 13:00:00+05:30   2567.9  2568.0  2561.9  2565.0
2021-01-04 13:15:00+05:30   2565.0  2565.0  2556.4  2558.0
2021-01-04 13:30:00+05:30   2559.1  2567.9  2557.2  2565.65
2021-01-04 13:45:00+05:30   2565.8  2571.1  2565.0  2571.0
2021-01-04 14:00:00+05:30   2570.95 2575.0  2565.1  2572.75
2021-01-04 14:15:00+05:30   2572.4  2576.0  2567.45 2575.35
2021-01-04 14:30:00+05:30   2575.6  2580.8  2570.95 2578.65
2021-01-04 14:45:00+05:30   2579.25 2584.0  2575.55 2581.8
2021-01-04 15:00:00+05:30   2581.7  2582.7  2575.0  2579.7
2021-01-04 15:15:00+05:30   2579.75 2582.65 2575.0  2578.5
2021-01-05 09:15:00+05:30   2581.15 2623.45 2573.3  2611.0-----------------Day2
2021-01-05 09:30:00+05:30   2611.45 2628.0  2609.0  2628.0
2021-01-05 09:45:00+05:30   2627.0  2635.0  2622.5  2623.0
2021-01-05 10:00:00+05:30   2622.65 2630.0  2621.95 2626.7
2021-01-05 10:15:00+05:30   2626.65 2628.0  2617.0  2617.4
2021-01-05 10:30:00+05:30   2617.4  2624.55 2615.0  2616.95
2021-01-05 10:45:00+05:30   2616.95 2627.0  2615.1  2627.0
2021-01-05 11:00:00+05:30   2627.6  2642.0  2626.55 2635.85
2021-01-05 11:15:00+05:30   2635.85 2644.0  2633.2  2638.85
2021-01-05 11:30:00+05:30   2638.85 2640.05 2633.0  2639.85
2021-01-05 11:45:00+05:30   2639.9  2640.0  2633.0  2636.65
2021-01-05 12:00:00+05:30   2636.05 2639.95 2633.0  2638.95
2021-01-05 12:15:00+05:30   2639.15 2642.0  2635.2  2637.0
2021-01-05 12:30:00+05:30   2637.2  2649.0  2636.8  2644.5
2021-01-05 12:45:00+05:30   2644.8  2650.0  2644.45 2647.9
2021-01-05 13:00:00+05:30   2647.35 2650.0  2644.0  2644.65
2021-01-05 13:15:00+05:30   2644.65 2645.0  2628.55 2634.45
2021-01-05 13:30:00+05:30   2634.5  2643.5  2631.0  2641.0
2021-01-05 13:45:00+05:30   2641.05 2641.4  2633.0  2634.7
2021-01-05 14:00:00+05:30   2634.8  2638.9  2632.0  2632.0
2021-01-05 14:15:00+05:30   2632.0  2638.0  2630.0  2635.95
2021-01-05 14:30:00+05:30   2635.95 2642.0  2633.2  2639.95
2021-01-05 14:45:00+05:30   2639.85 2642.0  2637.25 2639.4
2021-01-05 15:00:00+05:30   2639.25 2658.0  2638.35 2656.9
2021-01-05 15:15:00+05:30   2656.15 2659.0  2646.95 2654.95
2021-01-06 09:15:00+05:30   2660.1  2662.1  2638.5  2639.0 ----------------Day3
2021-01-06 09:30:00+05:30   2639.0  2648.8  2635.2  2646.05

我尝试重新采样数据并获取高、低和收盘价,然后分别计算 h+L+C/3,然后通过键入复制它。 下面是重新采样的代码

df1 = df.resample('1D', # on='col name'   if the date column is not an index
                 origin='start').agg({'Open':'first',
                                      'Close':'last',
                                      'Low':'min',
                                      'High':'max'})

有没有什么好的方法可以轻松得到想要的df。 下面是所需的输出

for date 2021-01-04 Day's High = 2591.95, Low = 2533.15, Close = 2578.5
(High+Low+Close)/3 = 2567.68
the value 2567.68 get copied to next day 2021-01-05 showing below 

then wants to calculate H+L+C/3 on 2021-04-05 and then copy it to next day's 15 min time frame data shown below. and so on 
...........................................................................
Datetime                 |   Open  | High |   Low   |Close | ((H+L+C)/3) |
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
2021-01-04 09:15:00+05:30   2575.05 2591.95 2570.0  2571.5               ------Day1 
2021-01-04 09:30:00+05:30   2571.35 2575.05 2561.05 2569.35
2021-01-04 09:45:00+05:30   2568.6  2579.8  2568.6  2576.25
2021-01-04 10:00:00+05:30   2576.25 2578.0  2567.0  2567.8
2021-01-04 10:15:00+05:30   2568.25 2570.0  2564.25 2565.0
2021-01-04 10:30:00+05:30   2565.0  2565.15 2556.5  2557.45
2021-01-04 10:45:00+05:30   2558.35 2559.0  2543.85 2553.85
2021-01-04 11:00:00+05:30   2553.85 2555.55 2533.15 2553.55
2021-01-04 11:15:00+05:30   2553.5  2555.9  2547.25 2551.85
2021-01-04 11:30:00+05:30   2551.85 2576.7  2550.15 2571.8
2021-01-04 11:45:00+05:30   2571.45 2571.8  2565.1  2565.7
2021-01-04 12:00:00+05:30   2566.25 2569.9  2561.55 2565.45
2021-01-04 12:15:00+05:30   2564.65 2566.1  2560.25 2562.5
2021-01-04 12:30:00+05:30   2562.5  2568.15 2562.15 2566.0
2021-01-04 12:45:00+05:30   2566.45 2569.9  2565.1  2567.8
2021-01-04 13:00:00+05:30   2567.9  2568.0  2561.9  2565.0
2021-01-04 13:15:00+05:30   2565.0  2565.0  2556.4  2558.0
2021-01-04 13:30:00+05:30   2559.1  2567.9  2557.2  2565.65
2021-01-04 13:45:00+05:30   2565.8  2571.1  2565.0  2571.0
2021-01-04 14:00:00+05:30   2570.95 2575.0  2565.1  2572.75
2021-01-04 14:15:00+05:30   2572.4  2576.0  2567.45 2575.35
2021-01-04 14:30:00+05:30   2575.6  2580.8  2570.95 2578.65
2021-01-04 14:45:00+05:30   2579.25 2584.0  2575.55 2581.8
2021-01-04 15:00:00+05:30   2581.7  2582.7  2575.0  2579.7
2021-01-04 15:15:00+05:30   2579.75 2582.65 2575.0  2578.5
2021-01-05 09:15:00+05:30   2581.15 2623.45 2573.3  2611.0-   2567.68 -------day2
2021-01-05 09:30:00+05:30   2611.45 2628.0  2609.0  2628.0    2567.68
2021-01-05 09:45:00+05:30   2627.0  2635.0  2622.5  2623.0    2567.68
2021-01-05 10:00:00+05:30   2622.65 2630.0  2621.95 2626.7    2567.68
2021-01-05 10:15:00+05:30   2626.65 2628.0  2617.0  2617.4    2567.68
2021-01-05 10:30:00+05:30   2617.4  2624.55 2615.0  2616.95   2567.68
2021-01-05 10:45:00+05:30   2616.95 2627.0  2615.1  2627.0    2567.68
2021-01-05 11:00:00+05:30   2627.6  2642.0  2626.55 2635.85   2567.68
2021-01-05 11:15:00+05:30   2635.85 2644.0  2633.2  2638.85   2567.68
2021-01-05 11:30:00+05:30   2638.85 2640.05 2633.0  2639.85   2567.68
2021-01-05 11:45:00+05:30   2639.9  2640.0  2633.0  2636.65   2567.68
2021-01-05 12:00:00+05:30   2636.05 2639.95 2633.0  2638.95   2567.68
2021-01-05 12:15:00+05:30   2639.15 2642.0  2635.2  2637.0    2567.68
2021-01-05 12:30:00+05:30   2637.2  2649.0  2636.8  2644.5    2567.68
2021-01-05 12:45:00+05:30   2644.8  2650.0  2644.45 2647.9    2567.68
2021-01-05 13:00:00+05:30   2647.35 2650.0  2644.0  2644.65   2567.68
2021-01-05 13:15:00+05:30   2644.65 2645.0  2628.55 2634.45   2567.68
2021-01-05 13:30:00+05:30   2634.5  2643.5  2631.0  2641.0    2567.68
2021-01-05 13:45:00+05:30   2641.05 2641.4  2633.0  2634.7    2567.68
2021-01-05 14:00:00+05:30   2634.8  2638.9  2632.0  2632.0    2567.68
2021-01-05 14:15:00+05:30   2632.0  2638.0  2630.0  2635.95   2567.68
2021-01-05 14:30:00+05:30   2635.95 2642.0  2633.2  2639.95   2567.68
2021-01-05 14:45:00+05:30   2639.85 2642.0  2637.25 2639.4    2567.68
2021-01-05 15:00:00+05:30   2639.25 2658.0  2638.35 2656.9    2567.68
2021-01-05 15:15:00+05:30   2656.15 2659.0  2646.95 2654.95   2567.68
2021-01-06 09:15:00+05:30   2660.1  2662.1  2638.5  2639.0            -----Day3
2021-01-06 09:30:00+05:30   2639.0  2648.8  2635.2  2646.05

我是新手。 非常感谢您的帮助:)

如果可能,将 df1 中 DatetimeIndex 的第一个值与 df.index 中原始日期的第一个值相匹配是可能的聚合,创建均值(与求和除法 3 相同),移动第二天并分配回具有前向填充缺失值的新列:

df1 = df.resample('1D', # on='col name'   if the date column is not an index
                 origin='start').agg({'Open':'first',
                                      'Close':'last',
                                      'Low':'min',
                                      'High':'max'})
                                      
s = df1[['High','Low','Close']].mean(axis=1).shift()

df['new'] = s
df['new'] = df['new'].ffill()

另一个想法是使用 Series.dt.normalize for grouping per DatetimeIndex without times and for new column is used Index.map:

df1 = df.groupby(df.index.normalize()).agg({'Open':'first',
                                      'Close':'last',
                                      'Low':'min',
                                      'High':'max'})
                                      
s = df1[['High','Low','Close']].mean(axis=1).shift()

df['new'] = df.index.normalize().map(s)