Last 10 minutes per day
I am trying to get the trading volume done in the last 10 minutes of each day.
My data looks like this:
DF_Q
Out[97]:
LongTime
2016-01-04 09:30:00 35077034
2016-01-04 09:30:11 1119
2016-01-04 09:30:21 12295250
2016-01-04 09:30:23 1387856
2016-01-04 09:30:40 877954
...
2016-05-27 15:59:53 16986
2016-05-27 15:59:58 50080165
2016-05-27 15:59:59 17097260
Name: Volume, dtype: int64
I first resample the series to 10-minute intervals, which gives:
DF_Qmin = DF_Q.resample('10min').sum()
DF_Qmin
Out[102]:
LongTime
2016-01-04 09:30:00 3.202500e+05
2016-01-04 09:40:00 1.192028e+08
2016-01-04 09:50:00 6.156090e+07
2016-01-04 10:00:00 1.289250e+09
...
2016-05-27 15:20:00 1.035539e+09
2016-05-27 15:30:00 1.489631e+09
2016-05-27 15:40:00 2.228257e+09
2016-05-27 15:50:00 5.352179e+09
Freq: 10T, Name: Volume, dtype: float64
Then I build a pivot table, save it to Excel, and manually pick out the volume for the last 10 minutes of each day:
2016-01-04 16:50:00 3.693279e+09
2016-01-05 16:50:00 2.158429e+09
...
2016-05-26 15:50:00 1.256878e+08
2016-05-27 15:50:00 6.521489e+09
Can this be done without Excel? Or do I have to iterate over each day?
After resampling the Series/DataFrame, you can do this:
DF_Qmin.loc[DF_Qmin.index.minute == 50]
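One thing to watch with a minute-only filter: it keeps every bin that starts at minute 50 (09:50, 10:50, and so on), not just the closing one. A minimal sketch of a tighter mask follows. It assumes the closing bin always starts at 15:50, which is a guess based on the sample data, and the series `volume_10min` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical 10-minute volumes for two trading days (made-up numbers).
idx = pd.to_datetime([
    "2016-01-04 09:50:00", "2016-01-04 14:50:00", "2016-01-04 15:50:00",
    "2016-01-05 09:50:00", "2016-01-05 15:50:00",
])
volume_10min = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0], index=idx, name="Volume")

# minute == 50 alone matches all five rows here.
minute_only = volume_10min.loc[volume_10min.index.minute == 50]

# Assumption: the session's last bin starts at 15:50, so also match the hour.
mask = (volume_10min.index.hour == 15) & (volume_10min.index.minute == 50)
closing_bins = volume_10min.loc[mask]
print(closing_bins)
```

If the closing time varies between days, a fixed hour will not work and grouping by date is the safer route.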
I think you need groupby by date, aggregating with last. Then use rename_axis (new in pandas 0.18.0) and reset_index:
# if the LongTime column is needed
DF_Qmin = DF_Qmin.reset_index()
print (DF_Qmin.groupby(DF_Qmin.LongTime.dt.date).last())
Sample:
import pandas as pd
# keys listed chronologically so the printed order matches the output below
DF_Qmin = pd.Series({
    pd.Timestamp('2016-01-04 09:30:00'): 320250.0,
    pd.Timestamp('2016-01-04 09:40:00'): 119202800.0,
    pd.Timestamp('2016-01-04 09:50:00'): 61560900.0,
    pd.Timestamp('2016-01-04 10:00:00'): 1289250000.0,
    pd.Timestamp('2016-05-27 15:20:00'): 1035539000.0,
    pd.Timestamp('2016-05-27 15:30:00'): 1489631000.0,
    pd.Timestamp('2016-05-27 15:40:00'): 2228257000.0,
    pd.Timestamp('2016-05-27 15:50:00'): 5352179000.0,
}, name='Volume')
DF_Qmin.index.name = 'LongTime'
print (DF_Qmin)
LongTime
2016-01-04 09:30:00 3.202500e+05
2016-01-04 09:40:00 1.192028e+08
2016-01-04 09:50:00 6.156090e+07
2016-01-04 10:00:00 1.289250e+09
2016-05-27 15:20:00 1.035539e+09
2016-05-27 15:30:00 1.489631e+09
2016-05-27 15:40:00 2.228257e+09
2016-05-27 15:50:00 5.352179e+09
Name: Volume, dtype: float64
DF_Qmin = DF_Qmin.reset_index()
print (DF_Qmin)
LongTime Volume
0 2016-01-04 09:30:00 3.202500e+05
1 2016-01-04 09:40:00 1.192028e+08
2 2016-01-04 09:50:00 6.156090e+07
3 2016-01-04 10:00:00 1.289250e+09
4 2016-05-27 15:20:00 1.035539e+09
5 2016-05-27 15:30:00 1.489631e+09
6 2016-05-27 15:40:00 2.228257e+09
7 2016-05-27 15:50:00 5.352179e+09
print (DF_Qmin.groupby(DF_Qmin.LongTime.dt.date)
.last()
.rename_axis('Date')
.reset_index())
Date LongTime Volume
0 2016-01-04 2016-01-04 10:00:00 1.289250e+09
1 2016-05-27 2016-05-27 15:50:00 5.352179e+09
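If the LongTime column is not needed, skip the reset_index step and group the original Series (still indexed by LongTime) by its index dates: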
print (DF_Qmin.groupby(DF_Qmin.index.date)
.last()
.rename_axis('Date')
.reset_index())
Date Volume
0 2016-01-04 1.289250e+09
1 2016-05-27 5.352179e+09
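The whole pipeline, from raw ticks to one row per day, can be sketched end to end. This is an illustration under assumptions rather than the answer's exact method: the `ticks` series and its values are made up, and `min_count=1` plus `dropna()` are added so that bins with no trades (nights, non-trading days) do not end up as the "last" bin of a day, since recent pandas versions sum an empty bin to 0.

```python
import pandas as pd

# Hypothetical tick volumes on two non-adjacent days (made-up numbers).
ticks = pd.Series(
    [100, 200, 300, 400, 600],
    index=pd.to_datetime([
        "2016-01-04 15:45:10", "2016-01-04 15:52:00", "2016-01-04 15:59:59",
        "2016-01-06 15:49:59", "2016-01-06 15:55:30",
    ]),
    name="Volume",
)
ticks.index.name = "LongTime"

# min_count=1 leaves bins without any trades as NaN instead of 0,
# and dropna() then removes them.
bins = ticks.resample("10min").sum(min_count=1).dropna()

# Group by calendar day and keep the last remaining bin of each day.
last_bins = bins.groupby(bins.index.normalize()).last().rename_axis("Date")
print(last_bins)
```

Here `index.normalize()` is used as the group key instead of `index.date`, so the result keeps a DatetimeIndex; either key yields one group per calendar day.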