How to calculate vwap (volume weighted average price) using groupby and apply?
I have read multiple posts similar to my question, but I still can't figure it out. I have a pandas df which looks like the following (for multiple days):
Out[1]:
                     price  quantity
time
2016-06-08 09:00:22  32.30    1960.0
2016-06-08 09:00:22  32.30     142.0
2016-06-08 09:00:22  32.30    3857.0
2016-06-08 09:00:22  32.30    1000.0
2016-06-08 09:00:22  32.35     991.0
2016-06-08 09:00:22  32.30     447.0
...
To calculate the vwap I could do:
df['vwap'] = (np.cumsum(df.quantity * df.price) / np.cumsum(df.quantity))
However, I would like to start over every day (groupby), but I can't figure out how to make it work with a (lambda?) function.
df['vwap_day'] = df.groupby(df.index.date)['vwap'].apply(lambda ...
Speed is of the essence. I would appreciate any help :)
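Option 0
Plain vanilla approach
def vwap(df):
    # work on the raw NumPy arrays to avoid pandas overhead
    q = df.quantity.values
    p = df.price.values
    return df.assign(vwap=(p * q).cumsum() / q.cumsum())

# group by calendar day so the cumulative sums restart each day
df = df.groupby(df.index.date, group_keys=False).apply(vwap)
df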
                     price  quantity       vwap
time
2016-06-08 09:00:22  32.30    1960.0  32.300000
2016-06-08 09:00:22  32.30     142.0  32.300000
2016-06-08 09:00:22  32.30    3857.0  32.300000
2016-06-08 09:00:22  32.30    1000.0  32.300000
2016-06-08 09:00:22  32.35     991.0  32.306233
2016-06-08 09:00:22  32.30     447.0  32.305901
Option 1
Throwing in a little eval
df = df.assign(
    vwap=df.eval(
        'wgtd = price * quantity', inplace=False
    ).groupby(df.index.date).cumsum().eval('wgtd / quantity')
)
df
                     price  quantity       vwap
time
2016-06-08 09:00:22  32.30    1960.0  32.300000
2016-06-08 09:00:22  32.30     142.0  32.300000
2016-06-08 09:00:22  32.30    3857.0  32.300000
2016-06-08 09:00:22  32.30    1000.0  32.300000
2016-06-08 09:00:22  32.35     991.0  32.306233
2016-06-08 09:00:22  32.30     447.0  32.305901
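The sample output only covers a single day, so the daily restart is not visible in it. Here is a minimal sketch of the same cumulative-sum idea on a two-day frame. The first six rows are the sample rows from the question; the two rows for 2016-06-09 are made up for illustration, and `daily_vwap` is a hypothetical helper name, not something from the answers above.

```python
import pandas as pd


def daily_vwap(df: pd.DataFrame) -> pd.Series:
    """Cumulative VWAP that restarts at the first trade of each calendar day."""
    days = df.index.date
    # Running traded value and running volume, accumulated separately per day.
    sums = (
        df.assign(value=df["price"] * df["quantity"])[["value", "quantity"]]
        .groupby(days)
        .cumsum()
    )
    return sums["value"] / sums["quantity"]


index = pd.DatetimeIndex(
    ["2016-06-08 09:00:22"] * 6 + ["2016-06-09 09:00:01"] * 2,  # second day is invented
    name="time",
)
df = pd.DataFrame(
    {
        "price": [32.30, 32.30, 32.30, 32.30, 32.35, 32.30, 33.00, 33.10],
        "quantity": [1960.0, 142.0, 3857.0, 1000.0, 991.0, 447.0, 500.0, 500.0],
    },
    index=index,
)
df = df.assign(vwap=daily_vwap(df))
print(df)
```

The first row of 2016-06-09 should come out equal to its own price, since nothing from the previous day is carried over.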
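I have also used this method before, but it is not very accurate if you want to limit the calculation to a window period. Instead, I found that the TA Python library works really well: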
https://technical-analysis-library-in-python.readthedocs.io/en/latest/index.html
from ta.volume import VolumeWeightedAveragePrice
# ...
def vwap(dataframe, label='vwap', window=3, fillna=True):
    # expects 'high', 'low', 'close' and 'volume' columns
    dataframe[label] = VolumeWeightedAveragePrice(
        high=dataframe['high'], low=dataframe['low'], close=dataframe['close'],
        volume=dataframe['volume'], window=window, fillna=fillna,
    ).volume_weighted_average_price()
    return dataframe
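For trade data that only has `price` and `quantity` columns, as in the question, a windowed VWAP can also be sketched directly in pandas. This assumes that "limiting the window" means a trailing window over the last `window` rows; `rolling_vwap` and the sample `trades` frame are hypothetical, and this is not the TA library's implementation.

```python
import pandas as pd


def rolling_vwap(df: pd.DataFrame, window: int = 3) -> pd.Series:
    """VWAP over the trailing `window` rows (fewer rows at the start)."""
    value = df["price"] * df["quantity"]
    # min_periods=1 gives a result from the first row on instead of NaN.
    rolling_value = value.rolling(window, min_periods=1).sum()
    rolling_volume = df["quantity"].rolling(window, min_periods=1).sum()
    return rolling_value / rolling_volume


# Made-up trades, only to show the window sliding forward.
trades = pd.DataFrame(
    {"price": [10.0, 11.0, 12.0, 13.0], "quantity": [100.0, 100.0, 200.0, 100.0]}
)
trades["vwap_3"] = rolling_vwap(trades, window=3)
print(trades)
```

With a sorted DatetimeIndex, passing an offset such as `"5min"` to `rolling` instead of a row count should give a time-based window in the same way.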