How to compare values in a DataFrame with a MultiIndex against an equally indexed Series
I have a DataFrame called chunk that looks like this:
                                      Expiration    DataDate  durations
UnderlyingSymbol Delta   OptionSymbol
A                 0.9991 32500        2020-05-15  2020-05-01    14 days
                         35000        2020-05-15  2020-05-01    14 days
                         37500        2020-05-15  2020-05-01    15 days
                         40000        2020-05-15  2020-05-01    16 days
                         42500        2020-05-15  2020-05-01    14 days
                         45000        2020-05-15  2020-05-01    13 days
                 -0.9152 15000        2020-11-20  2020-05-01   203 days
AAL              -0.9142 20000        2020-06-05  2020-05-01    35 days
I also have a Series called durations_star whose MultiIndex matches the first two levels of the DataFrame's index:
UnderlyingSymbol  Delta
A                  0.9991    14 days
                  -0.9152   203 days
AAL               -0.9142    35 days
Name: durations, dtype: timedelta64[ns]
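Is there a way to select the rows of the DataFrame whose durations value, for each (UnderlyingSymbol, Delta) pair, equals (once converted to integers) the corresponding value in the Series?
For example: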
                                      Expiration    DataDate  durations
UnderlyingSymbol Delta   OptionSymbol
A                 0.9991 32500        2020-05-15  2020-05-01    14 days
                         35000        2020-05-15  2020-05-01    14 days
                         42500        2020-05-15  2020-05-01    14 days
                 -0.9152 15000        2020-11-20  2020-05-01   203 days
AAL              -0.9142 20000        2020-06-05  2020-05-01    35 days
Is there a better way than this:
chunk.reset_index().groupby(['UnderlyingSymbol', 'Delta']).apply(lambda t: t[t.durations == durations_star[t.name[0], t.name[1]]])
You can use join:
>>> df.join(sr.rename('durations2')) \
...   .query('durations == durations2') \
...   .drop(columns='durations2')
                                      Expiration    DataDate  durations
UnderlyingSymbol Delta   OptionSymbol
A                -0.9152 15000        2020-11-20  2020-05-01   203 days
                  0.9991 32500        2020-05-15  2020-05-01    14 days
                         35000        2020-05-15  2020-05-01    14 days
                         42500        2020-05-15  2020-05-01    14 days
AAL              -0.9142 20000        2020-06-05  2020-05-01    35 days
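If the original row order matters, or you would rather not add a temporary column, one possible variant is to look up the target duration for each row and filter with a boolean mask. This is only a sketch and not part of the join approach: it assumes both duration columns are real timedelta values and that the Series index has no duplicate keys. The data is a trimmed-down stand-in for the question's chunk and durations_star.

```python
import pandas as pd

# Trimmed-down stand-ins for the question's data (only the durations column).
chunk = pd.DataFrame(
    {"durations": pd.to_timedelta([14, 15, 14, 203, 35], unit="D")},
    index=pd.MultiIndex.from_tuples(
        [
            ("A", 0.9991, 32500),
            ("A", 0.9991, 37500),
            ("A", 0.9991, 42500),
            ("A", -0.9152, 15000),
            ("AAL", -0.9142, 20000),
        ],
        names=["UnderlyingSymbol", "Delta", "OptionSymbol"],
    ),
)
durations_star = pd.Series(
    pd.to_timedelta([14, 203, 35], unit="D"),
    index=pd.MultiIndex.from_tuples(
        [("A", 0.9991), ("A", -0.9152), ("AAL", -0.9142)],
        names=["UnderlyingSymbol", "Delta"],
    ),
    name="durations",
)

# Drop the third level so every row is keyed by (UnderlyingSymbol, Delta).
keys = chunk.index.droplevel("OptionSymbol")

# Look up the target duration for every row; repeated keys repeat the value.
target = durations_star.reindex(keys)

# Compare position by position and keep the matching rows in their original order.
mask = chunk["durations"].to_numpy() == target.to_numpy()
selected = chunk[mask]
print(selected)
```

Rows whose (UnderlyingSymbol, Delta) pair is missing from the Series get NaT as their target and are therefore dropped by the comparison.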
Setup:
# Your DataFrame
>>> df
                                      Expiration    DataDate  durations
UnderlyingSymbol Delta   OptionSymbol
A                 0.9991 32500        2020-05-15  2020-05-01    14 days
                         35000        2020-05-15  2020-05-01    14 days
                         37500        2020-05-15  2020-05-01    15 days
                         40000        2020-05-15  2020-05-01    16 days
                         42500        2020-05-15  2020-05-01    14 days
                         45000        2020-05-15  2020-05-01    13 days
                 -0.9152 15000        2020-11-20  2020-05-01   203 days
AAL              -0.9142 20000        2020-06-05  2020-05-01    35 days

# Your Series
>>> sr
UnderlyingSymbol  Delta
A                  0.9991    14 days
                  -0.9152   203 days
AAL               -0.9142    35 days
Name: durations, dtype: object
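To try this yourself, the setup can be rebuilt roughly as follows. This is a sketch under a few assumptions: the values are copied from the question, both durations are stored as timedelta64 (as in the question, not the object dtype printed above), and a plain boolean mask stands in for query() so the comparison does not depend on which expression engine is installed.

```python
import pandas as pd

# Three-level index of the question's DataFrame.
index = pd.MultiIndex.from_tuples(
    [
        ("A", 0.9991, 32500),
        ("A", 0.9991, 35000),
        ("A", 0.9991, 37500),
        ("A", 0.9991, 40000),
        ("A", 0.9991, 42500),
        ("A", 0.9991, 45000),
        ("A", -0.9152, 15000),
        ("AAL", -0.9142, 20000),
    ],
    names=["UnderlyingSymbol", "Delta", "OptionSymbol"],
)
df = pd.DataFrame(
    {
        "Expiration": pd.to_datetime(["2020-05-15"] * 6 + ["2020-11-20", "2020-06-05"]),
        "DataDate": pd.to_datetime(["2020-05-01"] * 8),
        "durations": pd.to_timedelta([14, 14, 15, 16, 14, 13, 203, 35], unit="D"),
    },
    index=index,
)

# Series indexed by the first two levels only.
sr = pd.Series(
    pd.to_timedelta([14, 203, 35], unit="D"),
    index=pd.MultiIndex.from_tuples(
        [("A", 0.9991), ("A", -0.9152), ("AAL", -0.9142)],
        names=["UnderlyingSymbol", "Delta"],
    ),
    name="durations",
)

# join matches on the shared index level names and repeats each Series value
# across all OptionSymbol rows belonging to the same (UnderlyingSymbol, Delta).
joined = df.join(sr.rename("durations2"))

# Keep the rows where both durations agree, then drop the helper column.
result = joined[joined["durations"] == joined["durations2"]].drop(columns="durations2")
print(result)
```

The rename is needed because the Series and the DataFrame column would otherwise share the name durations and join would raise on the overlapping column.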