Filtering pandas df with level values
I have the following pandas df:
df
                        price           max    maxperhour
Site Commodity Type
Mid  Biomass   Stock      6.0  1.500000e+15  1.500000e+15
     CO2       Env        0.0  1.500000e+15  1.500000e+15
     Coal      Stock      7.0  1.500000e+15  1.500000e+15
     Elec      Demand     NaN           NaN           NaN
     Gas       Stock     27.0  1.500000e+15  1.500000e+15
     Hydro     SupIm      NaN           NaN           NaN
     Lignite   Stock      4.0  1.500000e+15  1.500000e+15
     Solar     SupIm      NaN           NaN           NaN
     Wind      SupIm      NaN           NaN           NaN
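I want to filter the df above and create a list of the Commodity items where Site == 'Mid' and Type is either 'Stock' or 'Demand'.
So the following list should be created with some pandas filtering function:
df.somefunction()
['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec']
How can I achieve this?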
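Finally, if possible, I would like 'Elec' to be the last element. When the list is created, 'Elec' might end up as the third element, for example:
['Biomass', 'Coal', 'Elec', 'Gas', 'Lignite']
However, it would be best to get 'Elec' as the last element, since it is the only element with Type == 'Demand':
['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec']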
From @jezrael:
df[(df.index.get_level_values('Site') == 'Mid') & (df.index.get_level_values('Type') == 'Stock')].index.remove_unused_levels().get_level_values('Commodity').tolist()
Solution with MultiIndex:
m1 = (df.index.get_level_values('Site') == 'Mid')
m2 = (df.index.get_level_values('Type') == 'Stock')
m3 = (df.index.get_level_values('Type') == 'Demand')
idx1 = df[m1 & m2].index.remove_unused_levels().get_level_values('Commodity')
idx2 = df[m1 & m3].index.remove_unused_levels().get_level_values('Commodity')
idx = idx1.append(idx2)
print (idx)
Index(['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec'], dtype='object', name='Commodity')
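For anyone who wants to run the MultiIndex approach end to end, here is a minimal sketch. It rebuilds the example frame from the printed output in the question (the construction code and the names `rows`, `wanted_types` and `result` are my own assumptions, not part of the original post) and collects the commodities one Type at a time, so the order of `wanted_types` decides where 'Elec' lands.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; values are copied from the df printed in the question.
rows = [
    ("Mid", "Biomass", "Stock", 6.0, 1.5e15, 1.5e15),
    ("Mid", "CO2", "Env", 0.0, 1.5e15, 1.5e15),
    ("Mid", "Coal", "Stock", 7.0, 1.5e15, 1.5e15),
    ("Mid", "Elec", "Demand", np.nan, np.nan, np.nan),
    ("Mid", "Gas", "Stock", 27.0, 1.5e15, 1.5e15),
    ("Mid", "Hydro", "SupIm", np.nan, np.nan, np.nan),
    ("Mid", "Lignite", "Stock", 4.0, 1.5e15, 1.5e15),
    ("Mid", "Solar", "SupIm", np.nan, np.nan, np.nan),
    ("Mid", "Wind", "SupIm", np.nan, np.nan, np.nan),
]
cols = ["Site", "Commodity", "Type", "price", "max", "maxperhour"]
df = pd.DataFrame(rows, columns=cols).set_index(["Site", "Commodity", "Type"])

# Pull each index level out once; comparisons on them give boolean arrays.
site = df.index.get_level_values("Site")
typ = df.index.get_level_values("Type")
commodity = df.index.get_level_values("Commodity")

# Hypothetical helper list: its order defines the order of the result.
wanted_types = ["Stock", "Demand"]

result = []
for t in wanted_types:
    mask = (site == "Mid") & (typ == t)
    result.extend(commodity[mask].tolist())

print(result)
```

Swapping the entries in `wanted_types` would put the 'Demand' commodities first instead.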
Alternative with columns:
df1 = df.reset_index()
m1 = (df1['Site'] == 'Mid')
m2 = (df1['Type'] == 'Stock')
m3 = (df1['Type'] == 'Demand')
idx1 = df1.loc[m1 & m2, 'Commodity']
idx2 = df1.loc[m1 & m3, 'Commodity']
# Series.append was removed in pandas 2.0, so join the two Series with pd.concat
idx = pd.concat([idx1, idx2]).tolist()
print (idx)
['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec']
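If more than two Type values are involved, building one mask per Type gets repetitive. A possible variation, sketched below under my own assumptions (the trimmed `df1`, the `order` list and the temporary `_rank` column are illustrative names, not from the original answer), filters once with `isin` and then uses a stable sort on the position of each Type in `order`, so rows keep their original order within each Type.

```python
import pandas as pd

# Trimmed stand-in for df.reset_index(); only the columns needed here.
df1 = pd.DataFrame({
    "Site": ["Mid"] * 6,
    "Commodity": ["Biomass", "CO2", "Coal", "Elec", "Gas", "Lignite"],
    "Type": ["Stock", "Env", "Stock", "Demand", "Stock", "Stock"],
})

# Types to keep; their position in this list is the sort priority.
order = ["Stock", "Demand"]

# One combined filter instead of one mask per Type.
sel = df1[(df1["Site"] == "Mid") & df1["Type"].isin(order)]

# Map each Type to its position in `order`, then sort stably on that rank.
rank = pd.Categorical(sel["Type"], categories=order, ordered=True).codes
out = (
    sel.assign(_rank=rank)
       .sort_values("_rank", kind="stable")["Commodity"]
       .tolist()
)

print(out)
```

The stable sort matters here: it guarantees that 'Biomass', 'Coal', 'Gas' and 'Lignite' stay in their original relative order while 'Elec' moves to the end.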