Filtering Pandas Dataframe according to indices

Consider the following dataframe:

m.transmission

                                         eff    inv-cost    fix-cost  var-cost
Site In Site Out Transmission Commodity

Mid     North    hvac         Elec       0.90   1650000     16500         0
Mid     South    hvac         Elec       0.90   1650000     16500         0
North   Mid      hvac         Elec       0.90   1650000     16500         0
North   South    hvac         Elec       0.85   3000000     30000         0
South   Mid      hvac         Elec       0.90   1650000     16500         0
South   North    hvac         Elec       0.85   3000000     30000         0

I would like to filter the rows where Site In == 'Mid' or Site Out == 'Mid'.

How can I do that? Before anyone suggests it, this is not the desired result:

m.transmission.loc[['Mid']]

Site In Site Out Transmission Commodity 
Mid     North    hvac         Elec
Mid     South    hvac         Elec

because it only keeps the rows where Site In == 'Mid'.

The desired output would be (with the columns too, of course, e.g. eff, inv-cost, fix-cost, var-cost):

Site In Site Out Transmission Commodity 
Mid     North    hvac         Elec
Mid     South    hvac         Elec
North   Mid      hvac         Elec
South   Mid      hvac         Elec

Additional information:

(Pdb) m.transmission.columns
Index(['eff', 'inv-cost', 'fix-cost', 'var-cost', 'inst-cap', 'cap-lo',
       'cap-up', 'wacc', 'depreciation'],
      dtype='object')
(Pdb) m.transmission.index
MultiIndex(levels=[['Mid', 'North', 'South'], ['Mid', 'North', 'South'], ['hvac'], ['Elec']],
           labels=[[0, 0, 1, 1, 2, 2], [1, 2, 0, 2, 0, 1], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]],
           names=['Site In', 'Site Out', 'Transmission', 'Commodity'])

If those "columns" are actually levels of the index, build the boolean mask from the index level values:

In [217]: df.loc[(df.index.get_level_values('Site In') == 'Mid') | 
                 (df.index.get_level_values('Site Out') == 'Mid')]
Out[217]:
                                         v
Site In Site Out Transmission Commodity
Mid     North    hvac         Elec       1
        South    hvac         Elec       1
North   Mid      hvac         Elec       1
South   Mid      hvac         Elec       1
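For reference, here is a self-contained sketch of the same index-level mask. The frame is rebuilt by hand from the values shown in the question (only two of its columns are included), and the names `transmission`, `site_in`, `site_out`, `mask` and `filtered` are chosen just for this illustration.

```python
import pandas as pd

# Rebuild a reduced version of the question's frame (assumed layout:
# a four-level MultiIndex and numeric columns; only two columns kept here).
index = pd.MultiIndex.from_tuples(
    [
        ("Mid", "North", "hvac", "Elec"),
        ("Mid", "South", "hvac", "Elec"),
        ("North", "Mid", "hvac", "Elec"),
        ("North", "South", "hvac", "Elec"),
        ("South", "Mid", "hvac", "Elec"),
        ("South", "North", "hvac", "Elec"),
    ],
    names=["Site In", "Site Out", "Transmission", "Commodity"],
)
transmission = pd.DataFrame(
    {
        "eff": [0.90, 0.90, 0.90, 0.85, 0.90, 0.85],
        "inv-cost": [1650000, 1650000, 1650000, 3000000, 1650000, 3000000],
    },
    index=index,
)

# Pull each index level out as a flat Index and compare element-wise.
site_in = transmission.index.get_level_values("Site In")
site_out = transmission.index.get_level_values("Site Out")

# Keep a row when either level equals 'Mid'.
mask = (site_in == "Mid") | (site_out == "Mid")
filtered = transmission.loc[mask]
print(filtered)
```

The parentheses around each comparison matter, because `|` binds more tightly than `==` in Python.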

If it is a flat dataframe (the sites are ordinary columns), use:

In [168]: df.loc[(df['Site In'] == 'Mid') | (df['Site Out'] == 'Mid')]
Out[168]:
  Site In Site Out Transmission Commodity
0     Mid    North         hvac      Elec
1     Mid    South         hvac      Elec
2   North      Mid         hvac      Elec
4   South      Mid         hvac      Elec

or

In [169]: df.loc[df['Site In'].eq('Mid') | df['Site Out'].eq('Mid')]
Out[169]:
  Site In Site Out Transmission Commodity
0     Mid    North         hvac      Elec
1     Mid    South         hvac      Elec
2   North      Mid         hvac      Elec
4   South      Mid         hvac      Elec
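If the frame is indexed but the flat-column syntax is preferred, one possible route is to move the levels into columns with `reset_index()`, filter, and restore the index afterwards. This is only a sketch: the frame below is a hand-built stand-in for the question's data, and `indexed`, `flat`, `kept` and `result` are illustrative names.

```python
import pandas as pd

# Stand-in for the question's frame (assumed layout, one column only).
indexed = pd.DataFrame(
    {"eff": [0.90, 0.90, 0.90, 0.85, 0.90, 0.85]},
    index=pd.MultiIndex.from_tuples(
        [
            ("Mid", "North", "hvac", "Elec"),
            ("Mid", "South", "hvac", "Elec"),
            ("North", "Mid", "hvac", "Elec"),
            ("North", "South", "hvac", "Elec"),
            ("South", "Mid", "hvac", "Elec"),
            ("South", "North", "hvac", "Elec"),
        ],
        names=["Site In", "Site Out", "Transmission", "Commodity"],
    ),
)

# Remember the level names so the index can be rebuilt after filtering.
levels = list(indexed.index.names)

# Turn the index levels into ordinary columns.
flat = indexed.reset_index()

# Same column-based filter as in the flat-dataframe answer.
kept = flat.loc[flat["Site In"].eq("Mid") | flat["Site Out"].eq("Mid")]

# Put the four levels back into the index.
result = kept.set_index(levels)
print(result)
```

The round trip copies the data, so for large frames the mask built directly from the index levels is likely the cheaper option.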

Update:

Demo using DataFrame.query():

In [94]: df
Out[94]:
                                         val
Site_In Site_Out Transmission Commodity
Mid     North    hvac         Elec         1
        South    hvac         Elec         2
North   Mid      hvac         Elec         3
        South    hvac         Elec         4
South   Mid      hvac         Elec         5
        North    hvac         Elec         6

In [95]: df.query("Site_In == 'Mid' or Site_Out == 'Mid'")
Out[95]:
                                         val
Site_In Site_Out Transmission Commodity
Mid     North    hvac         Elec         1
        South    hvac         Elec         2
North   Mid      hvac         Elec         3
South   Mid      hvac         Elec         5

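Note: as written, this only works when the index/column names contain no spaces (hence Site_In and Site_Out here instead of Site In and Site Out). Newer pandas versions (0.25 and later) also accept backtick-quoted column names that contain spaces inside query().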
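Since the original frame uses level names with spaces, one way to still use `query()` is to rename the levels first. The sketch below assumes a hand-built frame with a single `val` column, as in the demo; `renamed` and `result` are illustrative names, and `rename_axis` with a dict is just one of several ways to rename index levels.

```python
import pandas as pd

# Demo frame whose index level names contain spaces, as in the question.
df = pd.DataFrame(
    {"val": [1, 2, 3, 4, 5, 6]},
    index=pd.MultiIndex.from_tuples(
        [
            ("Mid", "North", "hvac", "Elec"),
            ("Mid", "South", "hvac", "Elec"),
            ("North", "Mid", "hvac", "Elec"),
            ("North", "South", "hvac", "Elec"),
            ("South", "Mid", "hvac", "Elec"),
            ("South", "North", "hvac", "Elec"),
        ],
        names=["Site In", "Site Out", "Transmission", "Commodity"],
    ),
)

# Rename only the two levels that contain spaces; the others are left alone.
renamed = df.rename_axis(index={"Site In": "Site_In", "Site Out": "Site_Out"})

# query() can refer to index levels by name once the names are valid identifiers.
result = renamed.query("Site_In == 'Mid' or Site_Out == 'Mid'")
print(result)
```

The original `df` is left unchanged, because `rename_axis` returns a new frame by default.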

You can try this. It is easy to implement (it assumes Site In and Site Out are ordinary columns):

In [12]: df[(df["Site In"]=="Mid")|(df["Site Out"]=="Mid")]
Out[12]:
  Site In Site Out Transmission Commodity
0     Mid    North         hvac      Elec
1     Mid    South         hvac      Elec
2   North      Mid         hvac      Elec
4   South      Mid         hvac      Elec