Pandas DataFrame selection based on values from another DataFrame

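I have a dataframe with multiple ratios. I am creating a grouped dataframe, pgDF, and want to query summDF based on the mean values calculated from the grouped data.

My idea is to create a selection like the one below. EquityMultiplierRatio is one of the columns in summDF. I want to look up the mean value in the pgDF dataframe based on each summDF row's Industry and Segment.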

summDF[summDF['EquityMultiplierRatio'] < pgDF.loc[summDF['Industry'], summDF['Segment']]['meanEquityMultiplierRatio']]

import pandas as pd

# summDF and dataFormatter are defined earlier (not shown).
# Format the float columns, then reattach the string columns.
dataDF = summDF.select_dtypes(include=['float64']).applymap(dataFormatter)
strDF = summDF.select_dtypes(include=['object'])
summDF = pd.concat([strDF, dataDF], axis=1)

# Collect every column whose name contains 'Ratio'.
colsd = [x for x in summDF.columns if 'Ratio' in x]

# Mean of each ratio per (Industry, Segment) group, with a 'mean' prefix.
groupInd = summDF.groupby(['Industry', 'Segment'])
pgDF = groupInd[colsd].mean()  # .to_csv('GroupedData.csv')
pgDF.columns = ['mean' + x for x in pgDF.columns]

summDF data

     Symbol                   Name PriceCCY  YearLowDate YearHighDate  \
0  NESN:VTX              Nestle SA      CHF  Dec 02 2016  Jun 26 2017   
1  NOVN:VTX            Novartis AG      CHF  Nov 04 2016  Jun 23 2017   
2   ROG:VTX       Roche Holding AG      CHF  Dec 02 2016  May 09 2017   
3  HSBA:LSE      HSBC Holdings PLC      GBX  Aug 31 2016  Jul 31 2017   
4  RDSA:LSE  Royal Dutch Shell PLC      GBX  Sep 27 2016  Jan 16 2017   

  MarketCapCCY EPSCCY AnnualDiviCCY AnnualDiviYield   DiviExDate    ...     \
0          CHF    CHF           CHF           2.84%  Apr 10 2017    ...      
1          CHF    CHF           CHF           3.47%  Mar 02 2017    ...      
2          CHF    CHF           CHF           3.40%  Mar 16 2017    ...      
3          GBP    GBP           GBX           5.32%  Aug 03 2017    ...      
4          EUR    EUR           GBX           6.91%  Aug 10 2017    ...      

  STDebttoICRatio TaxBurdenRatio InterestBurdenRatio MarginRatio  \
0          0.1424         0.7092              0.9051      0.1547   
1          0.0577         0.8569              0.9168      0.1726   
2          0.1189         0.7483              0.9498      0.2708   
3             NaN         1.0000                 NaN         NaN   
4          0.0298         0.8521              0.7364      0.0326   

  AssetTurnoverRatio EquityMultiplierRatio ReturnonEquityRatio  \
0             0.6783                2.0421              0.1375   
1             0.3795                1.7389              0.0895   
2             0.6584                3.2127              0.4071   
3                NaN               13.5415              0.0406   
4             0.5680                2.2035              0.0256   

   PtoTBVPerShare  PtoBVPerShare  PtoSales  
0         22.5302         3.9019    2.8169  
1         15.9968         2.6747    4.0528  
2        355.8525         8.6764    4.1020  
3          1.2524         1.1000       NaN  
4          1.3780         1.2010    0.9597

pgDF data

                                     meanCashOprRatio  meanCurrentRatio  \
Industry        Segment                                                   
Basic Materials Chemicals                    0.578000          1.859360   
                Forestry & Paper             0.582920          1.328260   
                Industrial Metals            0.330800          1.564900   
                Mining                       1.566150          2.814140   
Consumer Goods  Automobiles & Parts          0.272469          1.476006   

                                     meanQuickRatio  meanCashRatio  \
Industry        Segment                                              
Basic Materials Chemicals                  1.180216       0.463956   
                Forestry & Paper           0.770320       0.217080   
                Industrial Metals          0.774533       0.365133   
                Mining                     2.027540       1.401680   
Consumer Goods  Automobiles & Parts        1.113837       0.598381   

                                     meanDebttoAssetRatio  \
Industry        Segment                                     
Basic Materials Chemicals                        0.538152   
                Forestry & Paper                 0.505640   
                Industrial Metals                0.524033   
                Mining                           0.501900   
Consumer Goods  Automobiles & Parts              0.658581   

                                     meanDebtoCapitalRatio  \
Industry        Segment                                      
Basic Materials Chemicals                         0.339304   
                Forestry & Paper                  0.302560   
                Industrial Metals                 0.279833   
                Mining                            0.391257   
Consumer Goods  Automobiles & Parts               0.426271   

                                     meanDebttoEquityRatio  \
Industry        Segment                                      
Basic Materials Chemicals                         0.652592   
                Forestry & Paper                  0.446280   
                Industrial Metals                 0.442300   
                Mining                            0.695986   
Consumer Goods  Automobiles & Parts               0.921236   

                                     meanInterestCoverageRatio  \
Industry        Segment                                          
Basic Materials Chemicals                            66.589904   
                Forestry & Paper                     13.094880   
                Industrial Metals                    14.631200   
                Mining                               38.101220   
Consumer Goods  Automobiles & Parts                  25.721947   

                                     meanGrossProfitMarginRatio  \
Industry        Segment                                           
Basic Materials Chemicals                              0.660825   
                Forestry & Paper                       0.668180   
                Industrial Metals                      0.770800   
                Mining                                 0.597567   
Consumer Goods  Automobiles & Parts                    1.922875   

                                     meanOperatingProfitMarginRatio  \
Industry        Segment                                               
Basic Materials Chemicals                                  0.132728   
                Forestry & Paper                           0.105120   
                Industrial Metals                          0.077233   
                Mining                                     0.201630   
Consumer Goods  Automobiles & Parts                       87.706919   

                                     meanNetProfitMarginRatio  \
Industry        Segment                                         
Basic Materials Chemicals                            0.103144   
                Forestry & Paper                     0.087080   
                Industrial Metals                    0.052533   
                Mining                               0.163700   
Consumer Goods  Automobiles & Parts                 85.941688   

                                     meanReturnonAssetsRatio  \
Industry        Segment                                        
Basic Materials Chemicals                           0.108065   
                Forestry & Paper                    0.096200   
                Industrial Metals                   0.060967   
                Mining                              0.109590   
Consumer Goods  Automobiles & Parts                 0.074853   

                                     meanLTDebttoICRatio  meanSTDebttoICRatio  \
Industry        Segment                                                         
Basic Materials Chemicals                       0.258408             0.056579   
                Forestry & Paper                0.220620             0.078540   
                Industrial Metals               0.182733             0.089633   
                Mining                          0.271450             0.070014   
Consumer Goods  Automobiles & Parts             0.318544             0.281814   

                                     meanTaxBurdenRatio  \
Industry        Segment                                   
Basic Materials Chemicals                      0.880248   
                Forestry & Paper               0.913420   
                Industrial Metals              0.703733   
                Mining                         0.874950   
Consumer Goods  Automobiles & Parts            0.875575   

                                     meanInterestBurdenRatio  meanMarginRatio  \
Industry        Segment                                                         
Basic Materials Chemicals                           0.805548         0.136052   
                Forestry & Paper                    0.781280         0.120640   
                Industrial Metals                   0.824800         0.086767   
                Mining                              0.691380         0.244280   
Consumer Goods  Automobiles & Parts                 0.840720        93.815747   

                                     meanAssetTurnoverRatio  \
Industry        Segment                                       
Basic Materials Chemicals                          0.895612   
                Forestry & Paper                   0.791480   
                Industrial Metals                  0.716767   
                Mining                             0.531190   
Consumer Goods  Automobiles & Parts                0.864700   

                                     meanEquityMultiplierRatio  \
Industry        Segment                                          
Basic Materials Chemicals                             2.400252   
                Forestry & Paper                      2.043860   
                Industrial Metals                     2.195533   
                Mining                                2.150580   
Consumer Goods  Automobiles & Parts                   3.907062   

                                     meanReturnonEquityRatio  
Industry        Segment                                       
Basic Materials Chemicals                           0.165624  
                Forestry & Paper                    0.143180  
                Industrial Metals                   0.075767  
                Mining                              0.157280  
Consumer Goods  Automobiles & Parts                 0.262512  

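It looks like I have solved this, so I wanted to share. Not sure whether it will help anyone else, but...

I wanted something closer to a SQL inner join for querying the two dataframes, so I first merged the data with an inner join, matching the columns of one DF against the index of the other DF.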

mergeDF = pd.merge(summDF, pgDF, left_on=['Industry', 'Segment'], right_index=True, how='inner')
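
To make the merge concrete, here is a minimal sketch on a tiny made-up frame. The names `summ` and `pg` and all of the values are invented stand-ins for summDF and pgDF, not the real data; only the `pd.merge` call mirrors the line above.

```python
import pandas as pd

# Hypothetical stand-in for summDF (values invented for illustration).
summ = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC", "DDD"],
    "Industry": ["Basic Materials", "Basic Materials",
                 "Consumer Goods", "Consumer Goods"],
    "Segment": ["Mining", "Mining",
                "Automobiles & Parts", "Automobiles & Parts"],
    "EquityMultiplierRatio": [1.0, 3.0, 2.0, 6.0],
})

# Hypothetical stand-in for pgDF: group means indexed by (Industry, Segment),
# with the same "mean" prefix on the column names.
pg = (
    summ.groupby(["Industry", "Segment"])[["EquityMultiplierRatio"]]
    .mean()
    .add_prefix("mean")
)

# Match the two key columns on the left against the two index levels on the right.
merged = pd.merge(
    summ, pg,
    left_on=["Industry", "Segment"],
    right_index=True,
    how="inner",
)

print(merged[["Symbol", "EquityMultiplierRatio", "meanEquityMultiplierRatio"]])
```

Each row of `merged` now carries its own group's mean next to its own ratio, so the comparison becomes a plain column-to-column test.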

After that, querying the merged frame became much easier.

mergeDF[mergeDF['EquityMultiplierRatio'] < mergeDF['meanEquityMultiplierRatio']]
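
As a possible alternative (not what I did above), the same kind of filter can be sketched without an intermediate merged frame by using `groupby(...).transform('mean')`, which returns the group mean aligned to each original row. The frame `summ` and its values below are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for summDF (values invented for illustration).
summ = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC", "DDD"],
    "Industry": ["Basic Materials", "Basic Materials",
                 "Consumer Goods", "Consumer Goods"],
    "Segment": ["Mining", "Mining",
                "Automobiles & Parts", "Automobiles & Parts"],
    "EquityMultiplierRatio": [1.0, 3.0, 2.0, 6.0],
})

# transform("mean") yields a Series with summ's own index,
# holding the (Industry, Segment) group mean for every row.
group_mean = (
    summ.groupby(["Industry", "Segment"])["EquityMultiplierRatio"]
    .transform("mean")
)

# Keep the rows whose ratio is below their group's mean.
below = summ[summ["EquityMultiplierRatio"] < group_mean]
print(below[["Symbol", "EquityMultiplierRatio"]])
```

This only works when the means come from the same frame being filtered; if pgDF was computed from different data, the merge is still the way to go.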

Hope this helps.