AttributeError: 'tuple' object has no attribute 'loc' when filtering on pandas dataframe

Given the following DataFrame:

json_path Reporting Group Entity/Grouping Entity ID Adjusted Value (Today, No Div, USD) Adjusted TWR (Current Quarter, No Div, USD) Adjusted TWR (YTD, No Div, USD) Annualized Adjusted TWR (Since Inception, No Div, USD) Adjusted Value (No Div, USD) TWR Audit Note
data.attributes.total.children.[0].children.[0].children.[0] Barrack Family William and Rupert Trust 9957007 -1.44 -1.44
data.attributes.total.children.[0].children.[0].children.[0].children.[0] Barrack Family Cash - -1.44 -1.44
data.attributes.total.children.[0].children.[0].children.[1] Barrack Family Gratia Holdings No. 2 LLC 8413655 55491732.66 -0.971018847 -0.971018847 11.52490309 55491732.66
data.attributes.total.children.[0].children.[0].children.[1].children.[0] Barrack Family Investment Grade Fixed Income - 18469768.6 18469768.6
data.attributes.total.children.[0].children.[0].children.[1].children.[1] Barrack Family High Yield Fixed Income - 3668982.44 -0.205356545 -0.205356545 4.441190127 3668982.44

The following code is supposed to select rows where the Entity/Grouping column is not 'Cash' and where at least one of the columns Adjusted TWR (Current Quarter, No Div, USD), Adjusted TWR (YTD, No Div, USD), or Annualized Adjusted TWR (Since Inception, No Div, USD) is blank.

Code: the code below is meant to do this:

def twr_exceptions_logic():
    perf_asset_class_df = databases_creation()

    m1 = perf_asset_class_df.loc[(perf_asset_class_df['Entity/Grouping']!= 'Cash')]
    m2 = perf_asset_class_df[['Adjusted TWR (Current Quarter, No Div, USD)',
                              'Adjusted TWR (YTD, No Div, USD)',
                              'Annualized Adjusted TWR (Since Inception, No Div, USD)']].eq('').any(1)
    perf_asset_class_df.loc[m1&m2]
    
    return perf_asset_class_df

Error: I'm still fairly new to Python, and I'm not sure why this AttributeError is being thrown:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
C:\Users\WILLIA~1.FOR\AppData\Local\Temp/ipykernel_18756/2689024934.py in <module>
     48     writer.save()
     49 
---> 50 xlsx_writer()

C:\Users\WILLIA~1.FOR\AppData\Local\Temp/ipykernel_18756/2689024934.py in xlsx_writer()
      1 # Function that writes Exceptions Report and API Response as a consolidated .xlsx file.
      2 def xlsx_writer():
----> 3     reporting_group_df, unknown_df, perf_asset_class_df, perf_entity_df, perf_entity_group_df = twr_exceptions_logic()
      4 
      5 #   Creating and defining filename for exceptions report

C:\Users\WILLIA~1.FOR\AppData\Local\Temp/ipykernel_18756/2834095962.py in twr_exceptions_logic()
      2     perf_asset_class_df = databases_creation()
      3 
----> 4     m1 = perf_asset_class_df.loc[(perf_asset_class_df['Entity/Grouping']!= 'Cash')]
      5     m2 = perf_asset_class_df[['Adjusted TWR (Current Quarter, No Div, USD)',
      6                               'Adjusted TWR (YTD, No Div, USD)',

AttributeError: 'tuple' object has no attribute 'loc'

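Help: I did some research on this AttributeError and found conflicting information about how it relates to my particular problem. It looks as though perf_asset_class_df is being returned as a tuple from the databases_creation() function. However, it is definitely a pandas DataFrame: the only thing databases_creation() does is take a DataFrame named df and filter it to create a pandas DataFrame named perf_asset_class_df. Or am I missing something?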

perf_asset_class_df = df[df['json_path'].str.contains(r'(?:\.children\.\[\d+\]){4}')]

The databases_creation() function:

def databases_creation():
    df = data_cleansing()

    unknown_df = df[df['Entity/Grouping'].str.contains('Unknown')==True]

    perf_asset_class_df = df[df['json_path'].str.contains(r'(?:\.children\.\[\d+\]){4}')]
    perf_asset_class_df = pd.DataFrame(perf_asset_class_df)
    
    perf_entity_df = df[df['json_path'].str.count(r'\.children').eq(3)]
    perf_entity_group_df = df[df['json_path'].str.count(r'\.children').eq(2)]

    return reporting_group_df, unknown_df, perf_asset_class_df, perf_entity_df, perf_entity_group_df

Does anyone have any suggestions?

return reporting_group_df, unknown_df, perf_asset_class_df, perf_entity_df, perf_entity_group_df

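This line returns a tuple of DataFrames. When you call the function, you need to unpack that tuple to get at the DataFrame you are interested in. As written, your code calls databases_creation() and stores the entire tuple in perf_asset_class_df, which is why .loc fails. If you only want that one DataFrame, unpack it: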

_, _, perf_asset_class_df, _, _ = databases_creation()

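This unpacks the tuple, binding each element to the corresponding variable. We use _ for the parts we don't care about, but it could be any other variable name.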
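To see the difference in isolation, here is a minimal sketch. The function databases_creation_sketch and its toy data are hypothetical stand-ins, not the code from the question; they only imitate a function that returns several DataFrames separated by commas.

```python
import pandas as pd


def databases_creation_sketch():
    """Hypothetical stand-in for databases_creation(); returns two DataFrames."""
    df = pd.DataFrame({
        "json_path": ["a.children.[0]", "a.children.[0].children.[1]"],
        "Entity/Grouping": ["Cash", "High Yield Fixed Income"],
    })
    shallow_df = df[df["json_path"].str.count(r"\.children").eq(1)]
    deep_df = df[df["json_path"].str.count(r"\.children").eq(2)]
    # A comma-separated return packs the values into a single tuple.
    return shallow_df, deep_df


# Without unpacking, the name is bound to the whole tuple,
# so calling .loc on it would raise the AttributeError from the question.
result = databases_creation_sketch()
print(type(result).__name__)

# Unpacking binds each element to its own name; `_` marks the one we ignore.
_, deep_df = databases_creation_sketch()
print(type(deep_df).__name__)

# deep_df is now a real DataFrame, so .loc works as expected.
filtered = deep_df.loc[deep_df["Entity/Grouping"] != "Cash"]
```

Indexing the tuple by position (result[1] here) should work just as well; unpacking mainly makes it clearer which returned value is which.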