How to apply a custom function to the same dataframe multiple times in Python?
I am trying to explode the columns of a pandas dataframe to create new columns.
import ast
import pandas as pd
from pandas import json_normalize
def explode(child_df, column_value):
    child_df = child_df.dropna(subset=[column_value])
    # If the cells hold stringified lists/dicts, parse them into Python objects first
    if isinstance(child_df[column_value].iloc[0], str):
        print('tried')
        child_df[column_value] = child_df[column_value].apply(ast.literal_eval)
    # Normalize each cell into its own frame, then join back onto the remaining columns
    expanded_child_df = (pd.concat({i: json_normalize(x) for i, x in child_df.pop(column_value).items()})
                         .reset_index(level=1, drop=True)
                         .join(child_df, how='right', lsuffix='_left', rsuffix='_right')
                         .reset_index(drop=True))
    expanded_child_df.columns = map(str.lower, expanded_child_df.columns)
    return expanded_child_df
Is there a way to apply the explode function to a dataframe multiple times? This is where I am trying to apply the explode function to the dataframe consolidated_df:
def clean():
    column_value = ['tracking_results','trackable_items','events']
    consolidated_df_cleaner = explode(consolidated_df,column_value.value)
    # Need to iterate over column_value and pass the value as the second argument into `explode` function on the same dataframe
    consolidated_df_cleaner.to_csv('/home/response4.csv',index=False)
I tried this, but it does not work:
pd_list = []
for param in column_value:
    pd_list.append(apply(explode(consolidated_df),param))
This is what I am doing now, and I need to avoid it:
consolidated_df_cleaner=explode(consolidated_df,'tracking_results')
consolidated_df_cleaner2=explode(consolidated_df_cleaner,'trackable_items')
consolidated_df_cleaner3= explode(consolidated_df_cleaner2,'events')
consolidated_df_cleaner3.to_csv('/home/response4.csv',index=False)
Expected output:
tracking_results trackable_items events
intransit abc 22
intransit xqy 23
Try chaining the calls with DataFrame.pipe:
(consolidated_df
    .pipe(explode,'tracking_results')
    .pipe(explode,'trackable_items')
    .pipe(explode,'events')
    .to_csv('/home/response4.csv',index=False)
)
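If the column names live in a list, as in the question, the same chaining can probably be driven by that list instead of being written out call by call. The sketch below uses functools.reduce, which feeds the result of each call into the next one. Both add_length_column and apply_for_each are hypothetical names made up for this illustration; add_length_column is only a small stand-in for explode so that the example runs without the original data.

```python
from functools import reduce

import pandas as pd


def add_length_column(df, column):
    # Hypothetical stand-in for `explode`: takes a frame and a column name
    # and returns a new frame, which is all the chaining relies on.
    out = df.copy()
    out[column + "_len"] = out[column].str.len()
    return out


def apply_for_each(df, func, columns):
    # Hypothetical helper: computes func(func(df, columns[0]), columns[1]) ...
    # so every call receives the frame returned by the previous call.
    return reduce(func, columns, df)


frame = pd.DataFrame({"a": ["x", "yy"], "b": ["zzz", "w"]})
result = apply_for_each(frame, add_length_column, ["a", "b"])
print(result)
```

With the function from the question, the equivalent call would be reduce(explode, ['tracking_results','trackable_items','events'], consolidated_df), assuming each step leaves the next column in the frame. A plain for loop that reassigns the frame on each pass does the same thing.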