Python writing to a DataFrame within a function

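I'm sure this must be straightforward, but I've been struggling with it for days. I think I know what I'm doing wrong, but I'm short on ideas:

I'm using a function to subset a DataFrame and, based on that subset, create a new column in the subset and populate it. This works, but unless I assign the result to a new object called mod_df, I can't see how to get it back into df.

It's as if the data is lost once the function ends.

Any ideas would be greatly appreciated.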

mod_df = []

def Pop_Gen(lower, upper, val):
    x = df[(df['byear'] >= lower) &  (df['byear'] <= upper)].assign(Gen = val)
    mod_df.append(x)

for index, row in gen_Ref_df.iterrows():
    Pop_Gen(row.lower,row.upper,row.val)

Input

First DataFrame:

df:

   Name  byear  
0  John  1980  
1  Mary  1990 

Second DataFrame:

gen_Ref_df:

   val   lower   upper  
0  old   1970    1985  
1  new   1986    1995

Current output

mod_df:

   Name  byear Gen  
0  John  1980  old  
1  Mary  1990  new

Expected output (in df, with no need for mod_df)

df:

   Name  byear Gen  
0  John  1980  old  
1  Mary  1990  new  
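
For what it's worth, the behaviour described above is consistent with how boolean indexing and `DataFrame.assign` work: each returns a new object and leaves `df` itself untouched, so nothing is written back unless the result is stored somewhere. A minimal sketch, using data that mirrors the example above (the variable name `subset` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Mary'], 'byear': [1980, 1990]})

# Boolean indexing builds a new DataFrame; assign() then returns another new
# object with the extra column. Neither step modifies df in place.
subset = df[(df['byear'] >= 1970) & (df['byear'] <= 1985)].assign(Gen='old')

print('Gen' in subset.columns)  # the new column exists on the returned copy
print('Gen' in df.columns)      # the original df has no such column
```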

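Assuming df and gen_Ref_df have the same number of rows (and that each row of gen_Ref_df lines up with the corresponding row of df), I would do the following: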

import pandas as pd

# These should be your input DataFrames

d = {'name': ['John', 'Mary'], 'byear': [1980, 1990]}
df = pd.DataFrame(data=d)

d = {'val': ['old', 'new'], 'lower': [1970, 1986], 'upper': [1985, 1995]}
gen_Ref_df = pd.DataFrame(data=d)


# Replace the for loop and the function call with a single line
# Create a new column 'Gen' in df and populate each row with the Gen val obtained by the two conditions 
df['Gen'] = gen_Ref_df[(df['byear'] >= gen_Ref_df.lower) & (df['byear'] <= gen_Ref_df.upper)].assign(Gen = gen_Ref_df.val).Gen

print(df)

Result:

   name  byear  Gen
0  John   1980  old
1  Mary   1990  new
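
If the two DataFrames do not have matching rows, one possible alternative is to keep the loop-and-function structure from the question but write into `df` through `.loc` with a boolean mask, which modifies the original object in place. This is only a sketch under assumptions: the function name `pop_gen`, its `frame` parameter, and the extra row `Anna` are hypothetical and added purely for illustration.

```python
import pandas as pd

# Three people but only two generation ranges, so the row counts differ
df = pd.DataFrame({'Name': ['John', 'Mary', 'Anna'], 'byear': [1980, 1990, 1975]})
gen_Ref_df = pd.DataFrame({'val': ['old', 'new'],
                           'lower': [1970, 1986],
                           'upper': [1985, 1995]})

def pop_gen(frame, lower, upper, val):
    """Set 'Gen' to val on the rows of frame whose byear is in [lower, upper]."""
    mask = frame['byear'].between(lower, upper)  # inclusive on both ends
    frame.loc[mask, 'Gen'] = val                 # writes into frame itself

# Apply every range from the reference table to df
for row in gen_Ref_df.itertuples(index=False):
    pop_gen(df, row.lower, row.upper, row.val)

print(df)
```

Rows whose `byear` falls outside every range would be left as missing values in `Gen`.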