Python: writing to a DataFrame within a function
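I'm sure this must be straightforward, but I've been struggling with it for a few days. I think I know what I'm doing wrong, but I can't work out how to fix it:
I'm using a function to subset a DataFrame and, based on that subset, create and populate a new column in it. This works, but I can't see how to get the result back into df unless I append it to a new object called mod_df.
It's as though the data is lost once the function ends.
Any ideas would be much appreciated.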
mod_df = []

def Pop_Gen(lower, upper, val):
    x = df[(df['byear'] >= lower) & (df['byear'] <= upper)].assign(Gen=val)
    mod_df.append(x)

for index, row in gen_Ref_df.iterrows():
    Pop_Gen(row.lower, row.upper, row.val)
Input
First DataFrame:
df:
Name byear
0 John 1980
1 Mary 1990
Second DataFrame:
gen_Ref_df:
val lower upper
0 old 1970 1985
1 new 1986 1995
Current output
mod_df:
Name byear Gen
0 John 1980 old
1 Mary 1990 new
Expected output (in df, without having to go through mod_df)
df:
Name byear Gen
0 John 1980 old
1 Mary 1990 new
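A minimal sketch of why the data seems to disappear, using the sample data above: boolean indexing followed by `.assign` returns a new DataFrame, so `df` itself is never changed unless the result is written back to it. The name `subset` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Mary'], 'byear': [1980, 1990]})

# Boolean indexing plus .assign builds a separate DataFrame;
# nothing is written into df here.
subset = df[(df['byear'] >= 1970) & (df['byear'] <= 1985)].assign(Gen='old')

print('Gen' in subset.columns)  # the new object carries the column
print('Gen' in df.columns)      # the original df does not
```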
Assuming df and gen_Ref_df have the same number of rows, I would do the following:
import pandas as pd

# These should be your input DataFrames
d = {'name': ['John', 'Mary'], 'byear': [1980, 1990]}
df = pd.DataFrame(data=d)
d = {'val': ['old', 'new'], 'lower': [1970, 1986], 'upper': [1985, 1995]}
gen_Ref_df = pd.DataFrame(data=d)

# Replace the for loop and the function call with a single line.
# Create a new column 'Gen' in df and populate each row with the val whose
# range contains byear. This compares row i of df with row i of gen_Ref_df,
# so it relies on both frames sharing the same index.
df['Gen'] = gen_Ref_df[(df['byear'] >= gen_Ref_df.lower) & (df['byear'] <= gen_Ref_df.upper)].assign(Gen=gen_Ref_df.val).Gen
print(df)
Result:
name byear Gen
0 John 1980 old
1 Mary 1990 new
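If the two frames do not have the same number of rows, a variant that stays closer to the original loop may be easier to adapt. This is only a sketch under a few assumptions: the ranges in gen_Ref_df do not overlap, the extra row for 'Anna' is made-up sample data, and the helper `pop_gen` with its `frame` parameter is a hypothetical rewrite of `Pop_Gen`. It writes into the frame through `.loc` with a boolean mask, so the change lands in df itself rather than in a copy.

```python
import pandas as pd

# Sample data; the third row is invented to show unequal row counts.
df = pd.DataFrame({'Name': ['John', 'Mary', 'Anna'],
                   'byear': [1980, 1990, 1975]})
gen_Ref_df = pd.DataFrame({'val': ['old', 'new'],
                           'lower': [1970, 1986],
                           'upper': [1985, 1995]})

def pop_gen(frame, lower, upper, val):
    """Label rows of frame whose byear lies in [lower, upper], in place."""
    mask = frame['byear'].between(lower, upper)  # bounds are inclusive
    frame.loc[mask, 'Gen'] = val                 # writes into frame itself

# One pass per reference range; each pass labels the matching rows of df.
for row in gen_Ref_df.itertuples(index=False):
    pop_gen(df, row.lower, row.upper, row.val)

print(df)
```

Rows whose byear falls outside every range are left as missing values in the Gen column.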