For loop calculate values and append to initial record in pandas
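I am trying to compute ten years of projected grade 12 enrollment, starting from a single record.
I have a dataframe of several school districts with total grade 11 and grade 12 enrollment as of 2020-21. Here is a sample record: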
import pandas as pd

df = pd.DataFrame({"year": ['2020_21'],
                   "district_name": ["School District A"],
                   "grade11": [5000],
                   "grade12": [5200],
                   "grade11_change": [1.01],
                   "grade11_grade12_ratio": [0.9]})
df
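I want to generate 10 years of grade 11 and grade 12 estimates. Each year's grade11 value is the previous year's grade11 value multiplied by the projected change (grade11_change). Each year's grade12 value is the previous year's grade11 value multiplied by the projected ratio (grade11_grade12_ratio). So, for the sample record, the 2021-22 grade12 value would be 90% of the 2020-21 grade11 value.
I looked through other posts and tried to write a for loop to do the calculation. However, my loop overwrites the earlier years with the last year, and I get NaN for the grade11 and grade12 values.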
projection_years=['2021_22','2022_23','2023_24','2024_25','2025_26','2026_27','2027_28','2028_29','2029_30','2030_31']
change_11=df.iloc[0]['grade11_change']
ratio_11_12=df.iloc[0]['grade11_grade12_ratio']
district_data=[]
for school_year in projection_years:
    print(school_year)
    df['year']=school_year
    df.loc[:,'grade11']=df['grade11'].shift(1)*change_11
    df.loc[:,'grade12']=df['grade11'].shift(1)*ratio_11_12
    district_data.append(df)
    all_years_df=pd.concat(district_data)
    all_years_df_final=all_years_df[['year','district_name','grade11','grade12']]
    print ('Done with ' + school_year)
    print('')
print('all done')
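Both symptoms can be reproduced in isolation. This is a minimal sketch, assuming a one-row frame like the sample record (the names `demo` and `frames` are only for illustration). `shift(1)` moves every value down one row, so a single-row column has no predecessor and becomes NaN. Appending the same frame object on every pass also stores references to one frame rather than snapshots, so every list entry shows the last year assigned.

```python
import pandas as pd

# Illustrative one-row frame standing in for the sample record
demo = pd.DataFrame({"grade11": [5000]})

# shift(1) pushes values down one row; row 0 has nothing above it, so it is NaN
shifted = demo["grade11"].shift(1)
print(shifted.isna().all())  # True

# Appending the same object repeatedly stores references, not copies
frames = []
for label in ["2021_22", "2022_23"]:
    demo["year"] = label
    frames.append(demo)

# Every entry is the same frame, so the first one already shows the last label
print(frames[0]["year"].iloc[0])  # 2022_23
```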
Below is the result I want. The first record of the desired dataframe is the 2020-21 data, and the last is 2030_31.
result = pd.DataFrame({"year": ['2020_21','2021_22','2022_23','2023_24','2024_25','2025_26','2026_27','2027_28','2028_29','2029_30','2030_31'],
"district_name":["School District A","School District A","School District A","School District A","School District A","School District A","School District A","School District A","School District A","School District A","School District A"],
"grade11":[5000,5050,5100,5151,5203,5255,5307,5360,5414,5468,5523],
"grade12":[5200,4500,4545,4590,4636,4683,4730,4777,4825,4873,4922]})
result
Thanks for your help.
Would something like this work?
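projection_years=['2021_22','2022_23','2023_24','2024_25','2025_26','2026_27','2027_28','2028_29','2029_30','2030_31']
grade11_change = 1.01
grade11_grade12_ratio = 0.9
for year in projection_years:
    lr = df.iloc[-1]  # last row so far, i.e. the previous year
    row = {}
    row['year'] = year
    row['district_name'] = 'School District A'
    # round() reproduces the output shown; int() would truncate (5202 instead of 5203 for 2024_25)
    row['grade11'] = round(lr['grade11'] * grade11_change)
    row['grade12'] = round(lr['grade11'] * grade11_grade12_ratio)
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    df = pd.concat([df, pd.DataFrame([row])])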
>>> df[['year','district_name','grade11','grade12']]
      year      district_name  grade11  grade12
0  2020_21  School District A     5000     5200
0  2021_22  School District A     5050     4500
0  2022_23  School District A     5100     4545
0  2023_24  School District A     5151     4590
0  2024_25  School District A     5203     4636
0  2025_26  School District A     5255     4683
0  2026_27  School District A     5308     4730
0  2027_28  School District A     5361     4777
0  2028_29  School District A     5415     4825
0  2029_30  School District A     5469     4874
0  2030_31  School District A     5524     4922
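Since the question mentions several districts, the same idea can be wrapped in a function and applied per starting record. This is only a sketch under assumptions: `project_district` is a hypothetical helper name, "School District B" and its numbers are made up for illustration, and each district is assumed to carry its own `grade11_change` and `grade11_grade12_ratio`. Rows are collected in a plain list and turned into a dataframe once, which avoids growing the dataframe inside the loop.

```python
import pandas as pd

def project_district(start, years):
    """Project grade11/grade12 for one district from its starting record (a dict)."""
    rows = [{"year": start["year"],
             "district_name": start["district_name"],
             "grade11": start["grade11"],
             "grade12": start["grade12"]}]
    for year in years:
        prev11 = rows[-1]["grade11"]  # previous year's grade 11 enrollment
        rows.append({
            "year": year,
            "district_name": start["district_name"],
            "grade11": round(prev11 * start["grade11_change"]),
            "grade12": round(prev11 * start["grade11_grade12_ratio"]),
        })
    return pd.DataFrame(rows)

# District B and its values are invented purely for illustration
starts = pd.DataFrame({
    "year": ["2020_21", "2020_21"],
    "district_name": ["School District A", "School District B"],
    "grade11": [5000, 3000],
    "grade12": [5200, 2900],
    "grade11_change": [1.01, 0.99],
    "grade11_grade12_ratio": [0.9, 0.95],
})
projection_years = ["2021_22", "2022_23", "2023_24"]

# One projected frame per starting record, stacked with a fresh 0..n-1 index
all_years = pd.concat(
    [project_district(rec, projection_years) for rec in starts.to_dict("records")],
    ignore_index=True,
)
print(all_years)
```

Passing `ignore_index=True` gives the combined frame a clean running index instead of the repeated 0 seen above.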