Forecasting based on historical figures
I want to forecast allocations based on historical data.
Manual input supplied by the user:
year month x y z k
2018 JAN 9,267,581 627,129 254,110 14,980
2018 FEB 7,771,691 738,041 217,027 17,363
Historical data output:
year month segment pg is_p x y z k
2018 JAN A p Y 600 600 600 600
2018 JAN A p N 200 200 200 200
2018 JAN B r Y 400 400 400 400
2018 JAN A r Y 400 400 400 400
2018 JAN A r N 400 400 400 400
2018 JAN B r N 300 300 300 300
2018 JAN C s Y 200 200 200 200
2018 JAN C s N 10 10 10 10
2018 JAN C t Y 11 11 11 11
2018 JAN C t N 12 12 12 12
2018 FEB A p Y 789 789 789 789
2018 FEB A p N 2093874 2093874 2093874 2093874
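I have tried to work out the is_p allocation from the totals by adding columns that hold the allocation percentages:
1. %ofx_segment = (600+200+400+400) / (600+200+400+400+400+300+200+10+11+12) = 1600/2533 ≈ 0.6317 for segment A in January 2018. This tells me how much of x the segment contributed. The same applies to y, z and k.
2. I multiply the manual input 9,267,581 by %ofx_segment to get the value of segment_x.
3. Then I calculate %_pg. For pg 'p' in segment A in January 2018, %_pg = (600+200) / (600+200+400+400).
4. Then I multiply the value from step 2 by the %_pg from step 3 for 'p' in segment A.
5. Finally, I calculate the is_p percentage, that is %Y or %N. For pg 'p' in segment A, %Y = 600 / (600+200).
6. The percentage from step 5 must be multiplied by the output of step 4.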
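The six steps above can be sketched with `groupby(...).transform("sum")`, which broadcasts each group total back onto the rows so the three shares can be computed side by side. This is only a sketch under assumptions: it uses the January 2018 rows and the x column from the question, the column names `pct_segment`, `pct_pg`, `pct_is_p` and `forecast_x` are made up for illustration, and it assumes each (segment, pg, is_p) combination appears once per month.

```python
import pandas as pd

# Historical rows for 2018 JAN, copied from the question (x column only).
hist = pd.DataFrame({
    "year": [2018] * 10,
    "month": ["JAN"] * 10,
    "segment": ["A", "A", "B", "A", "A", "B", "C", "C", "C", "C"],
    "pg": ["p", "p", "r", "r", "r", "r", "s", "s", "t", "t"],
    "is_p": ["Y", "N", "Y", "Y", "N", "N", "Y", "N", "Y", "N"],
    "x": [600, 200, 400, 400, 400, 300, 200, 10, 11, 12],
})
manual_x = 9_267_581  # manual input for 2018 JAN, column x

period = ["year", "month"]
# Totals at each level of the hierarchy, broadcast back onto every row.
total = hist.groupby(period)["x"].transform("sum")
seg_total = hist.groupby(period + ["segment"])["x"].transform("sum")
pg_total = hist.groupby(period + ["segment", "pg"])["x"].transform("sum")

hist["pct_segment"] = seg_total / total   # step 1: segment share of the month
hist["pct_pg"] = pg_total / seg_total     # step 3: pg share within the segment
hist["pct_is_p"] = hist["x"] / pg_total   # step 5: Y/N share within the pg

# Steps 2, 4 and 6: chain the three shares onto the manual input.
hist["forecast_x"] = (
    manual_x * hist["pct_segment"] * hist["pct_pg"] * hist["pct_is_p"]
)
print(hist[["segment", "pg", "is_p", "forecast_x"]])
```

Because the three shares multiply out to each row's share of the monthly total, the forecast values for a month add back up to the manual input, which makes a convenient sanity check.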
import numpy as np
import pandas as pd

first = pd.read_csv('/Users/arork/Downloads/first.csv')
second = pd.read_csv('/Users/arork/Downloads/second.csv')
interested_columns = ['x', 'y', 'z', 'k']
# f, g and h are helper functions that add the percentage columns; their definitions are not shown here
primeallocation = first.groupby(['year', 'month', 'pg', 'segment'])[['is_p'] + interested_columns].apply(f)
segmentallocation = first.groupby(['year', 'month'])[['segment'] + interested_columns].apply(g)
pgallocation = first.groupby(['year', 'month', 'segment'])[['pg'] + interested_columns].apply(h)
segmentallocation['%of allocation_segment x']
np.array(second)
# attempt to multiply the segment percentages by the manual x input
func = lambda x: x * np.asarray(second['x'])
segmentallocation['%of allocation_segment x'].apply(func)
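One likely reason the lambda attempt does not give a row-aligned result: `Series.apply` calls the function once per element, so every percentage is multiplied by the whole manual-input array, regardless of which year and month the row belongs to. The small sketch below uses made-up stand-ins (`shares`, `manual`) with values taken from the question, and assumes the manual inputs are already numeric.

```python
import numpy as np
import pandas as pd

# Stand-ins for the question's data: three segment shares and two manual x inputs.
shares = pd.Series([0.631662, 0.276352, 0.091986])
manual = pd.Series([9267581, 7771691])

# Each share is multiplied by BOTH manual inputs, not by the one for its own month.
out = shares.apply(lambda s: s * np.asarray(manual))

print(out.dtype)    # object: every element holds an array
print(out.iloc[0])  # two values, one per manual-input row
```

Nothing in this computation ties a share to its own (year, month), which is the information a key-based join supplies.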
You need to join the two DataFrames in order to multiply the two columns.
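# left-join the manual inputs onto the segment rows; overlapping manual columns get the '_second' suffix
merged_df = segmentallocation.merge(second, on=['year', 'month'], how='left', suffixes=['', '_second'])
for c in interested_columns:
    # merged_df[c] is the historical segment total; the manual input is in merged_df[c + '_second']
    merged_df['allocation' + str(c)] = merged_df['%of allocation' + str(c)] * merged_df[c]
merged_df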
year month segment x y z k %of allocationx %of allocationy %of allocationz %of allocationk x_second y_second z_second k_second allocationx allocationy allocationz allocationk
0 2018 FEB A 2094663 2094663 2094663 2094663 1.000000 1.000000 1.000000 1.000000 7,771,691 738,041 217,027 17,363 2.094663e+06 2.094663e+06 2.094663e+06 2.094663e+06
1 2018 JAN A 1600 1600 1600 1600 0.631662 0.631662 0.631662 0.631662 9,267,581 627,129 254,110 14,980 1.010659e+03 1.010659e+03 1.010659e+03 1.010659e+03
2 2018 JAN B 700 700 700 700 0.276352 0.276352 0.276352 0.276352 9,267,581 627,129 254,110 14,980 1.934465e+02 1.934465e+02 1.934465e+02 1.934465e+02
3 2018 JAN C 233 233 233 233 0.091986 0.091986 0.091986 0.091986 9,267,581 627,129 254,110 14,980 2.143269e+01 2.143269e+01 2.143269e+01 2.143269e+01
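In the output above, the `allocation` columns equal the percentage times the historical segment total (for example 0.631662 × 1600 ≈ 1010.659), and the `_second` columns still show comma-formatted values. If the goal is to apply the percentages to the manual inputs, one possible variant is sketched below. It assumes the manual values were loaded as strings such as "9,267,581"; the names `segment_totals`, `pct_x`, `x_manual` and `allocation_x` are illustrative, and only the x column is shown.

```python
import pandas as pd

# Segment totals per month, taken from the merged output above (x only).
segment_totals = pd.DataFrame({
    "year": [2018, 2018, 2018, 2018],
    "month": ["FEB", "JAN", "JAN", "JAN"],
    "segment": ["A", "A", "B", "C"],
    "x": [2094663, 1600, 700, 233],
})
# Manual input, assumed to have been read as comma-formatted strings.
second = pd.DataFrame({
    "year": [2018, 2018],
    "month": ["JAN", "FEB"],
    "x": ["9,267,581", "7,771,691"],
})

cols = ["x"]
for c in cols:
    # Strip the thousands separators so the column becomes numeric.
    second[c] = second[c].str.replace(",", "", regex=False).astype("int64")
    # Segment share of the month's historical total.
    month_total = segment_totals.groupby(["year", "month"])[c].transform("sum")
    segment_totals["pct_" + c] = segment_totals[c] / month_total

# Join on the period keys so each share meets its own month's manual input.
merged = segment_totals.merge(
    second, on=["year", "month"], how="left", suffixes=("", "_manual")
)
for c in cols:
    merged["allocation_" + c] = merged["pct_" + c] * merged[c + "_manual"]
print(merged)
```

When reading straight from a CSV file, passing `thousands=","` to `pd.read_csv` is another way to get numeric columns without the string replacement.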