Split overall volume in a date to ratios of hours python pandas
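I'm working on a requirement that involves two dataframes: one with volumes per date and another with hourly split percentages, as shown in the images below. I'm trying to build a third dataframe that combines the two, distributing each date's volume across 24 hours according to the split percentages in the second dataframe. Could you help me achieve this? I managed to do it in Talend, but I'd like to try the same thing in Python. Thanks in advance.
Below is the code snippet I used to compute the share percentage: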
weekday_split_group['%_share'] = weekday_split_group['Calls Presented']/weekday_split_group['Total']
weekday_split_group['%_share'].sum()
weekday_split_group[['%_share']]
Date volume (image)
hour_split_% (24-hour cycle) (image)
Final_output_expected (image)
Thanks
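Your data is in images, so I synthesized the %_share data.
Use merge() to perform a Cartesian product, which gives one row per date and hour, then a simple multiplication calculates the volume column.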
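import numpy as np
import pandas as pd

# Synthetic hourly shares: 24 random fractions that sum to 1
df_pct = pd.DataFrame({"hour":[i for i in range(24)], "%_share":np.random.dirichlet(np.ones(24),size=1)[0]})
# Daily totals to distribute across the hours
df_val = pd.DataFrame({"Date":pd.date_range('2021-01-11','2021-01-12'),
                       "yhat":[185.835,182.220]})
# Cross join on a constant key, then scale each daily total by the hourly share
(df_val.assign(foo=1).merge(df_pct.assign(foo=1), on="foo")
 .assign(volume=lambda dfa: dfa["yhat"]*dfa["%_share"])
 .drop(columns=["foo","yhat","%_share"])
)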
Output
Date hour volume
0 2021-01-11 0 8.75130181190106
1 2021-01-11 1 3.800304203310593
2 2021-01-11 2 4.435534207384316
3 2021-01-11 3 0.8042485649482456
4 2021-01-11 4 0.30780836690202823
5 2021-01-11 5 0.4034771303868087
6 2021-01-11 6 9.757185959437273
7 2021-01-11 7 10.419717656981055
8 2021-01-11 8 17.5343995272983
9 2021-01-11 9 4.697775947037123
10 2021-01-11 10 1.3684239898273962
11 2021-01-11 11 1.842340112885734
12 2021-01-11 12 0.9282440981226737
13 2021-01-11 13 15.003403435577233
14 2021-01-11 14 0.8639868910813613
15 2021-01-11 15 11.385655349816991
16 2021-01-11 16 2.4637722378464177
17 2021-01-11 17 5.96733782226381
18 2021-01-11 18 4.352473102105978
19 2021-01-11 19 3.7736758659074923
20 2021-01-11 20 51.87723769628018
21 2021-01-11 21 1.3543189874277177
22 2021-01-11 22 17.20074788880526
23 2021-01-11 23 6.5416291464649685
24 2021-01-12 0 8.581065010168219
25 2021-01-12 1 3.7263778724527468
26 2021-01-12 2 4.349250912204751
27 2021-01-12 3 0.7886037264501806
28 2021-01-12 4 0.30182065066799896
29 2021-01-12 5 0.3956283945386191
30 2021-01-12 6 9.567381954576154
31 2021-01-12 7 10.217025595044463
32 2021-01-12 8 17.19330740637822
33 2021-01-12 9 4.606391331391312
34 2021-01-12 10 1.3418043932862385
35 2021-01-12 11 1.8065015490625471
36 2021-01-12 12 0.9101872067151698
37 2021-01-12 13 14.711546124416195
38 2021-01-12 14 0.8471799784370309
39 2021-01-12 15 11.164173152762677
40 2021-01-12 16 2.415845116261061
41 2021-01-12 17 5.851256749121055
42 2021-01-12 18 4.267805573039262
43 2021-01-12 19 3.700267529182679
44 2021-01-12 20 50.868083262120564
45 2021-01-12 21 1.327973771835654
46 2021-01-12 22 16.866146206570853
47 2021-01-12 23 6.414376533316364
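As a possible variation on the approach above, pandas 1.2 and later support `how="cross"` in `merge()`, which avoids the dummy `foo` key. The sketch below is only an illustration under assumptions: the shares are synthetic (seeded so the run is repeatable), and the names `rng`, `df_hourly`, and `daily_check` are hypothetical, not from the original post. It also checks that the hourly volumes add back up to each date's total.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs mirroring the synthetic data above; the fixed seed
# only makes the random shares reproducible.
rng = np.random.default_rng(0)
df_pct = pd.DataFrame({"hour": range(24), "%_share": rng.dirichlet(np.ones(24))})
df_val = pd.DataFrame({
    "Date": pd.date_range("2021-01-11", "2021-01-12"),
    "yhat": [185.835, 182.220],
})

# how="cross" pairs every date with every hour, so no constant join key is needed.
df_hourly = (
    df_val.merge(df_pct, how="cross")
    .assign(volume=lambda d: d["yhat"] * d["%_share"])
    [["Date", "hour", "volume"]]
)

# Sanity check: summing the hourly volumes per date should recover yhat.
daily_check = df_hourly.groupby("Date")["volume"].sum()
print(daily_check)
```

With real data, the same check is a quick way to confirm that the `%_share` column actually sums to 1 over the 24 hours; if it does not, the per-date sums will drift away from the daily totals.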