Conditions based on date periods and groups
A B C D
0 2002-01-13 Dan 2002-01-15 10
1 2002-01-13 Dan 2002-01-25 24
2 2002-01-13 Vic 2002-01-17 14
3 2002-01-13 Vic 2002-01-03 18
4 2002-01-28 Mel 2002-02-08 37
5 2002-01-28 Mel 2002-02-06 29
6 2002-01-28 Mel 2002-02-10 20
7 2002-01-28 Rob 2002-02-12 30
8 2002-01-28 Rob 2002-02-01 47
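I want to create a new column df['E'] for each B group, subject to the following conditions:
- E is the D value of the row whose C date is closest to 10 days after the A date.
- If two C dates are equally close to A + 10 days (as in the 2002-01-28 Mel group), E is the mean of the D values of those rows.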
The output should be:
A B C D E
0 2002-01-13 Dan 2002-01-15 10 24
1 2002-01-13 Dan 2002-01-25 24 24
2 2002-01-13 Vic 2002-01-17 14 14
3 2002-01-13 Vic 2002-01-03 18 14
4 2002-01-28 Mel 2002-02-08 37 33
5 2002-01-28 Mel 2002-02-06 29 33
6 2002-01-28 Mel 2002-02-10 20 33
7 2002-01-28 Rob 2002-02-12 30 30
8 2002-01-28 Rob 2002-02-01 47 30
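OK, it looks like you need the following (A and C must already be datetime columns):
# absolute distance, in days, between each C date and A + 10 days
df['E'] = ((df.C - df.A).dt.days - 10).abs()
# per B group, keep the rows with the smallest distance and map the mean of their D values back to every row
df['E'] = df.B.map(df.loc[df.E == df.groupby('B').E.transform('min')].groupby('B').D.mean())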
df
Out[106]:
A B C D E
0 2002-01-13 Dan 2002-01-15 10 24
1 2002-01-13 Dan 2002-01-25 24 24
2 2002-01-13 Vic 2002-01-17 14 14
3 2002-01-13 Vic 2002-01-03 18 14
4 2002-01-28 Mel 2002-02-08 37 33
5 2002-01-28 Mel 2002-02-06 29 33
6 2002-01-28 Mel 2002-02-10 20 33
7 2002-01-28 Rob 2002-02-12 30 30
8 2002-01-28 Rob 2002-02-01 47 30
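For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt from the sample data in the question. The intermediate names `dist` and `closest` are my own, used only for illustration, and the sketch assumes A and C are parsed with `pd.to_datetime`.

```python
import pandas as pd

# Sample data copied from the question; A and C are parsed as datetimes.
df = pd.DataFrame({
    "A": pd.to_datetime(["2002-01-13"] * 4 + ["2002-01-28"] * 5),
    "B": ["Dan", "Dan", "Vic", "Vic", "Mel", "Mel", "Mel", "Rob", "Rob"],
    "C": pd.to_datetime([
        "2002-01-15", "2002-01-25", "2002-01-17", "2002-01-03",
        "2002-02-08", "2002-02-06", "2002-02-10", "2002-02-12", "2002-02-01",
    ]),
    "D": [10, 24, 14, 18, 37, 29, 20, 30, 47],
})

# Distance in days between each C date and the target date A + 10 days.
dist = ((df["C"] - df["A"]).dt.days - 10).abs()

# Keep only the rows that reach their group's minimum distance (ties included).
closest = df[dist == dist.groupby(df["B"]).transform("min")]

# Average D over those rows per group, then broadcast back to every row.
df["E"] = df["B"].map(closest.groupby("B")["D"].mean())

print(df)
```

Using a separate `dist` Series avoids overwriting E twice, which makes the two steps easier to inspect. Note that E comes back as a float column because it is built from a mean.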