Bin the dates into months
I have a table:
Score  Customer ID  my_dates    Threshold  Model_name  is_alert
50     8            2017-08-05  50         Mod1        yes
50     9            2017-12-05  50         Mod1        yes
50     28           2017-05-22  50         Mod2        yes
50     28           2017-05-26  50         Mod2        yes
50     36           2017-06-20  50         Mod2        yes
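If the score is equal to or above the threshold, is_alert shows 'yes'.
I want to bin the dates into the format below and print the number of alerts in each bin for each model. If a customer is alerted more than once within 7 days, only the first hit should count toward the total: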
Model_name Jan-17 Feb-17 Mar-17 Apr-17 May-17 Jun-17
Mod1
Mod2
Can anyone help me? Thanks.
Use crosstab with the datetimes converted to month periods by Series.dt.to_period, then convert the column labels to month names with PeriodIndex.strftime. Before that, compute the difference between consecutive dates within each group with DataFrameGroupBy.diff, and keep only the rows whose difference is missing (the first row of each group) or greater than or equal to 7 days, using Series.ge and boolean indexing:
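import pandas as pd

# Parse the date strings and derive a monthly period for each row
df['my_dates'] = pd.to_datetime(df['my_dates'])
m = df['my_dates'].dt.to_period('M')
# Days since the previous alert within each model
df['diff'] = df.groupby(['Model_name'])['my_dates'].diff().dt.days
print(df)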
   Score  Customer ID   my_dates  Threshold Model_name is_alert   diff
0     50            8 2017-08-05         50       Mod1      yes    NaN
1     50            9 2017-12-05         50       Mod1      yes  122.0
2     50           28 2017-05-22         50       Mod2      yes    NaN
3     50           28 2017-05-26         50       Mod2      yes    4.0
4     50           36 2017-06-20         50       Mod2      yes   25.0
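The question asks for the 7-day rule per customer, whereas the diff above is taken per Model_name only. A minimal sketch of a per-customer variant follows, assuming alerts should be compared within each (Model_name, Customer ID) pair. The frame is rebuilt from the question's table, and the names sample, gap and keep are illustrative rather than part of the original answer:

```python
import pandas as pd

# Sample data rebuilt from the table in the question
sample = pd.DataFrame({
    'Score': [50, 50, 50, 50, 50],
    'Customer ID': [8, 9, 28, 28, 36],
    'my_dates': pd.to_datetime(['2017-08-05', '2017-12-05', '2017-05-22',
                                '2017-05-26', '2017-06-20']),
    'Threshold': [50, 50, 50, 50, 50],
    'Model_name': ['Mod1', 'Mod1', 'Mod2', 'Mod2', 'Mod2'],
    'is_alert': ['yes'] * 5,
})

# Sort so that diff() compares each alert with the same customer's previous one
sample = sample.sort_values(['Model_name', 'Customer ID', 'my_dates'])

# Days since the previous alert for the same model and customer (assumption)
gap = sample.groupby(['Model_name', 'Customer ID'])['my_dates'].diff().dt.days

# Keep the first alert per customer (NaN gap) and alerts at least 7 days apart
keep = sample[gap.isna() | gap.ge(7)]
print(keep[['Customer ID', 'my_dates', 'Model_name']])
```

On the sample data this keeps the same four rows as the per-model diff, because only customer 28 has two alerts within 7 days of each other.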
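# Keep the first alert in each group (NaN) and alerts at least 7 days after the previous one
df = df[df['diff'].ge(7) | df['diff'].isna()]
# Count alerts per model and month; crosstab uses only the index labels that m shares with the filtered df
df1 = pd.crosstab(df['Model_name'], m)
# Relabel the period columns as month names such as 'May-17'
df1.columns = df1.columns.strftime('%b-%y')
print(df1)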
my_dates    May-17  Jun-17  Aug-17  Dec-17
Model_name
Mod1             0       0       1       1
Mod2             1       1       0       0
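The crosstab only contains months with at least one alert, while the layout requested in the question lists every month. One possible way to add the empty months is sketched here, under the assumption that the report should span January to December 2017. That window and the names alerts, counts and all_months are illustrative and not from the original answer:

```python
import pandas as pd

# The four alerts that remain after the 7-day rule above
alerts = pd.DataFrame({
    'Model_name': ['Mod1', 'Mod1', 'Mod2', 'Mod2'],
    'my_dates': pd.to_datetime(['2017-08-05', '2017-12-05',
                                '2017-05-22', '2017-06-20']),
})

# Count alerts per model and monthly period
counts = pd.crosstab(alerts['Model_name'], alerts['my_dates'].dt.to_period('M'))

# Assumed reporting window: every month of 2017
all_months = pd.period_range('2017-01', '2017-12', freq='M')

# Add the missing months as zero-filled columns, then label them like 'Jan-17'
counts = counts.reindex(columns=all_months, fill_value=0)
counts.columns = counts.columns.strftime('%b-%y')
print(counts)
```

Reindexing before calling strftime keeps the columns in calendar order, which sorting the formatted strings would not.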