Time Delta of Year Excluding Certain Days
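I am making a heat map with company names on the x axis, months on the y axis, and shading that represents the number of calls.
I take a slice of the last year of data from a database to build the heat map. However, this means that if you hover over the current month, say today is July 13, you get the calls from July 1-13 of this year plus the calls from July 13-31 of last year. For the current month, I only want to show the calls from July 1-13.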
import datetime
import pandas as pd
# This section selects the last year of data
# convert strings to datetimes
df['recvd_dttm'] = pd.to_datetime(df['recvd_dttm'])
# only keep data before now (ignore typos that are future dates)
mask = df['recvd_dttm'] <= datetime.datetime.now()
df = df.loc[mask]
# get first and last datetime for the last year of data
range_max = df['recvd_dttm'].max()
range_min = range_max - datetime.timedelta(days=365)
# take slice with the last year of data
df = df[(df['recvd_dttm'] >= range_min) &
        (df['recvd_dttm'] <= range_max)]
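You can use pd.tseries.offsets.MonthEnd() to achieve your goal: roll the current time forward to the end of the month, subtract one year, and use the result as the start of the window.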
import pandas as pd
import numpy as np
import datetime as dt
np.random.seed(0)
val = np.random.randn(600)
date_rng = pd.date_range('2014-01-01', periods=600, freq='D')
df = pd.DataFrame(dict(dates=date_rng,col=val))
print(df)
col dates
0 1.7641 2014-01-01
1 0.4002 2014-01-02
2 0.9787 2014-01-03
3 2.2409 2014-01-04
4 1.8676 2014-01-05
5 -0.9773 2014-01-06
6 0.9501 2014-01-07
7 -0.1514 2014-01-08
8 -0.1032 2014-01-09
9 0.4106 2014-01-10
.. ... ...
590 0.5433 2015-08-14
591 0.4390 2015-08-15
592 -0.2195 2015-08-16
593 -1.0840 2015-08-17
594 0.3518 2015-08-18
595 0.3792 2015-08-19
596 -0.4700 2015-08-20
597 -0.2167 2015-08-21
598 -0.9302 2015-08-22
599 -0.1786 2015-08-23
[600 rows x 2 columns]
print(df.dates.dtype)
datetime64[ns]
datetime_now = dt.datetime.now()
datetime_now_month_end = datetime_now + pd.tseries.offsets.MonthEnd(1)
print(datetime_now_month_end)
2015-07-31 03:19:18.292739
datetime_start = datetime_now_month_end - pd.tseries.offsets.DateOffset(years=1)
print(datetime_start)
2014-07-31 03:19:18.292739
print(df[(df.dates > datetime_start) & (df.dates < datetime_now)])
col dates
212 0.7863 2014-08-01
213 -0.4664 2014-08-02
214 -0.9444 2014-08-03
215 -0.4100 2014-08-04
216 -0.0170 2014-08-05
217 0.3792 2014-08-06
218 2.2593 2014-08-07
219 -0.0423 2014-08-08
220 -0.9559 2014-08-09
221 -0.3460 2014-08-10
.. ... ...
550 0.1639 2015-07-05
551 0.0963 2015-07-06
552 0.9425 2015-07-07
553 -0.2676 2015-07-08
554 -0.6780 2015-07-09
555 1.2978 2015-07-10
556 -2.3642 2015-07-11
557 0.0203 2015-07-12
558 -1.3479 2015-07-13
559 -0.7616 2015-07-14
[348 rows x 2 columns]
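A variant that may be worth considering, sketched under a few assumptions: instead of rolling forward to the month end, compute midnight on the first day of the current month and step back 11 calendar months. This avoids two edge cases in the approach above. The start time there carries the current time of day, so calls received later on the last day of last year's month can slip through, and as far as I can tell MonthEnd(1) moves to the next month's end when today is already the last day of the month. The helper name `trailing_year_window` and the sample rows are hypothetical; only the `recvd_dttm` column name comes from the question.

```python
import pandas as pd


def trailing_year_window(now):
    """Return (start, end): the current partial month plus the 11 full months before it."""
    end = pd.Timestamp(now)
    # Midnight on the first day of the current month.
    month_start = end.normalize().replace(day=1)
    # Step back 11 calendar months so the same month of last year is excluded.
    start = month_start - pd.DateOffset(months=11)
    return start, end


# Hypothetical sample rows standing in for the real call log.
calls = pd.DataFrame({
    "recvd_dttm": pd.to_datetime([
        "2014-07-20 09:00",  # last July: dropped
        "2014-07-31 18:30",  # last July, late in the day: dropped
        "2014-08-01 00:00",  # first instant inside the window: kept
        "2015-07-05 12:00",  # current month: kept
        "2015-07-20 08:00",  # future-dated typo: dropped
    ])
})

# A fixed "now" is used here so the example is reproducible.
start, end = trailing_year_window("2015-07-13 03:19:18")
window = calls[(calls["recvd_dttm"] >= start) & (calls["recvd_dttm"] <= end)]
print(window)
```

In real use you would pass `pd.Timestamp.now()` instead of the fixed string. Because the upper bound is `now`, future-dated typos are filtered out by the same mask.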