How to get rid of MonthEnds type

I'm trying to get the difference in months between a start date and an end date in a Pandas DataFrame. The result isn't entirely satisfactory...

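First, the result is some kind of datetime type of the form <[value] * MonthEnds>, which I can't calculate with. So the first question is how to convert it to an integer. I tried the .n attribute, but then I get the following error: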

AttributeError: 'Series' object has no attribute 'n'  

Second, the result is 'missing' one month. Can this be avoided by using another solution/method? Or should I just add 1 month to the answer?

To support my question, I created some simplified code:

import pandas as pd

dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)

df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
df['Duration'] = (df['End'].dt.to_period('M') - df['Start'].dt.to_period('M'))
df

This results in:

    Start       End         Duration
0   2020-01-01  2020-10-31  <9 * MonthEnds>
1   2020-02-01  2020-11-30  <9 * MonthEnds>

The preferred result is:

    Start       End         Duration
0   2020-01-01  2020-10-31  10
1   2020-02-01  2020-11-30  10

Subtract the start date from the end date and convert the timedelta to months.

import pandas as pd

dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)
df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
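# Cast the timedelta to whole months, then add 1 so both the first and last month are counted.
# Note: this cast worked on older pandas; pandas 2.x may reject month-resolution timedelta dtypes.
df['Duration'] = (df['End']-df['Start']).astype('<m8[M]').astype(int)+1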
print(df)

Output:

       Start        End  Duration
0 2020-01-01 2020-10-31        10
1 2020-02-01 2020-11-30        10
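
For what it's worth, the `<9 * MonthEnds>` values from the question are offset objects, and each one carries an integer `n`. The AttributeError appears because `.n` was requested from the Series rather than from its elements. Below is a minimal sketch of reading it element-wise, assuming the Period subtraction returns offset objects as in the question's output. The helper name `months_inclusive` is made up for illustration, and the `+ 1` assumes the inclusive count the question asks for.

```python
import pandas as pd

def months_inclusive(start: pd.Series, end: pd.Series) -> pd.Series:
    """Count the calendar months touched by each [start, end] range."""
    # Subtracting monthly periods yields a Series of MonthEnd offsets
    offsets = end.dt.to_period('M') - start.dt.to_period('M')
    # .n exists on each offset, not on the Series, so read it per element
    return offsets.apply(lambda off: off.n).astype(int) + 1

df = pd.DataFrame({
    'Start': pd.to_datetime(['2020-01-01', '2020-02-01']),
    'End': pd.to_datetime(['2020-10-31', '2020-11-30']),
})
df['Duration'] = months_inclusive(df['Start'], df['End'])
print(df)
```

This sketch does not handle missing dates; a NaT in either column would need to be filtered or filled first.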

Try this:

import pandas as pd

dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)

df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
# Approximate the month count by treating every month as 30 days
df['Duration'] = (df['End'] - df['Start']).apply(lambda x:x.days//30)
print(df)
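
Keep in mind that dividing by 30 only approximates months: a range such as 1 Feb to 28 Feb spans 27 days and would give 0. If the goal is to count calendar months regardless of how many days they contain, one option is plain integer arithmetic on the year and month components. This is a sketch under that assumption, not the only way to do it; `month_span` is a hypothetical name, and the `+ 1` again assumes both the first and last month should be counted.

```python
import pandas as pd

def month_span(start: pd.Series, end: pd.Series) -> pd.Series:
    # Difference between the two calendar months, plus 1 to count both endpoints
    return (end.dt.year - start.dt.year) * 12 + (end.dt.month - start.dt.month) + 1

start = pd.to_datetime(pd.Series(['2020-01-01', '2020-02-01', '2019-12-15']))
end = pd.to_datetime(pd.Series(['2020-10-31', '2020-11-30', '2020-01-10']))
print(month_span(start, end).tolist())
```

The third row crosses a year boundary (December to January) to show that the year term is needed.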