如何摆脱 MonthEnds 类型
How to get rid of MonthEnds type
我正在尝试获取 Pandas DataFrame 中开始日期和结束日期之间的月差值。结果并不完全令人满意...
首先,结果是 <[value] * MonthEnds> 形式的某种日期时间类型。我不能用这个来计算。第一个问题是如何将其转换为整数。我尝试了 .n 属性,但随后出现以下错误:
AttributeError: 'Series' object has no attribute 'n'
第二,结果是'missing'一个月。这可以通过使用另一个 solution/method 来避免吗?或者我应该在答案上加上 1 个月?
为了支持我的问题,我创建了一些简化的代码:
dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)
df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
df['Duration'] = (df['End'].dt.to_period('M') - df['Start'].dt.to_period('M'))
df
这导致:
Start End Duration
0 2020-01-01 2020-10-31 <9 * MonthEnds>
1 2020-02-01 2020-11-30 <9 * MonthEnds>
首选结果是:
Start End Duration
0 2020-01-01 2020-10-31 10
1 2020-02-01 2020-11-30 10
从 end-date 中减去 start-date 并将时间增量转换为月数。
import pandas as pd
dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)
df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
df['Duration'] = (df['End']-df['Start']).astype('<m8[M]').astype(int)+1
print(df)
输出:
Start End Duration
0 2020-01-01 2020-10-31 10
1 2020-02-01 2020-11-30 10
试试这个
dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)
df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
df['Duration'] = (df['End'] - df['Start']).apply(lambda x:x.days//30)
print(df)
我正在尝试获取 Pandas DataFrame 中开始日期和结束日期之间的月差值。结果并不完全令人满意...
首先,结果是 <[value] * MonthEnds> 形式的某种日期时间类型。我不能用这个来计算。第一个问题是如何将其转换为整数。我尝试了 .n 属性,但随后出现以下错误:
AttributeError: 'Series' object has no attribute 'n'
第二,结果是'missing'一个月。这可以通过使用另一个 solution/method 来避免吗?或者我应该在答案上加上 1 个月?
为了支持我的问题,我创建了一些简化的代码:
dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)
df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
df['Duration'] = (df['End'].dt.to_period('M') - df['Start'].dt.to_period('M'))
df
这导致:
Start End Duration
0 2020-01-01 2020-10-31 <9 * MonthEnds>
1 2020-02-01 2020-11-30 <9 * MonthEnds>
首选结果是:
Start End Duration
0 2020-01-01 2020-10-31 10
1 2020-02-01 2020-11-30 10
从 end-date 中减去 start-date 并将时间增量转换为月数。
import pandas as pd
dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)
df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
df['Duration'] = (df['End']-df['Start']).astype('<m8[M]').astype(int)+1
print(df)
输出:
Start End Duration
0 2020-01-01 2020-10-31 10
1 2020-02-01 2020-11-30 10
试试这个
dates = [{'Start':'1-1-2020', 'End':'31-10-2020'}, {'Start':'1-2-2020', 'End':'30-11-2020'}]
df = pd.DataFrame(dates)
df['Start'] = pd.to_datetime(df['Start'], dayfirst=True)
df['End'] = pd.to_datetime(df['End'], dayfirst=True)
df['Duration'] = (df['End'] - df['Start']).apply(lambda x:x.days//30)
print(df)