Converting string dates that may not be zero-padded

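I get my data, including some dates, from an unconventional source, so the string dates have a few quirks. The dates are mixed: in some the day is not zero-padded, the day can be followed by a white space (as in 2/9 /2018), and the month is not zero-padded either. When I try datetime.strptime I get the error "time data does not match format '%m %d %Y'". How can I convert a date column with these slight variations? Please see the code and the sample data below.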

d_o = datetime.datetime.strptime(df['start'][1], '%m %d %Y')

You should use a third-party library such as dateutil. It accepts a wide variety of date formats, at the cost of performance.

from dateutil import parser

lst = ['1/26/2018', '1/26/2018', '2/2/2018', '2/2/2018', '2/9 /2018', '2/9 /2018',
       '1/19/2018', '1/19/2018', '1/26/2018', '1/26/2018', '2/2/2018', '2/2/2018',
       '2/9 /2018']

res = [parser.parse(i) for i in lst]

Result:

[datetime.datetime(2018, 1, 26, 0, 0),
 datetime.datetime(2018, 1, 26, 0, 0),
 datetime.datetime(2018, 2, 2, 0, 0),
 datetime.datetime(2018, 2, 2, 0, 0),
 datetime.datetime(2018, 2, 9, 0, 0),
 datetime.datetime(2018, 2, 9, 0, 0),
 datetime.datetime(2018, 1, 19, 0, 0),
 datetime.datetime(2018, 1, 19, 0, 0),
 datetime.datetime(2018, 1, 26, 0, 0),
 datetime.datetime(2018, 1, 26, 0, 0),
 datetime.datetime(2018, 2, 2, 0, 0),
 datetime.datetime(2018, 2, 2, 0, 0),
 datetime.datetime(2018, 2, 9, 0, 0)]
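If adding a dependency is not an option, a standard-library route may also work for this data. The sketch below assumes every value follows a month/day/year layout and that stray whitespace is the only irregularity; parse_date is a hypothetical helper name, not something from the question. It relies on strptime accepting months and days without leading zeros, and on the format using slashes rather than the spaces in '%m %d %Y'.

```python
from datetime import datetime

def parse_date(text):
    """Parse an M/D/YYYY string after dropping any whitespace (assumed layout)."""
    # "2/9 /2018".split() gives ["2/9", "/2018"]; joining yields "2/9/2018"
    cleaned = "".join(text.split())
    # %m and %d also match values that are not zero-padded
    return datetime.strptime(cleaned, "%m/%d/%Y")

samples = ['1/26/2018', '2/2/2018', '2/9 /2018']
parsed = [parse_date(s) for s in samples]
```

For a DataFrame column like df['start'] from the question, the same function could presumably be applied element-wise, for example with df['start'].apply(parse_date).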

You can use re.split with str.zfill:

import re
dates = ['1/26/2018', '1/26/2018', '2/2/2018', '2/2/2018', '2/9 /2018', '2/9 /2018', '1/19/2018', '1/19/2018', '1/26/2018', '1/26/2018', '2/2/2018', '2/2/2018', '2/9 /2018']
new_dates = ['{}/{}/{}'.format(a.zfill(2), *b) for a, *b in map(lambda x: re.split(r'[/\s]+', x), dates)]

Output:

['01/26/2018', '01/26/2018', '02/2/2018', '02/2/2018', '02/9/2018', '02/9/2018', '01/19/2018', '01/19/2018', '01/26/2018', '01/26/2018', '02/2/2018', '02/2/2018', '02/9/2018']
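Note that the comprehension above pads only the month, which is why '02/2/2018' still appears in the output. If the day should be padded as well, a variation along these lines may help. It is a minimal sketch that assumes each string splits into exactly three parts (month, day, year); pad_date is a hypothetical helper name.

```python
import re

def pad_date(text):
    """Return MM/DD/YYYY with both month and day zero-padded (assumes M/D/YYYY input)."""
    # Split on any run of slashes and whitespace, so "2/9 /2018" -> ["2", "9", "2018"]
    month, day, year = re.split(r'[/\s]+', text.strip())
    return '{}/{}/{}'.format(month.zfill(2), day.zfill(2), year)

dates = ['1/26/2018', '2/2/2018', '2/9 /2018']
padded = [pad_date(d) for d in dates]
# padded == ['01/26/2018', '02/02/2018', '02/09/2018']
```

The unpacking into three names raises a ValueError on malformed rows, which can be a useful early signal when the source data is irregular.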