How to parse this time format?
The following example shows that dateutil.parser.parse cannot parse:
Tue, 27 May 2014 20:06:08 +0800 (GMT+08:00)
Which Python method can parse it, as well as:
Thu, 16 Dec 2010 12:14:05 +0000
Here is what I tried:
$ ./main.py
Traceback (most recent call last):
  File "./main.py", line 5, in <module>
    date = parser.parse('Tue, 27 May 2014 20:06:08 +0800 (GMT+08:00)')
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dateutil/parser.py", line 1008, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dateutil/parser.py", line 395, in parse
    raise ValueError("Unknown string format")
ValueError: Unknown string format
$ cat ./main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:
import dateutil.parser as parser
date = parser.parse('Tue, 27 May 2014 20:06:08 +0800 (GMT+08:00)')
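If the extra text is at the end of the string and its format is unknown, you can trim characters from the end until the string becomes parseable:
Code: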
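import dateutil.parser as parser

def parse_datetime_remove_useless_end(date_str):
    # Try the full string first, then progressively shorter prefixes.
    for i in range(len(date_str), 0, -1):
        try:
            return parser.parse(date_str[:i])
        except ValueError:
            pass
    # No prefix could be parsed.
    return None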
Test code:
import dateutil.parser as parser
print(parse_datetime_remove_useless_end('Tue, 27 May 2014 20:06:08 +0800 (GMT+08:00)'))
print(parse_datetime_remove_useless_end('Thu, 16 Dec 2010 12:14:05 +0000'))
Result:
2014-05-27 20:06:08+08:00
2010-12-16 12:14:05+00:00
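If the trailing text is always a parenthesized comment such as (GMT+08:00), a narrower variant of the trimming idea is to strip only that comment and parse the rest with a fixed format. The sketch below uses only the Python 3 standard library. The helper name parse_with_trailing_comment and the regular expression are illustrative assumptions, not part of dateutil, and the %a and %b directives assume an English (C) locale.

```python
import re
from datetime import datetime

# Hypothetical helper, not part of dateutil.
# Matches one trailing "(...)" comment plus surrounding whitespace.
_TRAILING_COMMENT = re.compile(r"\s*\([^()]*\)\s*$")

# Fixed layout of the two sample strings; %z accepts offsets like +0800.
_FORMAT = "%a, %d %b %Y %H:%M:%S %z"

def parse_with_trailing_comment(date_str):
    """Strip a trailing parenthesized comment, then parse with strptime."""
    cleaned = _TRAILING_COMMENT.sub("", date_str.strip())
    return datetime.strptime(cleaned, _FORMAT)

print(parse_with_trailing_comment('Tue, 27 May 2014 20:06:08 +0800 (GMT+08:00)'))
print(parse_with_trailing_comment('Thu, 16 Dec 2010 12:14:05 +0000'))
```

Unlike the character-by-character loop, this raises ValueError right away when the remaining text does not match the expected layout, so it only suits input whose format is known in advance.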
Alternatively, you can use "fuzzy" mode, which parses both variants:
In [7]: parser.parse('Tue, 27 May 2014 20:06:08 +0800 (GMT+08:00)', fuzzy=True)
Out[7]: datetime.datetime(2014, 5, 27, 20, 6, 8, tzinfo=tzoffset(None, 28800))
In [8]: parser.parse('Thu, 16 Dec 2010 12:14:05 +0000', fuzzy=True)
Out[8]: datetime.datetime(2010, 12, 16, 12, 14, 5, tzinfo=tzutc())
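Both sample strings follow the date layout used in email headers, so the standard library's email.utils module may be worth trying as well. This is a sketch of a different technique from the dateutil approaches above, and it assumes Python 3.3 or later, where parsedate_to_datetime is available. The variable names samples and parsed are only for illustration.

```python
from email.utils import parsedate_to_datetime

samples = [
    'Tue, 27 May 2014 20:06:08 +0800 (GMT+08:00)',
    'Thu, 16 Dec 2010 12:14:05 +0000',
]

# Only the date, time and numeric offset fields are used;
# the trailing "(GMT+08:00)" comment is ignored by the parser.
parsed = [parsedate_to_datetime(s) for s in samples]

for dt in parsed:
    print(dt.isoformat())
```

The results carry a datetime.timezone offset rather than dateutil's tzoffset or tzutc objects, which matters only if later code compares tzinfo types directly.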