python - locale in dateutil / parser

I set the locale:

locale.setlocale(locale.LC_TIME, ('de', 'UTF-8'))

The string to parse is:

Montag, 11. April 2016 19:35:57

I use:

note_date = parser.parse(result.group(2))

But I get the following error:

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1531, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 938, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/adieball/Dropbox/Multiverse/Programming/python/repositories/kindle/kindle2en.py", line 250, in <module>
    main(sys.argv[1:])
  File "/Users/adieball/Dropbox/Multiverse/Programming/python/repositories/kindle/kindle2en.py", line 154, in main
    note_date = parser.parse(result.group(2))
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/dateutil/parser.py", line 1164, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/dateutil/parser.py", line 555, in parse
    raise ValueError("Unknown string format")
ValueError: Unknown string format

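Debugging shows that the parser is not using the "correct" dateutil values (German); it is still using the English ones.

I'm sure I'm missing something obvious here, but I can't find it.

Thanks.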

dateutil.parser does not use the locale. You need to subclass dateutil.parser.parserinfo and construct a German equivalent:

from dateutil import parser

class GermanParserInfo(parser.parserinfo):
    WEEKDAYS = [("Mo.", "Montag"),
                ("Di.", "Dienstag"),
                ("Mi.", "Mittwoch"),
                ("Do.", "Donnerstag"),
                ("Fr.", "Freitag"),
                ("Sa.", "Samstag"),
                ("So.", "Sonntag")]

s = 'Montag, 11. April 2016 19:35:57'
note_date = parser.parse(s, parserinfo=GermanParserInfo())

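You will need to extend this so that it also works for other values, such as month names. (The string above parses only because "April" is spelled the same in English and German.)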
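As a rough sketch of that extension, the class below adds a MONTHS list alongside WEEKDAYS. The class name ExtendedGermanParserInfo and the choice of abbreviations are my own assumptions, not part of dateutil. The abbreviations are written without a trailing dot on the assumption that the tokenizer passes the dot on as a separate token.

```python
from dateutil import parser


class ExtendedGermanParserInfo(parser.parserinfo):
    # Hypothetical name. Each tuple lists the accepted spellings for one
    # weekday, Monday first. Abbreviations carry no trailing dot (assumption:
    # the tokenizer splits "Mo." into "Mo" and ".").
    WEEKDAYS = [("Mo", "Montag"),
                ("Di", "Dienstag"),
                ("Mi", "Mittwoch"),
                ("Do", "Donnerstag"),
                ("Fr", "Freitag"),
                ("Sa", "Samstag"),
                ("So", "Sonntag")]
    # One tuple per month, January first.
    MONTHS = [("Jan", "Januar"),
              ("Feb", "Februar"),
              ("Mär", "Mrz", "März"),
              ("Apr", "April"),
              ("Mai",),
              ("Jun", "Juni"),
              ("Jul", "Juli"),
              ("Aug", "August"),
              ("Sep", "Sept", "September"),
              ("Okt", "Oktober"),
              ("Nov", "November"),
              ("Dez", "Dezember")]


info = ExtendedGermanParserInfo()
april = parser.parse("Montag, 11. April 2016 19:35:57", parserinfo=info)
october = parser.parse("Dienstag, 1. Oktober 2013 14:26:00", parserinfo=info)
march = parser.parse("Donnerstag, 3. März 2016 08:05:00", parserinfo=info)
print(april, october, march, sep="\n")
```

Overriding MONTHS replaces the English month names entirely, so keep the English spellings in the tuples if the same parser must accept both languages.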

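In another answer I posted a simple locale-aware parserinfo class. It is not a complete solution for every language in the world, but it solved all of my localization problems.

Here it is: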

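import calendar
from dateutil import parser

class LocaleParserInfo(parser.parserinfo):
    # The calendar names are read when this class is defined, so call
    # locale.setlocale() before importing this module.
    WEEKDAYS = list(zip(calendar.day_abbr, calendar.day_name))
    # month_abbr and month_name have an empty entry at index 0; drop it.
    MONTHS = list(zip(calendar.month_abbr, calendar.month_name))[1:]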

You can use it like this:

In [1]: import locale;locale.setlocale(locale.LC_ALL, "pt_BR.utf8")
In [2]: from localeparserinfo import LocaleParserInfo                                   

In [3]: from dateutil.parser import parse                                                

In [4]: parse("Ter, 01 Out 2013 14:26:00 -0300", parserinfo=LocaleParserInfo())              
Out[4]: datetime.datetime(2013, 10, 1, 14, 26, tzinfo=tzoffset(None, -10800))

Test it, and look at the class variables in the original parserinfo, especially the HMS variable. Other variables may need to be declared as well.
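To illustrate what declaring those other variables might look like, here is a minimal sketch that overrides JUMP (tokens the parser skips) and HMS (hour, minute and second labels) with German words. The class name GermanWordsParserInfo and the word lists are assumptions chosen for illustration. "am" is deliberately left out of JUMP because it would collide with the default AM/PM marker.

```python
from datetime import datetime
from dateutil import parser


class GermanWordsParserInfo(parser.parserinfo):
    # Hypothetical name. Keep the default separators and add German filler
    # words that should be skipped while parsing.
    JUMP = parser.parserinfo.JUMP + ["den", "um", "Uhr"]
    # Labels that may follow an hour, minute or second value.
    HMS = [("h", "Stunde", "Stunden"),
           ("m", "Minute", "Minuten"),
           ("s", "Sekunde", "Sekunden")]


info = GermanWordsParserInfo()

# "um" and "Uhr" are skipped because they are listed in JUMP.
with_jump = parser.parse("11. April 2016 um 19:35 Uhr", parserinfo=info)

# The German HMS labels assign 19 to the hour and 35 to the minute;
# the date comes from the default argument.
with_hms = parser.parse("19 Stunden 35 Minuten", parserinfo=info,
                        default=datetime(2016, 4, 11))
print(with_jump, with_hms, sep="\n")
```

The same class could also carry WEEKDAYS and MONTHS; they are omitted here only to keep the sketch short.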