IGNORECASE errors in Python's re.Scanner?

The re module contains a hidden (undocumented) but well-known class, re.Scanner:

import re

def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

scanner = re.Scanner([
    (r"[a-zA-Z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ])

print(scanner.scan("Sum = 3*foo + 312.50 + bar"))
# (['Sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')

I would like to use the IGNORECASE flag here, but it does not seem to work:

import re

def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

scanner = re.Scanner([
    (r"(?i)[a-z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ])

print(scanner.scan("Sum = 3*foo + 312.50 + bar"))
# ([], 'Sum = 3*foo + 312.50 + bar')

Is this a problem with Scanner or with my code? Is it possible to do case-insensitive matching with Scanner?

This issue was originally reproduced on Python 2.7.9.

Expected: (['Sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')

Actual: ([], 'Sum = 3*foo + 312.50 + bar')

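You can pass the flags argument to the Scanner constructor. Judging from the Scanner source, all lexicon patterns are combined into a single compound pattern, and in Python 2.7 an inline (?i) inside an individual pattern does not reach that compound pattern, so the flag has to be given to Scanner itself: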

scanner = re.Scanner([
    (r"[a-z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ], flags=re.IGNORECASE)
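
For reference, here is a self-contained version of the fix that should also run on Python 3 (the snippets above target Python 2). It reuses the handler names from the question; unpacking the result into `tokens` and `remainder` is only an illustrative choice, not part of the original code.

```python
import re

# Token handlers, as in the question.
def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

# The flag is passed to the Scanner constructor, not written inline,
# so it applies to the whole combined pattern.
scanner = re.Scanner([
    (r"[a-z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),          # whitespace is skipped
    ], flags=re.IGNORECASE)

# scan() returns (list of tokens, unmatched remainder of the input).
tokens, remainder = scanner.scan("Sum = 3*foo + 312.50 + bar")
print(tokens, repr(remainder))
```

With the flag in place, `tokens` should match the expected list from the question and `remainder` should be the empty string.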

Source of Scanner: https://github.com/python/cpython/blob/master/Lib/re.py#L345
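
Note that the constructor flag makes every pattern in the lexicon case-insensitive. If only one token should ignore case, a scoped inline group `(?i:...)` may be an option. This is a sketch under the assumption of Python 3.6 or later (where scoped inline flags exist); it does not apply to the Python 2.7.9 setup from the question. The keyword lexicon and the handler names `s_keyword` and `s_name` are hypothetical and only serve as an illustration.

```python
import re

# Hypothetical handlers, for illustration only.
def s_keyword(scanner, token): return ("kw", token.lower())
def s_name(scanner, token): return ("id", token)

# Assumption: Python 3.6+. The scoped group (?i:...) limits case
# insensitivity to the keyword pattern; no global flag is passed.
scanner = re.Scanner([
    (r"(?i:select|from)\b", s_keyword),  # keywords, any case
    (r"[A-Za-z_]\w*", s_name),           # identifiers, case preserved
    (r"\s+", None),                      # whitespace is skipped
    ])

tokens, remainder = scanner.scan("SELECT name From users")
print(tokens, repr(remainder))
```

The keyword pattern is listed first because Scanner tries the lexicon entries in order and uses the first one that matches at the current position.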