How does BestMatch findall decide how many results to return?
I'm trying to predict how many fuzzy matches findall() will return when using the regex module in Python with BESTMATCH enabled:
>>> regex.findall(r'(?b)(North\ West){i<=0,s<=2,d<=1}', "South west South West North West", regex.V1)
['North West']
Here South West is not matched at all.
>>> regex.findall(r'(?b)(North\ West){i<=0,s<=2,d<=1}', "South west South West North West North west South West", regex.V1)
['North West', 'North west', 'South West']
Here South West is matched.
It's not clear to me whether this is a bug or intended behavior.
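I think I have a partial explanation.
The behavior appears to be:
- Return all perfect matches (no substitutions, insertions, or deletions)
- Return all imperfect matches that come after the last perfect match
From testing, BESTMATCH seems to behave like an ordinary search, except that whenever it finds a perfect match it discards any imperfect matches found before it. The result is therefore a run of perfect matches followed by zero or more imperfect matches.
Some examples: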
>>> regex.findall(r'(?b)(abc){i<=0,s<=2,d<=1}', "abc abd abd aaa abc", regex.V1)
['abc', 'abc']
>>> regex.findall(r'(?b)(abc){i<=0,s<=2,d<=1}', "abc abd abd aaa abc abb", regex.V1)
['abc', 'abc', 'abb']
>>> regex.findall(r'(?b)(ab[cd]){i<=0,s<=2,d<=1}', "abc abd abd aaa abc abb", regex.V1)
['abc', 'abd', 'abd', 'abc', 'abb']
>>> regex.findall(r'(?b)(ab[cd]){i<=0,s<=2,d<=1}', "abc abd abd aaa abc abb abc", regex.V1)
['abc', 'abd', 'abd', 'abc', 'abc']
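The rule guessed at above can be written down as a small pure-Python model, which makes it easier to check a prediction by hand. This is only a sketch of the hypothesis, not how the regex module works internally. The helpers `substitutions`, `candidates_for`, and `predict_bestmatch` are hypothetical names made up for this illustration. The model assumes the text is already split into equal-length chunks and that only substitutions count (insertions and deletions are ignored).

```python
def substitutions(pattern, text):
    """Count differing positions between two equal-length strings."""
    if len(pattern) != len(text):
        raise ValueError("this sketch only handles equal-length strings")
    return sum(a != b for a, b in zip(pattern, text))


def candidates_for(pattern, chunks, max_subs=2):
    """Hypothetical stand-in for a plain fuzzy search over pre-split chunks.

    Returns (chunk, error_count) pairs, in order, for chunks within s<=max_subs.
    """
    scored = [(chunk, substitutions(pattern, chunk)) for chunk in chunks]
    return [(chunk, errors) for chunk, errors in scored if errors <= max_subs]


def predict_bestmatch(candidates):
    """Apply the guessed rule to (text, error_count) pairs in scan order."""
    # Index of the last perfect (zero-error) candidate, or -1 if there is none.
    last_perfect = max(
        (i for i, (_, errors) in enumerate(candidates) if errors == 0),
        default=-1,
    )
    # Keep every perfect match, and imperfect ones only after the last perfect one.
    return [
        text
        for i, (text, errors) in enumerate(candidates)
        if errors == 0 or i > last_perfect
    ]


chunks = ["South west", "South West", "North West", "North west", "South West"]

# "South west" needs 3 substitutions, so it is never a candidate under s<=2.
print(predict_bestmatch(candidates_for("North West", chunks[:3])))
# ['North West']
print(predict_bestmatch(candidates_for("North West", chunks)))
# ['North West', 'North west', 'South West']
print(predict_bestmatch(candidates_for("abc", "abc abd abd aaa abc abb".split())))
# ['abc', 'abc', 'abb']
```

The predictions agree with the findall() outputs quoted in this thread, but that only shows the model is consistent with these few cases. To see what the library actually reports for each match, the `fuzzy_counts` attribute on the match objects returned by regex.finditer() should give the substitution, insertion, and deletion counts.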