RegEx. \b for Cyrillic symbols

Can you tell me what I can use instead of \b to match whole words in Cyrillic text?

I have the text "текст" in a column of an SQLite database.

This works:

select * from myTable where text REGEXP 'текст'

This does not work:

select * from myTable where text REGEXP '\bтекст\b'

It turns out that your SQLite REGEXP implementation is based on PCRE.

You can use the (*UCP) PCRE verb to make \b Unicode-aware:

'(*UCP)\bтекст\b'

The pcrepattern man page has some details about this verb:

Another special sequence that may appear at the start of a pattern is (*UCP). This has the same effect as setting the PCRE_UCP option: it causes sequences such as \d and \w to use Unicode properties to determine character types, instead of recognizing only characters with codes less than 128 via a lookup table.

And further on:

Note also that PCRE_UCP affects \b, and \B because they are defined in terms of \w and \W. Matching these sequences is noticeably slower when PCRE_UCP is set.

So yes, it will be slower, because it now has to deal with the whole Unicode table.
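The difference between an ASCII-only \b and a Unicode-aware \b can be reproduced outside PCRE. The sketch below is only an analogy: it uses Python's `sqlite3` and `re` modules, not PCRE, and it assumes that `re.ASCII` is a fair stand-in for PCRE's default behaviour and that Python's default Unicode matching is a fair stand-in for (*UCP). The helper names `make_regexp` and `count_matches` and the sample row are made up for illustration.

```python
import re
import sqlite3


def make_regexp(flags=0):
    """Build a function usable as SQLite's REGEXP operator."""
    def regexp(pattern, value):
        # SQLite evaluates `X REGEXP Y` as regexp(Y, X): pattern comes first.
        if value is None:
            return 0
        return 1 if re.search(pattern, value, flags) else 0
    return regexp


def count_matches(flags, pattern):
    """Count rows of a throwaway table that match `pattern`."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, make_regexp(flags))
    conn.execute("CREATE TABLE myTable (text TEXT)")
    conn.execute("INSERT INTO myTable VALUES ('это текст здесь')")
    (n,) = conn.execute(
        "SELECT count(*) FROM myTable WHERE text REGEXP ?", (pattern,)
    ).fetchone()
    conn.close()
    return n


# ASCII-only \w: Cyrillic letters are not word characters, so \b never matches.
ascii_hits = count_matches(re.ASCII, r"\bтекст\b")

# Unicode-aware \w: Cyrillic letters count as word characters, so \b works.
unicode_hits = count_matches(0, r"\bтекст\b")

print(ascii_hits, unicode_hits)
```

With the ASCII-only flag the query finds no rows, which mirrors the failing query in the question; with Unicode-aware matching the same pattern finds the row.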