RegEx: \b for Cyrillic symbols
What can I use instead of \b to match whole words in Cyrillic text?
I have the text "текст" in a column of an SQLite database.
This works:
select * from myTable where text REGEXP 'текст'
This does not work:
select * from myTable where text REGEXP '\bтекст\b'
It turns out that your SQLite REGEXP implementation is based on PCRE.
You can use the (*UCP) PCRE verb to make \b Unicode-aware:
'(*UCP)\bтекст\b'
Here are some details about the verb:
Another special sequence that may appear at the start of a pattern is (*UCP). This has the same effect as setting the PCRE_UCP option: it causes sequences such as \d and \w to use Unicode properties to determine character types, instead of recognizing only characters with codes less than 128 via a lookup table.
And further on:
Note also that PCRE_UCP affects \b, and \B because they are defined in terms of \w and \W. Matching these sequences is noticeably slower when PCRE_UCP is set.
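So yes, it will be slower, because \w now has to be resolved against the full Unicode property tables instead of a 128-entry lookup table.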
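To see why the plain \b fails, here is a minimal sketch in Python. It is an analogy, not the PCRE setup itself: Python's `re` module is not PCRE and does not accept (*UCP). The assumption is that `re.ASCII` behaves like PCRE's default ASCII-only \w, while Python's default Unicode matching behaves like \w under (*UCP). The table and column names come from the question; the two sample rows and the helper names `make_regexp` and `count_matches` are made up for illustration.

```python
import re
import sqlite3


def make_regexp(flags=0):
    """Build a REGEXP callback; SQLite calls it as regexp(pattern, value)."""
    def regexp(pattern, value):
        if value is None:
            return 0
        return int(re.search(pattern, value, flags) is not None)
    return regexp


def count_matches(pattern, flags=0):
    """Count rows of a throwaway myTable whose text column matches pattern."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, make_regexp(flags))
    conn.execute("CREATE TABLE myTable (text TEXT)")
    # Hypothetical sample rows: one standalone word, one word that only
    # contains "текст" as a substring.
    conn.executemany(
        "INSERT INTO myTable VALUES (?)",
        [("это текст",), ("контекстный",)],
    )
    row = conn.execute(
        "SELECT count(*) FROM myTable WHERE text REGEXP ?", (pattern,)
    ).fetchone()
    conn.close()
    return row[0]


# Without \b the pattern also hits the substring inside "контекстный".
plain = count_matches(r"текст")

# ASCII-only \w: Cyrillic letters count as non-word characters, so there is
# no word/non-word transition anywhere and \b never matches.
ascii_only = count_matches(r"\bтекст\b", re.ASCII)

# Unicode-aware \w: \b finds the boundaries around the standalone word only.
unicode_aware = count_matches(r"\bтекст\b")

print(plain, ascii_only, unicode_aware)
```

With ASCII-only \w, every Cyrillic letter looks like punctuation to the engine, so the boundary between a space and "т" is not a word boundary at all. That is the behaviour the question ran into, and it is what (*UCP) changes on the PCRE side.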