How to find the index of words matched in a text?
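I am extracting the indices of the words matched by this regex. The regex matches all the required words in the text, but it also matches the space to the left of the match. It does not bound the matched string on the left side the way \b bounds it on the right side.
Regex: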
(price|rs)?\s*(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)
Input text:
This should matchprice 5.6 lacincluding price(i.e price 5.6 lac) and rs 56 m. including rs (i.e rs 56 k rs 56 m) .
It will match normally if there is no price or rs written for example or 56 k or 8.8 crs. are correct matching but its should bound the matched string from left side as well just like its not matching sapce after end of the matched string.
It should not match the spaces left of 8.5 in this 8.5 lac ould not match eitherrs 6 lac asas there is no spaces before 5.6
How can I modify the above regex so that the match is bounded on the left side as well?
You can move the \s* into an optional non-capturing group:
(?:\b(price|rs)\s*)?(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)
^^^^^^^^^^^^^^^^^^^^
See the regex demo.
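The (?:\b(price|rs)\s*)? pattern matches a word boundary, followed by price or rs, followed by 0+ whitespace characters. The whole group is tried once and, because of the ? quantifier, it is optional (the sequence can match 1 or 0 times). Since \s* now sits inside the group, whitespace is consumed only when price or rs actually precedes the number.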
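To see the effect on the match indices, here is a minimal Python sketch that runs both patterns with re.finditer and reports the start and end offsets of each match. The names OLD, NEW, match_spans and the sample string are made up for illustration and are not part of the question; the question does not say which regex engine is used, so Python's re module is an assumption.

```python
import re

# Original pattern: \s* sits outside the optional keyword group, so leading
# whitespace is consumed even when no "price"/"rs" precedes the number.
OLD = re.compile(
    r"(price|rs)?\s*(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)"
)

# Modified pattern: \b, the keyword and \s* are all inside one optional
# non-capturing group, so whitespace is matched only after a keyword.
NEW = re.compile(
    r"(?:\b(price|rs)\s*)?(\d+[\s\d.]*\s*?(pkg|k|m|(?:la(?:c|kh|k)|crore|cr)s?|l)\b\.?)"
)


def match_spans(pattern, text):
    """Return (start, end, matched_text) for every match of pattern in text."""
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


# Hypothetical sample: two spaces before "8.5", and a keyword before "5.6".
sample = "cost is  8.5 lac, price 5.6 lac"

for name, pattern in (("old", OLD), ("new", NEW)):
    print(name, match_spans(pattern, sample))
```

With the original pattern the first match starts at the two spaces before 8.5; with the modified pattern it starts at the digit, while the price 5.6 lac match is the same in both. If only the numeric part is needed, m.start(2) and m.end(2) give the offsets of the second capturing group.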