PCRE Regex: Is it possible to check within only the first X characters of a string for a match

Question

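PCRE Regex: Is it possible for a regex to check for a pattern match within only the first X characters of a string, ignoring the rest of the string beyond that point?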

My Regex:

I have a regex:

/\S+V\s*/

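This checks whether the string contains a run of non-whitespace characters ending in 'V', followed by whitespace or the end of the string.

This works. For example:

Example A: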

 SEBSTI FMDE OPORV AWEN STEM students into STEM 

// Match found in 'OPORV' (correct)

Example B:

 ARKFE SSETE BLMI EDSF BRNT CARFR (name removed) Academy Networking Event 
      
//Match not found (correct).

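Re: the capitalised text. Each letter and its position have meaning in the source data. It is followed by generic information for humans to read ("Academy Networking Event", etc.)

My Issue:

In theory, names containing Roman numerals can sometimes occur, such as:

Example C: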

 ARKFE SSETE BLME CARFR Academy IV Networking Event 
      
//Match found (incorrect).
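
To make the false positive easy to reproduce, here is a minimal sketch that runs the pattern above over examples A, B and C. The `$examples` and `$results` names are only for illustration and are not part of the original code.

```php
<?php
// The original pattern from the question, applied to examples A, B and C.
$pattern = '/\S+V\s*/';

$examples = [
    'A' => ' SEBSTI FMDE OPORV AWEN STEM students into STEM ',
    'B' => ' ARKFE SSETE BLMI EDSF BRNT CARFR (name removed) Academy Networking Event ',
    'C' => ' ARKFE SSETE BLME CARFR Academy IV Networking Event ',
];

$results = [];
foreach ($examples as $label => $text) {
    // preg_match() returns 1 on a match and 0 otherwise; $m[0] is the matched text.
    $found = preg_match($pattern, $text, $m);
    $results[$label] = $found ? $m[0] : null;
    printf("%s: %s\n", $label, $found ? "match '" . $m[0] . "'" : 'no match');
}
```

A and B behave as intended, while C matches on 'IV ', which is the unwanted hit described above.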

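I would like the above regex to check only the first X characters of the string.

Can this be done in PCRE regex itself? I can't find any reference to length counting in regex, and I suspect it can't easily be achieved. String lengths are completely arbitrary. (We have no control over the source data.)

Intention: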

/\S+V\s*/{check within first 25 characters only}

 ARKFE SSETE BLME CARFR Academy IV Networking Event 
                         ^
                         \-  Cut off point. Not found so far so stop. 

//Match not found (correct).

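Workaround:

The regex is used in PHP, and my current solution is to cut the string in PHP so that only the first X characters are checked, typically the first 20. I am curious whether there is a way of doing this within the regex itself, without manipulating the string directly in PHP.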

$valueSubstring = substr($coreRow['value'],0,20); /* first 20 characters only */
$virtualCount = preg_match_all('/\S+V\s*/',$valueSubstring);

Answer 1

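You can look for your pattern after the first X characters and, if it is found there, skip the whole string; otherwise, match your pattern as usual. So, if X=25: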

^.{25,}\S+V.*(*SKIP)(*F)|\S+V\s*

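See the regex demo. Details:

^.{25,}\S+V.*(*SKIP)(*F) - start of string, 25 or more characters other than line breaks (as many as possible), then one or more non-whitespace characters and V, then the rest of the string; the match is then failed and the matched text is skipped
| - or
\S+V\s* - matches one or more non-whitespace characters, V, and zero or more whitespace characters.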
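
As a rough way to try this out in PHP, the sketch below wraps the pattern in a small helper. The function name `firstVWord()` and the `$limit` parameter are made up for illustration; only the pattern itself comes from the answer.

```php
<?php
// Answer 1's pattern, with the cutoff X passed in as $limit (hypothetical helper).
function firstVWord(string $text, int $limit = 25): ?string
{
    // Branch 1: a V word found after $limit or more characters fails the match
    // and skips the whole string. Branch 2: the original pattern.
    $pattern = '/^.{' . $limit . ',}\S+V.*(*SKIP)(*F)|\S+V\s*/';
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

$a = ' SEBSTI FMDE OPORV AWEN STEM students into STEM ';
$c = ' ARKFE SSETE BLME CARFR Academy IV Networking Event ';
// A V word inside the first 25 characters AND another one after them.
$both = 'OPORV AWEN STEM ARKFE SSETE Academy IV Event';

var_dump(firstVWord($a));    // string "OPORV "
var_dump(firstVWord($c));    // NULL
var_dump(firstVWord($both)); // NULL
```

Note the last case: when a V word occurs both inside and beyond the cutoff, the first branch still fires and the whole string is skipped, so the early 'OPORV' is not reported. If that situation can occur in the data, this pattern may not be the right fit.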

Answer 2

Any V that ends within the first 25 positions

^.{1,24}V\s

See regex

Any word ending in V within the first 25 positions

^.{1,23}[A-Z]V\s
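
A quick sketch of how the two patterns above behave in PHP, using examples A and C from the question. The variable names and the extra `$lone` string are illustrative assumptions.

```php
<?php
// The two patterns from this answer.
$anyV  = '/^.{1,24}V\s/';        // any V within the first 25 positions
$wordV = '/^.{1,23}[A-Z]V\s/';   // V preceded by an uppercase letter

$a = ' SEBSTI FMDE OPORV AWEN STEM students into STEM ';
$c = ' ARKFE SSETE BLME CARFR Academy IV Networking Event ';
$lone = ' ABCD V rest';          // a stand-alone V, made up for comparison

var_dump(preg_match($wordV, $a));    // int(1)
var_dump(preg_match($wordV, $c));    // int(0)
var_dump(preg_match($anyV, $lone));  // int(1)
var_dump(preg_match($wordV, $lone)); // int(0)
```

Because both patterns are anchored with ^, the matched text is the whole prefix up to and including the V, not just the word. Both also require a whitespace character after the V, so a V that is the very last character of the string would not match.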

Answer 3

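The trick is to capture, in a lookahead, the end of the line after the first 25 characters, and then to check that it still follows the eventual match of your subpattern: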

$pattern = '~^(?=.{0,25}(.*)).*?\K\S+V\b(?=.*\1)~m';

demo

Details:

^ # start of the line

(?= # open a lookahead assertion
    .{0,25} # the first twenty-five characters
    (.*) # capture the end of the line
) # close the lookahead

.*? # consume lazily the characters

\K # the match result starts here

\S+V    # your pattern
\b      # a word boundary (that matches between a letter and a white-space
        # or the end of the string)

(?=.*\1) # check that the end of the line follows with a reference to
         # the capture group 1 content.

Note that you can also write the pattern in a more readable way like this:

$pattern = '~^
    (*positive_lookahead: .{0,25} (?<line_end> .* ) )
    .*?    \K    \S+ V \b
    (*positive_lookahead: .*? \g{line_end} )   ~xm';

(The alternative syntax (*positive_lookahead: ...) is available since PHP 7.3)
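
For anyone wanting to test the idea, here is a minimal sketch using the compact form of the pattern, with the backreference \1 to capture group 1 that the commented breakdown describes. The helper name `matchWithinPrefix()` is an assumption made for this example.

```php
<?php
// Compact pattern: group 1 holds the tail of the line after the first 25
// characters, and the final lookahead requires that tail to still follow
// the match, so the match has to end inside the first 25 characters.
function matchWithinPrefix(string $text): ?string
{
    $pattern = '~^(?=.{0,25}(.*)).*?\K\S+V\b(?=.*\1)~m';
    // Thanks to \K, $m[0] contains only the V word itself.
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

$a = ' SEBSTI FMDE OPORV AWEN STEM students into STEM ';
$c = ' ARKFE SSETE BLME CARFR Academy IV Networking Event ';

var_dump(matchWithinPrefix($a)); // string "OPORV"
var_dump(matchWithinPrefix($c)); // NULL
```

Unlike the substr() workaround, the original string is left untouched, and the m flag means each line of a multi-line value is checked against its own first 25 characters.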
