Regex "does not contain" pattern issue for Natural Language Processing

Link to regex101

I am implementing a simple NLP algorithm. I already have a working solution that loops over the raw string to assist the regex, but now I want to see whether I can do it in pure regex.

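I can't figure out how to make the 'build' group respect the negative lookahead. I am trying to capture ["Natural Language Processing" algorithms]. Any help is much appreciated, thanks.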

$subject_string = <<<'subject_string'
Projects I've built & Plan to build. HackMatch.io (May 2020 onward), 
As of October 2020, I intend to start implementing "Natural Language Processing" algorithms 
in PHP when I have time. I'll then use PHP to upload the results to big data tech (e.g. BigQuery) 
to create some data visualizations.
subject_string;

$pattern = <<<'pattern'
/\b(?'verb'build|make|implementing)
(?'build'.+?(?!build|make|implementing)) 
(?=\bin\b|\bon\b)
(?:build|make|implementing)??/ix
pattern;

preg_match_all($pattern, $subject_string, $matches);

You can use

/\b(?'verb'build|make|implementing)\s*
(?'build'(?:(?!(?&verb)).)*?) 
(?=\s*\b(?:in|on)\b)/ixs

See the regex demo. Details:

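  • \b - a word boundary
  • (?'verb'build|make|implementing) - Group "verb": one of the words inside the parentheses
  • \s* - zero or more whitespace characters
  • (?'build'(?:(?!(?&verb)).)*?) - Group "build": any character, zero or more occurrences but as few as possible, that does not start any of the character sequences defined in Group "verb"
  • (?=\s*\b(?:in|on)\b) - a positive lookahead that matches a location immediately followed by zero or more whitespace characters and then the whole word in or on.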
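
To make the breakdown above concrete, here is a minimal sketch that runs the suggested pattern against the sample text from the question, then contrasts it with a variant where the negative lookahead is checked only once, after the lazy `.+?` has finished (the shape of the question's attempt). The short `$sample` sentence and the names `$single_check`, `$loose`, and `$strict` are made up for illustration and are not part of the original post.

```php
<?php
// Sample text from the question.
$subject_string = <<<'subject_string'
Projects I've built & Plan to build. HackMatch.io (May 2020 onward),
As of October 2020, I intend to start implementing "Natural Language Processing" algorithms
in PHP when I have time. I'll then use PHP to upload the results to big data tech (e.g. BigQuery)
to create some data visualizations.
subject_string;

// Suggested pattern: the lookahead (?!(?&verb)) is tested before EVERY character
// that the "build" group consumes, so the group can never run across a verb.
$pattern = <<<'pattern'
/\b(?'verb'build|make|implementing)\s*
(?'build'(?:(?!(?&verb)).)*?)
(?=\s*\b(?:in|on)\b)/ixs
pattern;

$count = preg_match_all($pattern, $subject_string, $matches);

echo $count, PHP_EOL;               // 1
echo $matches['verb'][0], PHP_EOL;  // implementing
echo $matches['build'][0], PHP_EOL; // "Natural Language Processing" algorithms

// Hypothetical shorter sample (not from the question): two verbs before "in".
$sample = 'build a house and build a shed in town';

// Illustrative variant: the negative lookahead sits AFTER the lazy dot, so it is
// checked only at the position where the group ends, not at each character.
$single_check = <<<'pattern'
/\b(?'verb'build|make|implementing)\s*
(?'build'.+?(?!(?&verb)))
(?=\s*\b(?:in|on)\b)/ixs
pattern;

preg_match_all($single_check, $sample, $loose);
preg_match_all($pattern, $sample, $strict);

echo $loose['build'][0], PHP_EOL;  // a house and build a shed
echo $strict['build'][0], PHP_EOL; // a shed
```

With the question's text, the candidate match starting at "build" in the first sentence is rejected because it would have to cross "implementing" before reaching a whole word "in" or "on", so only the "implementing" match survives. The `s` flag matters here as well: "in PHP" sits on the next line, and without it `.` would not cross line breaks in longer captures.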