Extracting words next to a location or duration in Python

How can I extract the words next to a location or a duration? What is the best regex to do this in Python?

Example:

Karthick Kumar, Bangalore who was a great person and lived from 29th March 1980 - 21 Dec 2014.

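In the example above, I want to extract the words before the location and the words before the duration. The location and the duration are not fixed, so what would be the best regex for this in Python? Or can we do this using nltk?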

Expected output:

Output-1: Karthick Kumar (here the keyword is the location)

Output-2: who was a great person and lived from (here the keyword is the duration)

I would suggest using lookaheads.

In your example, assuming you want the words before Bangalore and before 29th March 1980 - 21 Dec 2014, you can use lookaheads (and lookbehinds) to get the relevant matches.
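As a rough sketch of that idea (the variable names are only illustrative, and the sentence is the sample from the question): a lookahead matches text that is followed by a keyword without including the keyword, and a lookbehind does the same for text that comes right after a keyword.

```python
import re

text = ("Karthick Kumar, Bangalore who was a great person and lived from "
        "29th March 1980 - 21 Dec 2014.")

# Lookahead: lazily match from the start up to (not including) ", Bangalore".
before_location = re.search(r"^.+?(?=,\s*Bangalore)", text)

# Lookbehind + lookahead: match the text between "Bangalore " and the dates.
before_duration = re.search(
    r"(?<=Bangalore ).+?(?=\s*29th March 1980 - 21 Dec 2014)", text
)

print(before_location.group(0))  # Karthick Kumar
print(before_duration.group(0))  # who was a great person and lived from
```

Note that Python's re module only accepts fixed-width lookbehinds, which is why the lookbehind here is a literal string.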

I used this regex: (.*)(?>Bangalore)(.+)(?=29th March 1980 - 21 Dec 2014) and captured the text in the parentheses, which can be accessed as capture groups 1 and 2.
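A minimal way to try this in Python might look like the following. It is a sketch rather than a definitive solution: the atomic group (?>...) is only supported by the re module from Python 3.11 onwards, so this version assumes a plain non-capturing group (?:...) instead, which behaves the same for a literal word. The names pattern, output_1 and output_2 are illustrative.

```python
import re

text = ("Karthick Kumar, Bangalore who was a great person and lived from "
        "29th March 1980 - 21 Dec 2014.")

# Same structure as the regex above, with (?:...) in place of the atomic
# group (?>...) so that it also compiles on Python versions before 3.11.
pattern = re.compile(
    r"(.*)(?:Bangalore)(.+)(?=29th March 1980 - 21 Dec 2014)"
)

match = pattern.search(text)
if match:
    # Group 1 is the text before the location; drop the trailing ", ".
    output_1 = match.group(1).strip(" ,")
    # Group 2 is the text between the location and the duration.
    output_2 = match.group(2).strip()
    print(output_1)  # Karthick Kumar
    print(output_2)  # who was a great person and lived from
```

Since the location and the duration are not fixed in the question, the two literals would have to be replaced by patterns or alternations that cover the expected values; the sketch hard-codes them only to mirror the regex above.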