Python Regex to extract email

I am trying to use Python to extract the sentence that contains the email.

sample_str = "This is a random sentence. This one is also random sentence but it contains an email address ---@---.com"

All the examples I have seen extract only the email address itself, for example:

import re
lst = re.findall(r'\S+@\S+', sample_str)

But is there any way to extract the sentence that contains the email? In this case:

op = "This one is also random sentence but it contains an email address ---@---.com"

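You can mark the start of the sentence and, in between, avoid matching the end of a sentence.

But this can be tricky and is definitely not a general solution, because a sentence does not have to start with a character in [A-Z], and it may end with a character other than ., ! or ?

As an idea for the given example, you could use: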

(?<!\S)[A-Z](?:(?![!?.](?!\S)).)*[^\s@]+@[^\s@]+

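Explanation

  • (?<!\S) Assert a whitespace boundary to the left
  • [A-Z] Match an uppercase character A-Z
  • (?:(?![!?.](?!\S)).)* Match any character, as long as it is not one of !, ? or . directly followed by a whitespace boundary
  • [^\s@]+@[^\s@]+ Match an email-like format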
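
The same breakdown can be written into the pattern itself with re.VERBOSE, which ignores whitespace and allows comments inside the regex. This is only a sketch of the pattern above laid out piece by piece; the name sentence_with_email is made up for illustration.

```python
import re

# Same pattern as above, spread out with re.VERBOSE (name is illustrative).
sentence_with_email = re.compile(r"""
    (?<!\S)              # whitespace boundary on the left
    [A-Z]                # sentence starts with an uppercase ASCII letter
    (?:                  # then, one character at a time:
        (?![!?.](?!\S))  #   stop before ! ? or . followed by whitespace or end of string
        .                #   otherwise take any character except a newline
    )*
    [^\s@]+@[^\s@]+      # email-like token built around one @
""", re.VERBOSE)

sample_str = "This is a random sentence. This one is also random sentence but it contains an email address ---@---.com"
matches = sentence_with_email.findall(sample_str)
print(matches)
```

Because the two character classes contain neither spaces nor #, verbose mode does not change what the pattern matches.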

Example

import re

sample_str = "This is a random sentence. This one is also random sentence but it contains an email address ---@---.com"
# Raw string so the backslashes reach the regex engine unchanged
lst = re.findall(r'(?<!\S)[A-Z](?:(?![!?.](?!\S)).)*[^\s@]+@[^\s@]+', sample_str)

print(lst)

Output

['This one is also random sentence but it contains an email address ---@---.com']
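
Since the pattern relies on an uppercase first letter, a different approach that avoids that assumption is to split the text into sentences first and then keep the ones containing an email-like token. The sketch below is not the pattern above but a swapped-in technique; the names EMAIL, SENTENCE_END and sentences_with_email are hypothetical, and the splitting rule (whitespace after ., ! or ?) is a naive assumption that will also split after abbreviations such as "e.g.".

```python
import re

# Email-like token, same shape as in the pattern above.
EMAIL = re.compile(r'[^\s@]+@[^\s@]+')
# Naive sentence boundary (assumption): whitespace preceded by ., ! or ?
SENTENCE_END = re.compile(r'(?<=[.!?])\s+')

def sentences_with_email(text):
    """Return the sentences of text that contain an email-like token."""
    return [s for s in SENTENCE_END.split(text) if EMAIL.search(s)]

sample_str = "This is a random sentence. This one is also random sentence but it contains an email address ---@---.com"
print(sentences_with_email(sample_str))

# A sentence that starts in lowercase is still found by this approach.
lowercase_start = "first sentence. please write to ---@---.com"
print(sentences_with_email(lowercase_start))
```

For the sample string this returns the same single sentence as the regex above; for the lowercase input it returns the second sentence, which the [A-Z]-anchored pattern would not match.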