Getting rid of certain text from the body of an email using Python

I am trying to parse the body of a forwarded email using the following Python code:

import imapclient
import os
import pprint
import pyzmail
import email

#my email info
EMAIL_ADRESS = os.environ.get('DB_USER')
EMAIL_PASSWORD = os.environ.get('PYTHON_PASS')

#login to my email
imap0bj =  imapclient.IMAPClient('imap.gmail.com', ssl = True)
imap0bj.login(EMAIL_ADRESS, EMAIL_PASSWORD )
print("ok")


pprint.pprint(imap0bj.list_folders())
#Selecting my Inbox
imap0bj.select_folder('INBOX', readonly = True)

#Getting UIDs from Inbox
UIDs = imap0bj.search(['SUBJECT', 'Contact FB Applicant', 'ON', '16-Oct-2020'])
print(UIDs)


rawMessages = imap0bj.fetch(UIDs, ['BODY[]'])
message = pyzmail.PyzMessage.factory(rawMessages[9999][b'BODY[]'])

message.text_part != None
#Body of the email returned as a string
msg = message.text_part.get_payload().decode(message.text_part.charset)

print(msg)

imap0bj.logout()

This code outputs a string like this:

   ---------- Forwarded message ---------
    From: Someone <Mail@mail.biz>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd:  Contact FB Applicant
    To: <mail@mail.com>
    
    
    
    
   ---------- Forwarded message ---------
    From: Someone <Mail@mail.biz>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd:  Contact FB Applicant
    To: <mail@mail.com>
    
    
    The following applicant filled out the form via Facebook.  Contact
    immediately.
    
    Some Guy
    999999999999
    mail@mail.com

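But I don't want the "Forwarded message" parts. I only want the text starting from "The following applicant...", which is the information I care about. How can I get rid of the rest? I'd really appreciate your help. Thanks!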

You can use io.StringIO.

Here is how you would use it:

from io import StringIO

# your code goes here
...
...

msg = message.text_part.get_payload().decode(message.text_part.charset)

sio = StringIO(msg)

sio.seek(msg.index('The following applicant'))

for line in sio:
  print(line, end='')  # each line already ends with a newline

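How it works:

StringIO lets you treat a string as a stream (a file). StringIO.seek moves the stream position to a given offset (0 is the start of the stream). str.index returns the index of the first occurrence of a substring within a string. Putting them together: move the stream position to the first occurrence of the desired string, then read from the stream.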
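The same idea can be tried on a plain string, without connecting to a mailbox. The sketch below is only an illustration: sample_msg and the helper name body_from_marker are made up for this example. It uses str.find rather than str.index, so a missing marker returns the text unchanged instead of raising ValueError (that fallback is an assumption about what you would want). It also relies on CPython's StringIO treating positions as character offsets.

```python
from io import StringIO

MARKER = 'The following applicant'

def body_from_marker(msg, marker=MARKER):
    """Return msg starting at the first occurrence of marker (hypothetical helper)."""
    start = msg.find(marker)   # -1 if the marker is absent
    if start == -1:
        return msg             # assumption: leave the text untouched when nothing matches
    sio = StringIO(msg)
    sio.seek(start)            # in CPython, a StringIO position is a character offset
    return sio.read()

# Made-up sample text in the shape of the forwarded message above
sample_msg = (
    '---------- Forwarded message ---------\n'
    'From: Someone <Mail@mail.biz>\n'
    '\n'
    'The following applicant filled out the form via Facebook.\n'
    'Some Guy\n'
)

print(body_from_marker(sample_msg))
```

Plain slicing, msg[start:], gives the same result; StringIO is mainly useful if you want to keep reading the rest line by line.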

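Judging from this format, you need to read the text line by line. If you encounter a line that starts with '---', i.e. line[:3] == '---', ignore it and the lines after it until you read an empty line. If the next non-empty line starts with '---' again, repeat the process. The first non-empty line after that should be "The following applicant...".

You can put this logic in an infinite loop and break out of it; here is the pseudo-code: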

while True:
  line = read next line
  if length(line) == 0: continue
  if line[:3] == '---':
    # skip the header block, up to the empty line that ends it
    while True:
      line = read next line
      if length(line) == 0:
        break
  else:
    break
print line, then read and print everything after it

This assumes the read-line function keeps track of how many lines it has already read and which line comes next.
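The steps above can be sketched in Python as follows. This is one possible reading of the pseudo-code, not the only one: the function name strip_forward_headers and the sample text are invented for the example, and each line is stripped of surrounding whitespace before it is tested, on the assumption that the real message may be indented or use \r\n line endings. An io.StringIO object plays the role of the read-line function, since it remembers which line comes next.

```python
from io import StringIO

def strip_forward_headers(msg):
    """Drop leading '---' header blocks; return the rest of the text (hypothetical helper)."""
    lines = StringIO(msg)                 # iterating consumes one line at a time
    for line in lines:
        if not line.strip():              # blank line between blocks: keep looking
            continue
        if line.strip().startswith('---'):
            for line in lines:            # skip the header block...
                if not line.strip():      # ...up to the blank line that ends it
                    break
        else:
            return line + lines.read()    # first body line plus everything after it
    return ''                             # nothing but headers and blank lines

# Made-up sample with two nested "Forwarded message" header blocks
sample = (
    '---------- Forwarded message ---------\n'
    'From: Someone <Mail@mail.biz>\n'
    'To: <mail@mail.com>\n'
    '\n'
    '---------- Forwarded message ---------\n'
    'From: Someone <Mail@mail.biz>\n'
    '\n'
    'The following applicant filled out the form via Facebook.\n'
    'Some Guy\n'
)

print(strip_forward_headers(sample))
```

Unlike searching for a fixed phrase, this version does not depend on the body starting with "The following applicant", but it does assume that every header block ends with a blank line.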