Getting rid of certain text from the body of an email using Python
I am trying to parse the body of a forwarded email using the following Python code:
import imapclient
import os
import pprint
import pyzmail
import email
#my email info
EMAIL_ADRESS = os.environ.get('DB_USER')
EMAIL_PASSWORD = os.environ.get('PYTHON_PASS')
#login to my email
imap0bj = imapclient.IMAPClient('imap.gmail.com', ssl = True)
imap0bj.login(EMAIL_ADRESS, EMAIL_PASSWORD )
print("ok")
pprint.pprint(imap0bj.list_folders())
#Selecting my Inbox
imap0bj.select_folder('INBOX', readonly = True)
#Getting UIDs from Inbox
UIDs = imap0bj.search(['SUBJECT', 'Contact FB Applicant', 'ON', '16-Oct-2020'])
print(UIDs)
rawMessages = imap0bj.fetch(UIDs, ['BODY[]'])
message = pyzmail.PyzMessage.factory(rawMessages[9999][b'BODY[]'])
message.text_part != None
#Body of the email returned as a string
msg = message.text_part.get_payload().decode(message.text_part.charset)
print(msg)
imap0bj.logout()
This code outputs a string similar to this:
---------- Forwarded message ---------
From: Someone <Mail@mail.biz>
Date: Wed, Oct 14, 2020 at 1:23 PM
Subject: Fwd: Contact FB Applicant
To: <mail@mail.com>
---------- Forwarded message ---------
From: Someone <Mail@mail.biz>
Date: Wed, Oct 14, 2020 at 1:23 PM
Subject: Fwd: Contact FB Applicant
To: <mail@mail.com>
The following applicant filled out the form via Facebook. Contact
immediately.
Some Guy
999999999999
mail@mail.com
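But I don't want the "Forwarded message" parts. I only want the text starting at "The following applicant...", which is the information I care about. How can I get rid of everything else? I would really appreciate your help. Thank you!
You can use io.StringIO.
Here is how you would use it: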
from io import StringIO
# your code goes here
...
...
msg = message.text_part.get_payload().decode(message.text_part.charset)
sio = StringIO(msg)
sio.seek(msg.index('The following applicant'))
for line in sio:
    # each line already ends with a newline, so suppress print's own
    print(line, end='')
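How it works:
StringIO lets you treat a string as a stream (a file-like object).
StringIO.seek moves the stream position to a given offset (0 is the start of the stream).
str.index returns the index of the first occurrence of a substring within a string, and raises ValueError if the substring is absent.
Putting it together: move the stream position to the first occurrence of the desired string, then read from the stream.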
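The seek-and-read idea can be tried without a mailbox. The sketch below is only an illustration: SAMPLE is a made-up stand-in for the decoded body, and text_from_marker is a hypothetical helper name, not part of any library. It uses str.find instead of str.index so that a body without the marker is returned unchanged rather than raising ValueError.

```python
from io import StringIO

# Hypothetical sample standing in for the decoded email body.
SAMPLE = (
    "---------- Forwarded message ---------\n"
    "From: Someone <Mail@mail.biz>\n"
    "Subject: Fwd: Contact FB Applicant\n"
    "\n"
    "The following applicant filled out the form via Facebook. Contact\n"
    "immediately.\n"
    "Some Guy\n"
)


def text_from_marker(body, marker="The following applicant"):
    """Return body starting at the first occurrence of marker.

    Falls back to the whole body when the marker is absent.
    """
    pos = body.find(marker)  # -1 when not found; str.index would raise
    if pos == -1:
        return body
    sio = StringIO(body)
    sio.seek(pos)  # move the stream position to the marker
    return sio.read()  # everything from the marker to the end


print(text_from_marker(SAMPLE))
```

For a string that is already in memory, the slice body[pos:] gives the same result; the stream form is mainly useful if you want to keep iterating line by line afterwards.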
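Judging from this format, you need to read the text line by line.
If you encounter a line that starts with '---' (that is, line[:3] == '---'),
ignore it and the lines that follow it until you read an empty line.
If the next line starts with '---' again, repeat the process.
The first non-empty line after that should be "The following applicant...".
You can put this logic in an infinite loop and break out of it. Here is some pseudo-code: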
while True:
    line = read next line
    if length(line) == 0: continue
    if line[:3] == '---':
        # skip the forwarded header block up to the next empty line
        while True:
            line = read next line
            if length(line) == 0:
                break
    else:
        break
print line, then read and print everything from here
This assumes the read-line function keeps track of how many lines it has already read and which line it will read next.
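The line-by-line approach might look like the following in Python. This is a minimal sketch, assuming each forwarded header block starts with a line beginning with '---' and ends at a blank line; strip_forward_headers is a hypothetical helper name, and quoted headers (for example lines prefixed with '>') are not handled. A single iterator plays the role of the read-line function that remembers its position.

```python
def strip_forward_headers(body):
    """Drop leading '---' header blocks and return the remaining text."""
    lines = iter(body.splitlines())  # one shared iterator tracks the position
    for line in lines:
        if not line.strip():
            continue  # blank line between blocks
        if line.startswith("---"):
            # Consume the header block up to and including the next blank line.
            for header_line in lines:
                if not header_line.strip():
                    break
            continue
        # First non-empty, non-header line: keep it and everything after it.
        return "\n".join([line, *lines])
    return ""  # the body contained nothing but headers


# Made-up body with two nested forwarded headers, for illustration only.
body = (
    "---------- Forwarded message ---------\n"
    "From: Someone <Mail@mail.biz>\n"
    "To: <mail@mail.com>\n"
    "\n"
    "---------- Forwarded message ---------\n"
    "From: Someone <Mail@mail.biz>\n"
    "\n"
    "The following applicant filled out the form via Facebook.\n"
    "Some Guy\n"
)
print(strip_forward_headers(body))
```

Unlike searching for a fixed phrase, this does not depend on the wording of the first content line, but it does depend on the blank line after each header block being present in the decoded text.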