Extracting pdf attachment from IMAP account -- python 3.5.2
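OK, so I'm trying to save pdf attachments sent to a specific account to a specific network folder, but I'm stuck at the attachment part. I have the following code to pull in the unseen messages, but I'm not sure how to get the "parts" to stay intact. I think I could figure this out if I could work out how to keep the email message intact. I never make it past the "Made it to walk" output. All the test emails in this account include pdf attachments. Thanks in advance.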
import imaplib
import email
import regex
import re

user = 'some_user'
password = 'gimmeAllyerMoney'

server = imaplib.IMAP4_SSL('mail.itsstillmonday.com', '993')
server.login(user, password)
server.select('inbox')

msg_ids=[]
resp, messages = server.search(None, 'UNSEEN')
for message in messages[0].split():
    typ, data = server.fetch(message, '(RFC822)')
    msg= email.message_from_string(str(data[0][1]))
    #looking for 'Content-Type: application/pdf
    for part in msg.walk():
        print("Made it to walk")
        if part.is_multipart():
            print("made it to multipart")
        if part.get_content_maintype() == 'application/pdf':
            print("made it to content")
You can get the full content type with part.get_content_type() and the payload with part.get_payload(), like this:
for part in msg.walk():
    if part.get_content_type() == 'application/pdf':
        # When decode=True, get_payload will return None if part.is_multipart()
        # and the decoded content otherwise.
        payload = part.get_payload(decode=True)

        # Default filename can be passed as an argument to get_filename()
        filename = part.get_filename()

        # Save the file.
        if payload and filename:
            with open(filename, 'wb') as f:
                f.write(payload)
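Since the goal in the question is to drop the files into a specific folder, here is a minimal sketch of the same loop wrapped in a function that writes into a target directory. The names save_pdf_attachments, build_sample_message and the fallback name 'attachment.pdf' are made up for illustration and are not part of the email package. It assumes you hand it the raw bytes of one message (in the question's loop that would be data[0][1]) and parses them with email.message_from_bytes.

```python
import email
import os
import tempfile
from email.message import EmailMessage


def save_pdf_attachments(raw_bytes, target_dir):
    """Parse one raw RFC822 message and write its PDF parts into target_dir.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    msg = email.message_from_bytes(raw_bytes)
    saved = []
    for part in msg.walk():
        if part.get_content_type() != 'application/pdf':
            continue
        payload = part.get_payload(decode=True)
        # Use a default name if the part has none, and drop any directory
        # component so the file always lands inside target_dir.
        filename = os.path.basename(part.get_filename('attachment.pdf'))
        if payload and filename:
            path = os.path.join(target_dir, filename)
            with open(path, 'wb') as f:
                f.write(payload)
            saved.append(path)
    return saved


def build_sample_message():
    """Build a small test message with one PDF attachment (stand-in data)."""
    msg = EmailMessage()
    msg['Subject'] = 'test'
    msg['From'] = 'a@example.com'
    msg['To'] = 'b@example.com'
    msg.set_content('See attached.')
    msg.add_attachment(b'%PDF-1.4 fake', maintype='application',
                       subtype='pdf', filename='report.pdf')
    return msg.as_bytes()


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as folder:
        print(save_pdf_attachments(build_sample_message(), folder))
```

In the IMAP loop you would call it as save_pdf_attachments(data[0][1], some_folder), with some_folder being whatever network path you are saving to. Two messages carrying the same attachment name would overwrite each other here, so you may want to add the message id to the name.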
Note that, as tripleee pointed out, for a part with content type "application/pdf" you have:
>>> part.get_content_type()
"application/pdf"
>>> part.get_content_maintype()
"application"
>>> part.get_content_subtype()
"pdf"
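One more thing that probably explains why the question's loop never gets past "Made it to walk": on Python 3, imaplib returns the message body as bytes, and str() on a bytes object gives you its repr (the text b'...' with escaped newlines), not the decoded message. Parsing that string should leave you with a single non-multipart part, so walk() has nothing else to visit. The sketch below builds a stand-in message locally (the sample content is made up, no server involved) and compares the two ways of parsing it; content_types is just a helper name for this example.

```python
import email
from email.message import EmailMessage

# Stand-in for data[0][1]: the raw bytes of a message with a PDF attachment.
sample = EmailMessage()
sample['Subject'] = 'test'
sample.set_content('See attached.')
sample.add_attachment(b'%PDF-1.4 fake', maintype='application',
                      subtype='pdf', filename='report.pdf')
raw = sample.as_bytes()


def content_types(msg):
    """List the content type of every part that walk() visits."""
    return [part.get_content_type() for part in msg.walk()]


# What the question does: str(raw) is the repr "b'...'", not the message text.
from_str = email.message_from_string(str(raw))

# Parsing the bytes directly keeps the MIME structure intact.
from_bytes = email.message_from_bytes(raw)

print(content_types(from_str))
print(content_types(from_bytes))

# The main type alone is never 'application/pdf', so the question's
# comparison is always False even on a correctly parsed message.
pdf_part = list(from_bytes.walk())[-1]
print(pdf_part.get_content_maintype() == 'application/pdf')
```

So in the question's code, email.message_from_bytes(data[0][1]) together with the get_content_type() check should be enough to reach the pdf part.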