Downloading Excel Reports From a Secure Mail Center

Question

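I'm a new programmer who has been writing scripts to automate parts of my job.

Problem scope:
I receive a bi-monthly Excel report by email from an outside vendor. The vendor encrypts its mail with ZixMail, which my company does not use. As a result, I have to log in to a secure mail center website with my username and password to read these emails. I am trying to establish a connection to that server and download the attachment.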

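What I have tried:

Connecting to the "server" over IMAP (I'm not sure the website is even a mail server)
- I struck out a lot here, because I could never connect (if you have suggestions, please share)
Accessing the site over HTTP with a session
- I can connect to the site, but when I .get the file and .write it out, my Excel file comes back blank and corrupted.
- On the mail center website, clicking the link downloads the file automatically. I'm not sure why this has to be so challenging.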

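The source of the page you download the file from looks like this:
a rel="external" href="/s/attachment?name=Random Letters and Numbers=emdeon" title="File Title.xlsx"

The href looks nothing like a normal URL and, unlike most examples I have seen, it does not end in .xlsx or any other file extension.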
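For what it's worth, an href of this shape is a site-relative path with a query string, so the file name does not have to appear in it. A minimal sketch of turning it into an absolute URL; the host `securemail.example.com` and the `name` value are made up for illustration and are not the real mail center's:

```python
from urllib.parse import urljoin, urlsplit, parse_qs

# Hypothetical values: substitute the real page URL and href from the mail center.
PAGE_URL = "https://securemail.example.com/s/inbox"
HREF = "/s/attachment?name=ABC123=emdeon"


def resolve_attachment_url(page_url, href):
    """Resolve a site-relative href against the page it was found on."""
    return urljoin(page_url, href)


absolute = resolve_attachment_url(PAGE_URL, HREF)
# The query string carries the attachment identifier instead of a file extension.
query = parse_qs(urlsplit(absolute).query)
print(absolute)
print(query)
```

The absolute URL is what would go into URL_D; whether the server then returns the workbook still depends on being logged in.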

I guess I'm really just looking for any ideas, thoughts, or help toward a solution.

Here is my code for the HTTP connection:

import requests
import urllib.request
import shutil
import os

#Fill in your details here to be posted to the login form.
payload = {
    'em': 'Username',
    'passphrase': 'Password',
    'validationKey': 'Key'
}

#This reads your URL and returns if the file is downloadable
def is_downloadable(URL_D):
    h = requests.head(URL_D, allow_redirects=True)
    header = h.headers
    content_type = header.get('content-type')
    if 'text' in content_type.lower():
        return False
    if 'html' in content_type.lower():
        return False
    return True

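def download_file(URL_D):
    # Note: this is a plain requests.get call, not the logged-in session from Main()
    with requests.get(URL_D, stream=True) as r:
        r.raise_for_status()
        # The with block closes the file, so no explicit f.close() is needed
        with open(FileName, 'wb') as f:
            for chunk in r.iter_content(chunk_size=None):
                if chunk:
                    f.write(chunk)
    return FileName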

def Main():
    with requests.Session() as s:
        p = s.post(URL, data=payload, allow_redirects=True )
        print(is_downloadable(URL_D))
        download_file(URL_D)


if __name__ == '__main__':
    Path = "<path>"
    FileName = os.path.join(Path,"Testing File.xlsx")
    URL = 'login URL'
    URL_D = 'Attachment URL'
    Main()

is_downloadable(URL_D) returns False, and the Excel file comes back blank and corrupted.

Here is the code for my IMAP attempt:
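One possible explanation, not confirmed anywhere in this thread: is_downloadable and download_file call requests.head and requests.get directly, so the cookies set by s.post(...) are never sent and the server may be answering with an HTML login page instead of the workbook. A minimal sketch that reuses the logged-in session; fetch_attachment is a hypothetical helper name, and it assumes the site keeps the login state in cookies:

```python
def fetch_attachment(session, url, dest_path):
    """Download url through an already logged-in requests.Session.

    Refuses to save the response if it looks like a web page, which
    usually means the request was not authenticated.
    """
    with session.get(url, stream=True, allow_redirects=True) as resp:
        resp.raise_for_status()
        content_type = resp.headers.get("Content-Type", "").lower()
        if "html" in content_type or content_type.startswith("text/"):
            raise ValueError(
                f"Expected a file but got {content_type!r}; the login may not have worked"
            )
        with open(dest_path, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=8192):
                fh.write(chunk)
    return dest_path


# Intended use with the names from the question (requires the requests package):
#
#     with requests.Session() as s:
#         s.post(URL, data=payload, allow_redirects=True)
#         fetch_attachment(s, URL_D, FileName)
```

If this raises the ValueError, the POST probably did not log in (for example because the form needs a fresh validationKey scraped from the login page), and a browser-driven approach may be the simpler route.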

import email
import imaplib
import os 

class FetchEmail():

    connection = None
    error = None
    

    def __init__(self, mail_server, username, password):
        self.connection = imaplib.IMAP4_SSL(mail_server,port=993)
        self.connection.login(username, password)
        self.connection.select('inbox',readonly=False) # so we can mark mails as read

    def close_connection(self):
        """
        Close the connection to the IMAP server
        """
        self.connection.close()

    def save_attachment(self, msg, download_folder):

        att_path = "No attachment found."
        for part in msg.walk():
            if part.get_content_maintype() == 'multipart':
                continue
            if part.get('Content-Disposition') is None:
                continue

            filename = part.get_filename()
            # Skip parts that carry a Content-Disposition but no file name
            if not filename:
                continue
            att_path = os.path.join(download_folder, filename)

            if not os.path.isfile(att_path):
                fp = open(att_path, 'wb')
                fp.write(part.get_payload(decode=True))
                fp.close()
        return att_path

    def fetch_messages(self):
    
        emails = []
        (result, messages) = self.connection.search(None, "(ON 20-Nov-2020)")
        if result == "OK":
            # search() returns bytes, so split on whitespace instead of on a str
            for message in messages[0].split():
                try:
                    ret, data = self.connection.fetch(message,'(RFC822)')
                except Exception:
                    print ("No emails to read for date.")
                    self.close_connection()
                    exit()

                msg = email.message_from_bytes(data[0][1])
                if isinstance(msg, str) == False:
                    emails.append(msg)
                # Escape the backslash so the flag is sent literally as \Seen
                response, data = self.connection.store(message, '+FLAGS','\\Seen')

            return emails

        self.error = "Failed to retrieve emails."
        return emails

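def Main():
    p = FetchEmail(mail_server,username,password)
    messages = p.fetch_messages()
    # fetch_messages returns a list, so save attachments one message at a time
    for msg in messages:
        p.save_attachment(msg, download_folder)
    p.close_connection()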

if __name__ == "__main__":
    mail_server = "Server"
    username = "username"
    password = "password"
    download_folder = "<path>"  # Path was never defined in this script
    Main()

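Error message: TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

Even if I wrote the IMAP script wrong, I also tried making the IMAP connection from the command prompt and got the same result.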
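A timeout at connect time, rather than a login error, generally means nothing answered on that host and port at all, which would fit a web portal that simply does not offer IMAP (the thread does not confirm this). A small sketch for checking reachability before involving imaplib; can_connect is a hypothetical helper name, and 993 is the port used in the script below:

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures alike.
        return False


# Example, using the placeholder server name from the question:
#
#     print(can_connect("Server", 993))
```

If this returns False for the mail center's host on port 993, no change to the IMAP script will help, and going through the website is the remaining option.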

To summarize, what I'm looking for is advice and ideas on how to solve this problem. Thank you!

Answer 1

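For anyone who stumbles on this because of a similar issue (probably no one, since I have a really odd habit of making simple things complicated):

I was able to solve the problem by logging in to the website with the Selenium WebDriver and navigating with its "click" mechanism. This was the only way I could successfully download the reports.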

import time
import os
import re
import datetime
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options

today = datetime.date.today()
first = today.replace(day=1)
year = today.strftime('%Y')
month = today.strftime('%B')
lastMonth = (first - datetime.timedelta(days=1)).strftime('%b')


def Main():
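    chrome_options = Options()
    chrome_options.add_experimental_option("detach", True)
    # Placeholder: replace with the real path to the chromedriver executable.
    # executable_path and find_element_by_* are the Selenium 3-style API.
    s = Chrome(executable_path="path to chromedriver", options=chrome_options)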
    s.get("Website login page")
    s.find_element_by_id("loginname").send_keys('username')
    s.find_element_by_id("password").send_keys('password')
    s.find_element_by_class_name("button").click()
    for i in range(50):
        s.get("landing page post login")
        n = str(i)
        subject = ("mailsubject"+n)
        sent = ("mailsent"+n)
        title = s.find_element_by_id(subject).text
        date = s.find_element_by_id(sent).text
        regex = "Bi Monthly"
        regex_pr = "PR"
        match = re.search(regex,title)
        match_pr = re.search(regex_pr,title)
        if match and not match_pr:
            match_m = re.search(r"(\D{3})",date)
            match_d = re.search(r"(\d{1,2})",date)
            day = int(match_d.group())
            m = (match_m.group(1))
            if (day <= 15) and (m == lastMonth):
                print("All up to date files have been downloaded")
                break
            else:
                name = ("messageItem"+n)
                s.find_element_by_id(name).click()
                s.find_element_by_partial_link_text("xlsx").click()
        else:
            continue
    time.sleep(45)

if __name__ == "__main__":
    Main()
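The rule the loop applies to each inbox row can be pulled out into a plain function and checked without a browser. This is only a sketch of that rule as I read it from the code above: the helper names are hypothetical, and the sent-date format ("Nov 10, 2020") is an assumption, since the answer never shows what the page actually displays.

```python
import datetime
import re


def previous_month_abbrev(today):
    """Abbreviated name of the month before today's month, e.g. 'Nov'."""
    first = today.replace(day=1)
    return (first - datetime.timedelta(days=1)).strftime("%b")


def classify(title, sent, today):
    """Return 'skip', 'stop', or 'download' for one inbox row.

    skip:     not a "Bi Monthly" report, or a "PR" variant of it
    stop:     report from the first half of last month, so everything newer is done
    download: any other matching report
    """
    if not re.search("Bi Monthly", title) or re.search("PR", title):
        return "skip"
    month = re.search(r"(\D{3})", sent).group(1)      # first three non-digits
    day = int(re.search(r"(\d{1,2})", sent).group(1))  # first one or two digits
    if day <= 15 and month == previous_month_abbrev(today):
        return "stop"
    return "download"


today = datetime.date(2020, 12, 5)
print(classify("Bi Monthly Report", "Nov 20, 2020", today))
print(classify("Bi Monthly Report", "Nov 10, 2020", today))
```

Keeping the rule separate makes it easy to adjust the patterns if the mail center shows dates differently.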

python

email

excel

imap

http