Convert MSG email to PDF file in C#

I'm using GemBox.Email and GemBox.Document to convert emails to PDF.

This is my code:

static void Main()
{
    MailMessage message = MailMessage.Load("input.eml");
    DocumentModel document = new DocumentModel();

    if (!string.IsNullOrEmpty(message.BodyHtml))
        document.Content.LoadText(message.BodyHtml, LoadOptions.HtmlDefault);
    else
        document.Content.LoadText(message.BodyText, LoadOptions.TxtDefault);

    document.Save("output.pdf");
}

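The code works for EML files, but not for MSG files (both MailMessage.BodyHtml and MailMessage.BodyText are empty).

How can I make it work for MSG as well?

The problem occurs with specific MSG files that have no HTML content inside the RTF body; they contain a raw RTF body instead.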

The MailMessage class currently doesn't expose an API for the RTF body (only for the plain-text and HTML bodies). Nevertheless, you can retrieve it as an Attachment named "Body.rtf".
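Because the RTF body is identified only by its file name, a regular attachment that happens to be called "Body.rtf" would be picked up as well. The following is a minimal sketch of a guard you could run on the attachment's bytes first. It assumes the bytes are already plain (decompressed) RTF, and `RtfSniffSketch` and `LooksLikeRtf` are hypothetical names that are not part of GemBox.

```csharp
using System;
using System.Text;

static class RtfSniffSketch
{
    // RTF documents begin with the ASCII signature "{\rtf".
    static readonly byte[] RtfSignature = Encoding.ASCII.GetBytes(@"{\rtf");

    // Hypothetical helper: returns true when the data starts with the RTF signature.
    public static bool LooksLikeRtf(byte[] data)
    {
        if (data == null || data.Length < RtfSignature.Length)
            return false;

        for (int i = 0; i < RtfSignature.Length; i++)
        {
            if (data[i] != RtfSignature[i])
                return false;
        }

        return true;
    }

    static void Main()
    {
        Console.WriteLine(LooksLikeRtf(Encoding.ASCII.GetBytes(@"{\rtf1\ansi Hello}"))); // True
        Console.WriteLine(LooksLikeRtf(Encoding.ASCII.GetBytes("<html></html>")));       // False
    }
}
```

In the conversion code, such a check would be applied to the attachment's data before treating it as the email body.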

Also, just FYI, another problem you have is that the images in the email's HTML body are not inlined, so you would lose them when exporting to PDF.

Anyway, try the following:

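// Namespaces used by the code below.
using System.Linq;
using System.Text.RegularExpressions;
using GemBox.Document;
using GemBox.Email;
using GemBox.Email.Mime;

static void Main()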
{
    // Load an email (or retrieve it with POP or IMAP).
    MailMessage message = MailMessage.Load("input.msg");

    // Create a new document.
    DocumentModel document = new DocumentModel();

    // Import the email's body to the document.
    LoadBody(message, document);

    // Save the document as PDF.
    document.Save("output.pdf");
}

static void LoadBody(MailMessage message, DocumentModel document)
{
    if (!string.IsNullOrEmpty(message.BodyHtml))
    {
        var htmlOptions = LoadOptions.HtmlDefault;
        // Replace attached CID images with inlined data URLs.
        var htmlBody = ReplaceEmbeddedImages(message.BodyHtml, message.Attachments);
        // Load HTML body to the document.
        document.Content.End.LoadText(htmlBody, htmlOptions);
    }
    else if (message.Attachments.Any(a => a.FileName == "Body.rtf"))
    {
        var rtfAttachment = message.Attachments.First(a => a.FileName == "Body.rtf");
        var rtfOptions = LoadOptions.RtfDefault;
        // Get RTF body from the attachment.
        var rtfBody = rtfOptions.Encoding.GetString(rtfAttachment.Data.ToArray());
        // Load RTF body to the document.
        document.Content.End.LoadText(rtfBody, rtfOptions);
    }
    else
    {
        // Load TXT body to the document.
        document.Content.End.LoadText(message.BodyText, LoadOptions.TxtDefault);
    }
}

static string ReplaceEmbeddedImages(string htmlBody, AttachmentCollection attachments)
{
    var srcPattern =
        "(?<=<img.+?src=[\"'])" +
        "(.+?)" +
        "(?=[\"'].*?>)";

    // Iterate through the "src" attributes from HTML images in reverse order.
    foreach (var match in Regex.Matches(htmlBody, srcPattern, RegexOptions.IgnoreCase).Cast<Match>().Reverse())
    {
        var imageId = match.Value.Replace("cid:", "");
        Attachment attachment = attachments.FirstOrDefault(a => a.ContentId == imageId);

        if (attachment != null)
        {
            // Create inlined image data. E.g. "..."
            ContentEntity entity = attachment.MimeEntity;
            var embeddedImage = entity.Charset.GetString(entity.Content);
            var embeddedSrc = $"data:{entity.ContentType};{entity.TransferEncoding},{embeddedImage}";

            // Replace the "src" attribute with the inlined image.
            htmlBody = $"{htmlBody.Substring(0, match.Index)}{embeddedSrc}{htmlBody.Substring(match.Index + match.Length)}";
        }
    }

    return htmlBody;
}
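To make the CID replacement step easier to follow in isolation, here is a minimal, self-contained sketch of the same idea that uses only the .NET base class library. It is not the GemBox API: the `images` dictionary is a hypothetical stand-in for the email's attachments (content ID mapped to MIME type and raw bytes), and `CidInliningSketch` and `InlineCidImages` are illustrative names. Unlike the code above, it assumes raw image bytes are available and Base64-encodes them explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CidInliningSketch
{
    // Matches src="cid:..." (or single-quoted) inside an <img> tag and captures the content ID.
    static readonly Regex CidSrc = new Regex(
        "(?<prefix><img\\b[^>]*?\\bsrc\\s*=\\s*)(?<quote>[\"'])cid:(?<id>[^\"']+)\\k<quote>",
        RegexOptions.IgnoreCase);

    // 'images' is a hypothetical stand-in for the email's attachments:
    // content ID -> (MIME type, raw image bytes).
    public static string InlineCidImages(
        string html,
        IReadOnlyDictionary<string, (string MimeType, byte[] Data)> images)
    {
        return CidSrc.Replace(html, match =>
        {
            // Leave the tag untouched when no image has this content ID.
            if (!images.TryGetValue(match.Groups["id"].Value, out var image))
                return match.Value;

            var quote = match.Groups["quote"].Value;
            var dataUrl = $"data:{image.MimeType};base64,{Convert.ToBase64String(image.Data)}";
            return $"{match.Groups["prefix"].Value}{quote}{dataUrl}{quote}";
        });
    }

    static void Main()
    {
        var images = new Dictionary<string, (string MimeType, byte[] Data)>
        {
            ["logo@example"] = ("image/png", new byte[] { 1, 2, 3 })
        };

        var html = "<p>Hi</p><img alt=\"logo\" src=\"cid:logo@example\">";

        // Prints: <p>Hi</p><img alt="logo" src="data:image/png;base64,AQID">
        Console.WriteLine(InlineCidImages(html, images));
    }
}
```

Using `Regex.Replace` with a match evaluator avoids the manual index bookkeeping, so the matches don't need to be processed in reverse order.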

For more information (for example, how to add the email's headers and attachments to the output PDF), see the Convert Email to PDF example.