Emails' rich characters are mistranslated when read from database using MimeKit

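I have a Windows service written in VB.NET that downloads emails into MimeMessage objects, removes their attachments, and then writes the rest of each email to a SQL Server database. A separate ASP.NET application (also in VB.NET) reads the emails back into MimeMessage objects and returns them to the user on request.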

Somewhere in this process, something happens that causes strange characters to appear in the output.

A related question looked promising, but changing the character encoding from ASCII to UTF8 and so on did not fix it.

Here is the code that saves the email to the database:

Sub ImportEmail(exConnectionString As String)
    Dim oClient As New Pop3Client()
    ' … email connection code removed …
    Dim message = oClient.GetMessage(0)
    Dim strippedMessage As MimeMessage = message
    ' … code to remove attachments removed …
    Dim mem As New MemoryStream
    strippedMessage.WriteTo(mem)
    Dim bytes = mem.ToArray
    Dim con As New SqlConnection(exConnectionString)
    con.Open()
    Dim com As New SqlCommand("INSERT INTO Emails (Body) VALUES (@RawDocument)", con)
    com.CommandType = CommandType.Text
    com.Parameters.AddWithValue("@RawDocument", bytes)
    com.ExecuteNonQuery()
    con.Close()
End Sub

Here is the ASP.NET code that reads it back for the user:


Private Sub OutputEmail(exConnectionString As String)
    Dim BlobString As String = ""
    Dim Sql As String = "SELECT Body FROM Emails WHERE Id = @id"    
    Dim com As New SqlClient.SqlCommand(Sql)
    com.CommandType = CommandType.Text
    com.Parameters.AddWithValue("@id", ViewState("email_id")) 

    Dim con As New SqlConnection(exConnectionString)
    con.Open()
    com.Connection = con
    Dim da As New SqlClient.SqlDataAdapter(com)
    Dim dt As New DataTable()
    da.Fill(dt)
    con.Close()

    If dt.Rows.Count > 0 Then
        Dim Row = dt.Rows(0)
        BlobString = Row(0).ToString()

        Dim MemStream As MemoryStream = GetMemoryStreamFromASCIIEncodedString(BlobString)
        Dim message As MimeMessage = MimeMessage.Load(MemStream)

        BodyBuilder.HtmlBody = message.HtmlBody
        BodyBuilder.TextBody = message.TextBody
        message.Body = BodyBuilder.ToMessageBody()

        Response.ContentType = "message/rfc822"
        Response.AddHeader("Content-Disposition", "attachment;filename=""" & Left(message.Subject, 35) & ".eml""")
        Response.Write(message)
        Response.End()
    End If
End Sub

Private Function GetMemoryStreamFromASCIIEncodedString(ByVal BlobString As String) As MemoryStream
    Dim BlobStream As Byte() = Encoding.ASCII.GetBytes(BlobString) ' **
    Dim MemStream As MemoryStream = New MemoryStream()
    MemStream.Write(BlobStream, 0, BlobStream.Length)
    MemStream.Position = 0
    Return MemStream
End Function

For example, suppose the following text appears in the original email:

“So long and thanks for all the fish” (fancy quotes)

When read back, it appears as follows:

â€œSo long and thanks for all the fishâ€

Other characters are replaced as follows:

– (long dash) becomes â€“

• (bullet) becomes â€¢

The problem is in the following snippet:

If dt.Rows.Count > 0 Then
    Dim Row = dt.Rows(0)
    BlobString = Row(0).ToString() ' <-- the ToString() is the problem

    Dim MemStream As MemoryStream = GetMemoryStreamFromASCIIEncodedString(BlobString)
    Dim message As MimeMessage = MimeMessage.Load(MemStream)
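To see why a trip through a String is lossy, here is a minimal, standalone sketch. It does not use MimeKit or the database; it only shows that bytes, once decoded to a string and re-encoded as ASCII, do not come back the same. The ISO-8859-1 decoding step is an assumption chosen for illustration, and the exact garbage you see in practice depends on which encodings are actually involved (the â€œ pattern is what UTF-8 bytes look like when read as Windows-1252).

```vbnet
Imports System
Imports System.Linq
Imports System.Text

Module RoundTripDemo
    Sub Main()
        ' Curly quotes are built with ChrW because VB treats them as string delimiters in source code
        Dim quoted As String = ChrW(&H201C) & "fish" & ChrW(&H201D)
        Dim original As Byte() = Encoding.UTF8.GetBytes(quoted)

        ' Decode with a single-byte encoding (ISO-8859-1 here, an assumption for illustration)
        Dim asString As String = Encoding.GetEncoding(28591).GetString(original)

        ' Re-encode as ASCII, as GetMemoryStreamFromASCIIEncodedString does
        Dim roundTripped As Byte() = Encoding.ASCII.GetBytes(asString)

        Console.WriteLine(BitConverter.ToString(original))      ' E2-80-9C-66-69-73-68-E2-80-9D
        Console.WriteLine(BitConverter.ToString(roundTripped))  ' 3F-3F-3F-66-69-73-68-3F-3F-3F
        Console.WriteLine(original.SequenceEqual(roundTripped)) ' False
    End Sub
End Module
```

Each curly quote takes three bytes in UTF-8, and all three are outside the ASCII range, so the ASCII encoder replaces them with question marks. Keeping the data as a Byte() from the database all the way to the parser avoids the problem entirely.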

To fix the data corruption, read the column as raw bytes instead:

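If dt.Rows.Count > 0 Then
    Dim Row = dt.Rows(0)
    ' Read the column as raw bytes (assumes Body is a binary column such as varbinary(max))
    Dim BlobBytes As Byte() = DirectCast(Row(0), Byte())

    ' Wrap the bytes in a read-only stream; no string conversion is involved
    Dim MemStream As New MemoryStream(BlobBytes, False)
    Dim message As MimeMessage = MimeMessage.Load(MemStream)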

You can also delete the GetMemoryStreamFromASCIIEncodedString function.

(Note: I don't know VB.NET, so I'm just guessing at the syntax, but it should be close to correct.)
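The same reasoning may apply on the output side: Response.Write(message) converts the message to a string via ToString(), which, as far as I understand, is meant for debugging rather than serialization. A rough sketch of writing the bytes directly instead is below. WriteEmail and the EmailOutput module are hypothetical names introduced only for this example; MimeMessage.Load and MimeMessage.WriteTo are the MimeKit calls doing the work.

```vbnet
Imports System.IO
Imports MimeKit

Module EmailOutput
    ' Hypothetical helper: parses the stored bytes and writes the message to any output stream
    Sub WriteEmail(rawBytes As Byte(), output As Stream)
        Using input As New MemoryStream(rawBytes, False)
            Dim message As MimeMessage = MimeMessage.Load(input)
            ' WriteTo serializes the message as bytes, so no string conversion is involved
            message.WriteTo(output)
        End Using
    End Sub
End Module
```

In the ASP.NET page, this would be called with the byte array read from the database and Response.OutputStream, in place of Response.Write(message).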