How to decode a string in Ruby

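I'm using the Mandrill inbound email API, and when an email has an attachment whose file name contains one or more spaces, the file name arrives encoded in a format I don't know how to decode.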

Here is an example of a file name string I receive: =?UTF-8?B?TWlzc2lvbmFyecKgRmFpdGjCoFByb21pc2XCoGFuZMKgQ2FzaMKgUmVjZWlwdHPCoFlURMKgMjUzNQ==?= =?UTF-8?B?OTnCoEp1bHktMjAxNS5jc3Y=?=

I tried Base64.decode64(#{encoded_value}), but that did not return readable text.

How do I decode that value into a readable string?

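Thanks to a comment from @Yevgeniy-Anfilofyev, who pointed me in the right direction, I was able to write the following method, which correctly parses the encoded value and returns an ASCII string.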

require "base64"

def self.decode(value)
  # The value is made up of multiple encoded parts, so we first
  # split it into parts and decode each one separately.
  encoded_parts = value.split('=?UTF-8?B?').
                        map { |x| x.sub(/\?.*$/, '') }.  # drop the trailing "?=" and anything after it
                        reject { |x| x.strip.empty? }    # plain-Ruby stand-in for Rails' blank?

  encoded_parts.map { |x| Base64.decode64(x) }. # decode each part
                join('').                       # join the parts together
                force_encoding('utf-8').        # mark the raw bytes as UTF-8
                gsub("\u00A0", " ")             # replace non-breaking spaces (bytes C2 A0) with ASCII spaces
end
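
To see why the final gsub is needed, it may help to decode a single part by hand. The sketch below takes the second encoded part of the file name from the question and checks which kind of space it contains; the variable names are only illustrative.

```ruby
require "base64"

# Second encoded part of the file name from the question.
part = "OTnCoEp1bHktMjAxNS5jc3Y="

# Base64.decode64 returns raw bytes, so label them as UTF-8 first.
decoded = Base64.decode64(part).force_encoding("UTF-8")

# The "space" is U+00A0 (NO-BREAK SPACE), stored as the bytes C2 A0,
# not the ASCII space 0x20.
has_nbsp = decoded.include?("\u00A0")
has_ascii_space = decoded.include?(" ")

# Swapping it for an ASCII space gives the readable name.
cleaned = decoded.gsub("\u00A0", " ")
puts cleaned
```

So the spaces in the original file name appear to have been sent as non-breaking spaces, which is why the text had to be Base64-encoded in the first place.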

This is MIME encoded-word syntax, as defined in RFC 2047. From Wikipedia:

The form is: "=?charset?encoding?encoded text?=".

  • charset may be any character set registered with IANA. Typically it would be the same charset as the message body.
  • encoding can be either "Q" denoting Q-encoding that is similar to the quoted-printable encoding, or "B" denoting base64 encoding.
  • encoded text is the Q-encoded or base64-encoded text.
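
To make the format concrete, here is a minimal sketch of a decoder that uses only core Ruby. The helper name decode_encoded_words and the ENCODED_WORD pattern are hypothetical, introduced just for illustration, and the sketch assumes the charset is one Ruby knows (an unknown charset raises ArgumentError). It is not a complete implementation of the RFC.

```ruby
# Matches one encoded word: =?charset?encoding?encoded text?=
ENCODED_WORD = /=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=/

# Hypothetical helper: decodes every encoded word in +value+ to UTF-8
# and leaves any plain text around the encoded words untouched.
def decode_encoded_words(value)
  # Whitespace that only separates two adjacent encoded words is dropped.
  joined = value.gsub(/(\?=)\s+(?==\?)/, '\1')

  joined.gsub(ENCODED_WORD) do
    charset, encoding, text = $1, $2, $3
    bytes =
      if encoding.upcase == "B"
        text.unpack1("m")                  # base64, the same decoding Base64.decode64 performs
      else
        text.tr("_", " ").unpack1("M")     # Q: "_" means space, "=XX" is a hex-encoded byte
      end
    # Label the raw bytes with the declared charset, then convert to UTF-8.
    bytes.force_encoding(charset).encode("UTF-8")
  end
end

file_name = decode_encoded_words(
  "=?UTF-8?B?TWlzc2lvbmFyecKgRmFpdGjCoFByb21pc2XCoGFuZMKgQ2FzaMKgUmVjZWlwdHPCoFlURMKgMjUzNQ==?= " \
  "=?UTF-8?B?OTnCoEp1bHktMjAxNS5jc3Y=?="
)
puts file_name.gsub("\u00A0", " ")

puts decode_encoded_words("=?ISO-8859-1?Q?caf=E9_au_lait?=")
```

The decoded file name still contains non-breaking spaces (U+00A0), so the example replaces them before printing.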

Fortunately, you don't need to write a decoder for this. The Mail gem comes with a Mail::Encodings.value_decode method that works perfectly and is very well-tested:

subject = "=?UTF-8?B?TWlzc2lvbmFyecKgRmFpdGjCoFByb21pc2XCoGFuZMKgQ2FzaMKgUmVjZWlwdHPCoFlURMKgMjUzNQ==?= =?UTF-8?B?OTnCoEp1bHktMjAxNS5jc3Y=?="
Mail::Encodings.value_decode(subject)
# => "Missionary Faith Promise and Cash Receipts YTD 253599 July-2015.csv"

It gracefully handles many edge cases you might not think of (until your app tries to handle them and fails):

subject = "Re:[=?iso-2022-jp?B?GyRCJTAlayE8JV0lcyEmJTglYyVRJXMzdDwwMnEbKEI=?=\n =?iso-2022-jp?B?GyRCPFIbKEI=?=] =?iso-2022-jp?B?GyRCSlY/LiEnGyhC?=\n  =?iso-2022-jp?B?GyRCIVolMCVrITwlXSVzIVskKkxkJCQ5ZyRvJDsbKEI=?=\n =?iso-2022-jp?B?GyRCJE43byRLJEQkJCRGIUolaiUvJSglOSVIGyhC?=#1056273\n =?iso-2022-jp?B?GyRCIUsbKEI=?="
Mail::Encodings.value_decode(subject)
# => "Re:[グルーポン・ジャパン株式会社] 返信:【グルーポン】お問い合わせの件について(リクエスト#1056273\n )"

If you're using Rails, you already have the Mail gem. Otherwise, just add gem "mail" to your Gemfile, run bundle install, and add require "mail" to your script.