How to decode raw-format Gmail API messages to extract links

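I'm trying to fetch a Gmail message by its message ID. I retrieve the message in raw format, base64-decode it, and then try to use the links contained in the email. The message appears to be decoded, but it contains stray bytes and the URLs are still invalid.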

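import base64

from googleapiclient import errors  # provides HttpError


def GetMessageWithId(service, user_id, msg_id, format):
    try:
        message = service.users().messages().get(userId=user_id,
                                                 id=msg_id,
                                                 format=format).execute()
        # Note: str() on a bytes object returns its repr (b'...'), so CR LF
        # shows up in msg_str as the literal characters \r\n.
        msg_str = str(base64.urlsafe_b64decode(message["raw"].encode("ASCII")))
        return msg_str
    except errors.HttpError as error:
        print("An error occurred: %s" % error)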

When I inspect msg_str, I can see where the links are, but if I copy them and paste them into a browser, they don't work.

I then tried using Beautiful Soup to locate the href attributes in msg_str. However, the links it finds look like this:

3D"https://post.pinterest=\r\n.com/f/a/WRi5L7G_wfTW1BovkyUGuw~~/AAAAAQA~/RgRe6WEYPwRXCXBpbnRlcmVzdEIKABwY=\r\n3AZdrwvFllIXdHVyZ2VvbmNocmlzM0BnbWFpbC5jb21YBAAAAAA~?target=3Dhttps%3A%2F%2=\r\nFwww.pinterest.com%2Fsecure%2Fautologin%2F%3Fod%3DFux7G1fLpQxdgu%252FAlq7%2=\r\n52FO0wnXhG3mrIvODBVUav9ko5yjUdnc84zWzwWN%252BPJyxYElh86K0WCnm9Th%252F6kUWW%=\r\n252FfcKmC7yJz0qo50Ss4EaaUahZGfo19MQS%252BIeP4Dlvz0hgCjvxIS4R%252BPMAF%252FG=\r\nl9BpWrQ%253D%253D%26user_id%3DNjEwMDk3MjE4MTc4OTE0MjA0%26next%3D%252Fpin%25=\r\n2F806707351985179613%252F%253Futm_campaign%253Dpopular_pins%2526e_t%253De5a=\r\nb90da0abf493b944b3c27261acfe3%2526utm_content%253D806707351985179613%2526ut=\r\nm_source%253D31%2526utm_term%253D1%2526utm_medium%253D2012

I expected the whole raw email to decode to HTML, but only part of it seems to. For reference, here is the Gmail documentation for getting a message: https://developers.google.com/gmail/api/v1/reference/users/messages/get

This looks like quoted-printable encoded text that has also been URL-encoded.

import quopri
from urllib import parse

s = 'href%3D"https://post.pinterest=\r\n.com/f/a/WRi5L7G_wfTW1BovkyUGuw~~/AAAAAQA~/RgRe6WEYPwRXCXBpbnRlcmVzdEIKABwY=\r\n3AZdrwvFllIXdHVyZ2VvbmNocmlzM0BnbWFpbC5jb21YBAAAAAA~?target=3Dhttps%3A%2F%2=\r\nFwww.pinterest.com%2Fsecure%2Fautologin%2F%3Fod%3DFux7G1fLpQxdgu%252FAlq7%2=\r\n52FO0wnXhG3mrIvODBVUav9ko5yjUdnc84zWzwWN%252BPJyxYElh86K0WCnm9Th%252F6kUWW%=\r\n252FfcKmC7yJz0qo50Ss4EaaUahZGfo19MQS%252BIeP4Dlvz0hgCjvxIS4R%252BPMAF%252FG=\r\nl9BpWrQ%253D%253D%26user_id%3DNjEwMDk3MjE4MTc4OTE0MjA0%26next%3D%252Fpin%25=\r\n2F806707351985179613%252F%253Futm_campaign%253Dpopular_pins%2526e_t%253De5a=\r\nb90da0abf493b944b3c27261acfe3%2526utm_content%253D806707351985179613%2526ut=\r\nm_source%253D31%2526utm_term%253D1%2526utm_medium%253D2012'

parse.unquote_plus(quopri.decodestring(s).decode('utf-8'))
'href="https://post.pinterest.com/f/a/WRi5L7G_wfTW1BovkyUGuw~~/AAAAAQA~/RgRe6WEYPwRXCXBpbnRlcmVzdEIKABwY3AZdrwvFllIXdHVyZ2VvbmNocmlzM0BnbWFpbC5jb21YBAAAAAA~?target=https://www.pinterest.com/secure/autologin/?od=Fux7G1fLpQxdgu%2FAlq7%2FO0wnXhG3mrIvODBVUav9ko5yjUdnc84zWzwWN%2BPJyxYElh86K0WCnm9Th%2F6kUWW%2FfcKmC7yJz0qo50Ss4EaaUahZGfo19MQS%2BIeP4Dlvz0hgCjvxIS4R%2BPMAF%2FGl9BpWrQ%3D%3D&user_id=NjEwMDk3MjE4MTc4OTE0MjA0&next=%2Fpin%2F806707351985179613%2F%3Futm_campaign%3Dpopular_pins%26e_t%3De5ab90da0abf493b944b3c27261acfe3%26utm_content%3D806707351985179613%26utm_source%3D31%26utm_term%3D1%26utm_medium%3D2012'
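If the goal is to pull links out of the whole message rather than fix up one string, it may be simpler to let the standard library's email package undo the transfer encoding for each MIME part. The following is only a sketch under some assumptions: the message contains at least one text/html part, and the names LinkCollector and extract_links are made up for illustration (they are not part of the Gmail API). It also avoids calling str() on the decoded bytes, since that returns the b'...' repr with literal \r\n sequences.

```python
import base64
from email import message_from_bytes, policy
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(raw_b64url):
    """Return the hrefs found in the text/html parts of a raw Gmail message.

    raw_b64url is assumed to be the value of message["raw"] as returned
    by users().messages().get(..., format="raw").
    """
    # The "raw" field is base64url text; decoding gives the RFC 2822 bytes.
    raw_bytes = base64.urlsafe_b64decode(raw_b64url)
    msg = message_from_bytes(raw_bytes, policy=policy.default)

    collector = LinkCollector()
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content() reverses the Content-Transfer-Encoding
            # (e.g. quoted-printable) and decodes using the part's charset.
            collector.feed(part.get_content())
    return collector.links
```

With this approach the soft line breaks (=\r\n) and =3D sequences are removed before the HTML is parsed. The percent-encoded target parameter is left as it appears in the href, which is usually what a browser needs; apply urllib.parse.unquote only if you want the inner URL itself.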