How to write Python to get all Gmail message IDs?

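I want to list all message IDs from a Gmail account using the Gmail API. So far I have been able to list the first and second pages of message IDs. I know I have to use pageToken to get the next page of results, but I don't know how to restructure my code so that I'm not using variables numbered 1, 2, 3, etc. to request each page. The source code is below.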

get_email_ids.py:

from __future__ import print_function
import os.path
from collections import Counter
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']

def main():
    """Shows basic usage of the Gmail API.
    """
    creds = None
    user_id = "me"
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(creds.to_json())

    service = build('gmail', 'v1', credentials=creds)

    ### Call the Gmail API

    ### Show messages

    token = ''
    messages = service.users().messages().list(userId=user_id,pageToken=token).execute().get('messages', [])
    token = service.users().messages().list(userId=user_id,pageToken=token).execute().get('nextPageToken', [])
    print(messages,token)

    messages2 = service.users().messages().list(userId=user_id,pageToken=token).execute().get('messages', [])
    token2 = service.users().messages().list(userId=user_id,pageToken=token).execute().get('nextPageToken', [])
    print(messages2,token2) 


if __name__ == '__main__':
    main()

Output of get_email_ids.py (shortened):

[{'id': '179ed5ae720de1f6', 'threadId': '179ed5ae720de1f6'}, ... {'id': '179ba226644a079a', 'threadId': '17972318184138fa'}] 09573475999783117733
[{'id': '179b9f8852d3b09d', 'threadId': '179b9f8852d3b09d'}, ... {'id': '1797fa390caa3454', 'threadId': '1797fa390caa3454'}] 07601624978802434502

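I can't test it, but I would reuse the same variables `messages` and `token`, without the 1, 2, 3 suffixes, and add each page of results to a single list that holds all messages. I would run this in a loop.

Something like this: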

all_messages = []
token = None  # no page token for the first request

while True:
    # Make one request per page and read both fields from the same response.
    response = service.users().messages().list(userId=user_id, pageToken=token).execute()
    messages = response.get('messages', [])
    token = response.get('nextPageToken')
    print(messages, token)

    all_messages += messages  # `extend` or `+=`, not `append`

    # Stop on an empty page or when no next-page token comes back.
    if not messages or not token:
        break
    

I just don't know how the API signals that there are no more messages: it might return an empty list, it might give an empty token, or it might raise an error.


EDIT:

For other users' information, as @emmalynnh mentioned in the comments:

When there are no more messages it gives an empty token 
and the API will return a 400 error if you try to request.
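
The stop condition described in that comment can be tried without touching a real mailbox. The following is only a sketch: `iter_message_ids` and `FakeGmailService` are hypothetical names introduced here, and the fake class is a stand-in that mimics the call chain `users().messages().list(...).execute()`. It assumes the last page simply has no `nextPageToken` key, and that passing `pageToken=None` is acceptable for the first request (the Python client drops arguments whose value is None).

```python
def iter_message_ids(service, user_id="me", page_size=500):
    """Yield every message ID, requesting one page at a time."""
    token = None  # None: no pageToken is sent on the first request
    while True:
        response = service.users().messages().list(
            userId=user_id, pageToken=token, maxResults=page_size
        ).execute()
        for message in response.get("messages", []):
            yield message["id"]
        token = response.get("nextPageToken")
        if not token:  # last page: the response carries no nextPageToken
            return


# --- Hypothetical offline stand-in, NOT part of the Gmail API client ---
class _FakeRequest:
    def __init__(self, response):
        self._response = response

    def execute(self):
        return self._response


class FakeGmailService:
    """Serves canned pages keyed by page token."""

    def __init__(self, pages):
        self._pages = pages  # {page token or None: response dict}

    def users(self):
        return self

    def messages(self):
        return self

    def list(self, userId, pageToken=None, maxResults=None):
        return _FakeRequest(self._pages[pageToken])


fake = FakeGmailService({
    None: {"messages": [{"id": "a1", "threadId": "a1"}], "nextPageToken": "p2"},
    "p2": {"messages": [{"id": "b2", "threadId": "b2"}]},  # no token: last page
})
ids = list(iter_message_ids(fake))
print(ids)  # ['a1', 'b2']
```

With a real `service` object from `build('gmail', 'v1', credentials=creds)`, the same generator should work unchanged. Because the loop never sends a request after the token runs out, it avoids the 400 error mentioned above.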

An improved version can be built on @furas's answer:

all_messages = []
token = None  # no page token for the first request
service_messages = service.users().messages()  # build the resource once, outside the loop

while True:
    # One request per page; both fields come from the same response.
    response = service_messages.list(userId=user_id, pageToken=token).execute()
    all_messages += response.get('messages', [])
    token = response.get('nextPageToken')
    if not token:  # no token means this was the last page
        break

print(all_messages)
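
As an alternative to tracking the token by hand, the Google API Python client also provides a `list_next()` helper on paginated collections. The sketch below assumes a `service` object built as in the question; `list_all_messages` is a name made up for this example.

```python
def list_all_messages(service, user_id="me"):
    """Collect every {'id', 'threadId'} entry by following list_next()."""
    messages_api = service.users().messages()
    request = messages_api.list(userId=user_id)
    all_messages = []
    while request is not None:
        response = request.execute()
        all_messages.extend(response.get("messages", []))
        # list_next() builds the request for the following page, or
        # returns None when the response has no nextPageToken.
        request = messages_api.list_next(request, response)
    return all_messages
```

To get only the IDs the question asks for, something like `ids = [m['id'] for m in list_all_messages(service)]` should be enough.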