How to extract all YouTube comments using YouTube API? (Python)

Question

Suppose I have a video_id whose video has 8487 comments.
This code returns only 4309 comments.

from googleapiclient.discovery import build

def get_comments(youtube, video_id, comments=[], token=''):
    video_response = youtube.commentThreads().list(part='snippet',
                                                   videoId=video_id,
                                                   pageToken=token).execute()
    for item in video_response['items']:
        comment = item['snippet']['topLevelComment']
        text = comment['snippet']['textDisplay']
        comments.append(text)
    if "nextPageToken" in video_response:
        return get_comments(youtube, video_id, comments, video_response['nextPageToken'])
    else:
        return comments

youtube = build('youtube', 'v3', developerKey=api_key)
comment_threads = get_comments(youtube, video_id)
print(len(comment_threads))

> 4309

How can I extract all 8487 comments?

Answer 1

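According to the commentThreads response, you have to add replies to the part parameter in order to retrieve the replies the comments might have.

So your request should look like this: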

video_response=youtube.commentThreads().list(part='id,snippet,replies',
                                               videoId=video_id,
                                               pageToken=token).execute()

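Then modify your code accordingly to read the replies of each comment.

In this example I made, using the try-it feature available in the documentation, you can check that the response contains the top-level comments along with their replies.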
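
As a minimal sketch of what reading the replies could look like, assuming each item of the response carries an optional replies.comments list (the helper name extract_texts is hypothetical and not part of the API client). The replies included inline in a commentThreads response may be only a subset of all replies, so this alone may not reach the full count:

```python
def extract_texts(video_response):
    """Collect the text of each top-level comment and of any replies included inline."""
    texts = []
    for item in video_response.get('items', []):
        top = item['snippet']['topLevelComment']
        texts.append(top['snippet']['textDisplay'])
        # 'replies' is only present when the thread has replies
        # and the request's part parameter includes 'replies'.
        for reply in item.get('replies', {}).get('comments', []):
            texts.append(reply['snippet']['textDisplay'])
    return texts
```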

Edit (08/04/2022):

Create a new variable that holds the totalReplyCount that the topLevelComment might have.

Something like:

def get_comments(youtube, video_id, comments=None, token=''):
    # Use None instead of a mutable default so repeated calls do not share one list.
    if comments is None:
        comments = []

    video_response = youtube.commentThreads().list(part='snippet',
                                                   videoId=video_id,
                                                   pageToken=token).execute()
    for item in video_response['items']:
        comment = item['snippet']['topLevelComment']
        text = comment['snippet']['textDisplay']

        # Always keep the top-level comment itself.
        comments.append(text)

        # Total number of replies this top-level comment has.
        totalReplyCount = item['snippet']['totalReplyCount']

        # If there are replies, call "getAllTopLevelCommentReplies(topCommentId, replies, token)"
        # with a fresh list and extend "comments" with the returned replies.
        if totalReplyCount > 0:
            comments.extend(getAllTopLevelCommentReplies(comment['id'], [], None))

    if "nextPageToken" in video_response:
        return get_comments(youtube, video_id, comments, video_response['nextPageToken'])
    else:
        return comments

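Then, if the value of totalReplyCount is greater than zero, make another call using comments.list to retrieve the replies of the top-level comment. For this new call, you have to pass the id of the top-level comment.

Example (untested):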

# Returns all replies of a top-level comment.
# topCommentId = id of the top-level comment whose replies you want to retrieve.
# replies = list of replies accumulated and returned by this function.
# token = there may be more than 100 replies; if so, use nextPageToken to retrieve the next batch of results.
# Note: this function relies on the module-level "youtube" client built earlier.
def getAllTopLevelCommentReplies(topCommentId, replies, token):
    replies_response = youtube.comments().list(part='snippet',
                                               maxResults=100,
                                               parentId=topCommentId,
                                               pageToken=token).execute()

    for item in replies_response['items']:
        # Append the reply's text to the list of replies.
        replies.append(item['snippet']['textDisplay'])

    if "nextPageToken" in replies_response:
        return getAllTopLevelCommentReplies(topCommentId, replies, replies_response['nextPageToken'])
    else:
        return replies
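
The recursive approach makes one nested call per page. As an alternative sketch (not the code from this answer), the same pagination can be written as a loop, which avoids deep recursion on videos with many pages. Here fetch_page is a hypothetical callable that takes a page token (or None for the first page) and returns one response dict, so the loop can be tried without network access:

```python
def collect_all_pages(fetch_page):
    """Follow nextPageToken until the last page and return every item seen."""
    items = []
    token = None
    while True:
        response = fetch_page(token)
        items.extend(response.get('items', []))
        token = response.get('nextPageToken')
        if not token:
            return items


# Hypothetical usage with the replies call (requires a built "youtube" client
# and a real top-level comment id in topCommentId):
# replies = collect_all_pages(
#     lambda token: youtube.comments().list(part='snippet', maxResults=100,
#                                           parentId=topCommentId,
#                                           pageToken=token).execute())
```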

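Edit (11/04/2022):

I added a Google Colab example that I modified based on your code. It works with my sample video (ouf0ozwnU84), for which it returned 130 comments. However, with your sample video (BaGgScV4NN8) I got 3300 out of 3359.

This could be because some comments are under approval/moderation, or because of something else I am missing. The comments might also be too old and need additional filters, or the API might have a bug: see here some other questions related to pagination troubles with the API. I suggest you check this tutorial, which shows code that you can adapt.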
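
To narrow down where the missing comments go, one option is to compare how many comments the thread metadata promises with how many were actually collected. A minimal sketch, assuming items is the list of commentThreads items fetched with part='snippet' (the function name expected_comment_count is hypothetical):

```python
def expected_comment_count(items):
    """Sum each thread's top-level comment plus its reported totalReplyCount."""
    return sum(1 + item['snippet'].get('totalReplyCount', 0) for item in items)
```

If this figure already falls short of the count shown on the video page, that would suggest the gap comes from threads the API did not return, not from the reply pagination.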
