YouTube Data API: get all comments using pageToken
I am trying to get all comments for a video using pageToken.
Here is my code:
import requests
import json
link = 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&maxResults=100&videoId={videoId}&key={key}'
link_pageToken = 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&maxResults=100&pageToken={pageToken}&videoId={videoId}&key={key}'
key ='...'
videoId = 'ydDn_TFkzi4'
comments = []
data = requests.get(link.format(videoId = videoId, key = key)).json()
for i in range(len(data['items'])):
    comments.append(data['items'][i]['snippet']['topLevelComment']['snippet']['textOriginal'])
while 'nextPageToken' in data:
    data = requests.get(link_pageToken.format(videoId = videoId, key = key, pageToken = data['nextPageToken']))
    data = data.json()
    for i in range(len(data['items'])):
        comments.append(data['items'][i]['snippet']['topLevelComment']['snippet']['textOriginal'])
This code works fine, but it is a bit redundant, so I tried to rewrite it as follows:
import requests
import json
link_pageToken = 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&maxResults=100&pageToken={pageToken}&videoId={videoId}&key={key}'
key ='...'
videoId = 'ydDn_TFkzi4'
comments = []
data = requests.get(link_pageToken.format(videoId = videoId, key = key)).json()
while 'nextPageToken' in data:
    data = requests.get(link_pageToken.format(videoId = videoId, key = key, pageToken = data['nextPageToken']))
    data = data.json()
    for i in range(len(data['items'])):
        comments.append(data['items'][i]['snippet']['topLevelComment']['snippet']['textOriginal'])
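However, this code raises KeyError: 'pageToken'.
My guess is that I first need to find out whether there is a pageToken, get it, and then insert it into the URL.
How can I do that?
Thanks

I tried the second suggestion from furas's answer. Here is the code: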
import requests
import json
link_pageToken = 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&maxResults=100&pageToken={pageToken}&videoId={videoId}&key={key}'
key ='...'
videoId = 'ydDn_TFkzi4'
comments = []
data = requests.get(link_pageToken.format(videoId = videoId, key = key, pageToken="")).json()
while 'nextPageToken' in data:
    data = requests.get(link_pageToken.format(videoId = videoId, key = key, pageToken = data['nextPageToken']))
    data = data.json()
    for i in range(len(data['items'])):
        comments.append(data['items'][i]['snippet']['topLevelComment']['snippet']['textOriginal'])
For some reason, it collects fewer comments than the first version.
The first version collected 309 comments, but this one collects only 209. Why is that?
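
A likely explanation for the gap of exactly 100 (the maxResults value): in the rewritten loop the first response is only used for its nextPageToken, and its items are never appended. The sketch below reproduces this without any network access. The page data, make_pages and fetch are made-up stand-ins for the real API, not part of it.

```python
# Simulated API pages (made-up data): 100 + 100 + 100 + 9 = 309 items.
def make_pages(sizes):
    pages = {}
    for n, size in enumerate(sizes):
        token = "" if n == 0 else f"token{n}"
        page = {"items": [f"comment {n}-{i}" for i in range(size)]}
        if n + 1 < len(sizes):
            page["nextPageToken"] = f"token{n + 1}"
        pages[token] = page
    return pages

PAGES = make_pages([100, 100, 100, 9])

def fetch(page_token=""):
    """Hypothetical stand-in for requests.get(...).json(); returns one simulated page."""
    return PAGES[page_token]

def collect_skipping_first_page():
    # Same shape as the rewritten loop: the first response is used only for its token.
    comments = []
    data = fetch()
    while "nextPageToken" in data:
        data = fetch(data["nextPageToken"])
        comments.extend(data["items"])
    return comments

def collect_all_pages():
    # Read the items of every response, including the first, before moving on.
    comments = []
    data = fetch()
    while True:
        comments.extend(data["items"])
        if "nextPageToken" not in data:
            break
        data = fetch(data["nextPageToken"])
    return comments

print(len(collect_skipping_first_page()))  # 209
print(len(collect_all_pages()))            # 309
```

If that is indeed the cause, appending the items of the first response before entering the loop (as the original two-link version did) should bring the count back to 309.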
In the new version you use the same link_pageToken in both get() calls, but it expects pageToken, which you do not pass in the first format(). Try "{pageToken}".format() and you will get the same error.
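
To see the error in isolation, here is a small sketch with a shortened, made-up template (not the full API URL). str.format() raises KeyError for any named field that gets no value, and passing an empty string avoids it.

```python
# Shortened illustrative template; the real URL has more parameters.
template = "pageToken={pageToken}&videoId={videoId}"

try:
    template.format(videoId="abc")  # no value supplied for {pageToken}
except KeyError as err:
    missing = err.args[0]
    print("missing field:", missing)  # missing field: pageToken

# Supplying every named field, even with an empty string, works.
filled = template.format(videoId="abc", pageToken="")
print(filled)  # pageToken=&videoId=abc
```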
For the first get() you should use the old link (the one without {pageToken}):
r = requests.get(link.format(videoId=videoId, key=key))
data = r.json()
Or at least you should pass pageToken="" to format():
r = requests.get(link_pageToken.format(videoId=videoId, key=key, pageToken=""))
data = r.json()
EDIT:
If you want to use a single link, you can define it without pageToken:
link = 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&maxResults=100&videoId={videoId}&key={key}'
r = requests.get(link.format(videoId=videoId, key=key))
data = r.json()
and later append the token parameter:
link_pageToken = link + "&pageToken={pageToken}"
r = requests.get(link_pageToken.format(videoId=videoId, key=key, pageToken=pageToken))
data = r.json()
Or use a dictionary, which is more readable:
url = 'https://www.googleapis.com/youtube/v3/commentThreads'
payload = {
"part": "snippet",
"maxResults": 100,
"videoId": videoId,
"key": key,
}
r = requests.get(url, params=payload)
data = r.json()
and add the token later:
payload["pageToken"] = pageToken
r = requests.get(url, params=payload)
data = r.json()
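
Putting the dictionary approach into one loop might look like the sketch below. It is only a sketch under some assumptions: iter_comments and fetch_page are hypothetical helper names, and the fetch parameter exists only so the paging logic can be exercised without a real request. Each response's items are read before the next token is followed, so the first page is not skipped.

```python
URL = "https://www.googleapis.com/youtube/v3/commentThreads"

def fetch_page(params):
    """Fetch one page of results as a dict (performs a real HTTP request)."""
    import requests  # imported here so the paging logic can run without it
    response = requests.get(URL, params=params)
    response.raise_for_status()
    return response.json()

def iter_comments(video_id, key, fetch=fetch_page):
    """Yield the text of every top-level comment, following nextPageToken."""
    payload = {
        "part": "snippet",
        "maxResults": 100,
        "videoId": video_id,
        "key": key,
    }
    while True:
        data = fetch(payload)
        # Handle the current page first, including the very first one.
        for item in data.get("items", []):
            yield item["snippet"]["topLevelComment"]["snippet"]["textOriginal"]
        token = data.get("nextPageToken")
        if token is None:
            break  # no more pages
        payload["pageToken"] = token  # added only once a token exists
```

Usage would be something like comments = list(iter_comments(videoId, key)), with videoId and key defined as in the question.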