How to pull comments from individual YouTube videos instead of random videos
I'm building a project with the YouTube API in which I want to pull the comments from every video on a specific YouTuber's channel. However, when I pull the comments, it only pulls one comment from each video instead of the 100 comments per video that I want.
def get_video_comments(video_id):
    url_video_comments = 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId='+video_id+'&maxResults=100&key='+API_KEY
    response_video_comments = requests.get(url_video_comments).json()
    for comment in response_video_comments['items']:
        if comment['kind'] == 'youtube#commentThread':
            comment_text = comment['snippet']['topLevelComment']['snippet']['textOriginal']
            comment_likes = comment['snippet']['topLevelComment']['snippet']['likeCount']
    return comment_text, comment_likes
def get_videos(df, df_comments, pageToken):
    url = 'https://www.googleapis.com/youtube/v3/search?key='+API_KEY+'&channelId='+channel_id+'&pageToken='+pageToken+'&part=snippet,id&order=date&maxResults=10000'
    response = requests.get(url).json()
    # wait 1 second before executing for loop in order to get all of the data
    time.sleep(1)
    # create loop to get information on every video
    for video in response['items']:
        # makes sure that we only get information on videos
        if video['id']['kind'] == 'youtube#video':
            video_id = video['id']['videoId']
            video_title = video['snippet']['title']
            upload_date = video['snippet']['publishedAt']
            upload_date = str(upload_date).split('T')[0]
            view_count, like_count, comment_count = get_video_details(video_id)
            comment_text, comment_likes = get_video_comments(video_id)
            # save data in pandas DF
            df = df.append({'video_id': video_id, 'video_title': video_title, 'upload_date': upload_date,
                            'view_count': view_count, 'like_count': like_count, 'comment_count': comment_count},
                           ignore_index=True)
            df_comments = df_comments.append({'video_id': video_id, 'comment_text': comment_text,
                                              'comment_likes': comment_likes}, ignore_index=True)
    return df, df_comments
Below is the output.
video_id comment_text comment_likes
0 0faCad2kKeg Basically, this video describes how you are ab... 0
1 9dnN82DsQ2k thanks for using metric 0
2 Y413Czri6qw "Another way to fight climate change is to eat... 0
3 W3qZIPiWKc4 6:00 0
4 ggUduBmvQ_4 should be illegal to pull loans for speculation 0
5 xhxo2oXRiio Man, this is really getting to me :( 0
6 8d5d_HXGeMA Not sure how you can fail to bring up powerful... 0
7 8egszLpKMWU africa is not the next anything. 0
8 WNrobOYWZQE For space exploration to happen, war needs to ... 0
9 ZZ3F3zWiEmc Taxes themselves are Legalized theft. Moreover... 0
10 V16GdzRvhRU Saudi Arabia and the other Islamic oil kingdom... 0
11 o4tuhWvKduU This video makes getting out of Afganistan see... 0
12 1-uNMj57Y4c Meanwhile, I am not in the air travel market a... 0
13 SR7BA3xEmDo rather trains...... 0
14 iO5mfbpq16A Because the Southern hemisphere is mostly wate... 0
15 B3FKtBNEBRc The annoying playground cytomorphologically sm... 0
16 VJtFgte1GKc 10:36 the ruthless pursuit of a fish in sea 0
17 J5PLyYVIEpg "Indian nation" Indians are in India 1
18 LHhJuAOK3CI This is really great, but I hope you do more p... 0
19 aH4b3sAs-l8 What would happen if lightning hit it mid flight? 0
20 b1JlYZQG3lI You actually can't be more wrong. We shut down... 1
21 BNpk_OGEGlA We control the supply chain by how involved we... 0
22 4p0fRlCHYyg Hey did you see the “new” united supersonic pl... 0
23 N4dOCfWlgBw I hope the cruise industry dies 0
24 3CuPqeIJr3U You westerners your days are up..Just get lost 0
25 DlTq8DbRs4k Maggie's dead now :) 0
26 VjiH3mpxyrQ What you’ve missed about ‘Crisis’ stress testi... 0
27 pLcqJ2DclEg If I were to buy a Tesla Today, I would also h... 0
28 3gdCH1XUIlE I’m glad I live in a civilized country where t... 0
29 2qanMpnYsjk I just had this recommended to me and I can al... 0
30 7R7jNWHp0D0 Another unequal biased Biden loving Trump hat... 0
31 GIFV_Z7Y9_w Hello from Kazakhstan 0
32 KXRtNwUju5g This is sexist, why was it women that throws t... 0
33 fTyUE162lrw What's the point of these subscription based l... 0
34 ZAEydOjNWyQ So if Covid is stoping it, just get everybody ... 0
35 _BCY0SPOFpE Great observation 0
36 v_rXhuaI0W8 13:50 Such a lie. 1) The two have nothing to d... 0
37 Ongqf93rAcM This works but it’s still gambling so don’t ex... 0
38 byW1GExQB84 Can we have an update?\nAstra Zeneca seems rea... 1
39 DTIDCA7mjZs My dad be like: Gracias por las instrucciones ... 1
40 H_akzwzghWQ Why are you showing us graphs of "percent devi... 0
41 7C1fPocIFgU There is a (partial) solution in the middle. C... 0
42 3J06af5xHD0 Her speech bought me to tears. 0
43 YgiMqePRp0Y There is a problem with pooled testing. The th... 0
44 Rtmhv5qEBg0 Great work good game 0
45 6GMoUmvw8kU 9:48 yo, wtf is wrong with that man`s face? 0
46 QlPrAKtegFQ Follow along as a west bound autonomous 80,000... 0
47 uAG4zCsiA_w development of tiltrotor aircraft has almost s... 1
48 NtX-Ibi21tU Wait there is a risk of 1:55 000 by walking ou... 0
49 r2oPk20OHBE 2:41 The street where I live was the last thin... 0
The video_id values match the videos I want to pull comments from. I'm just struggling to understand how I can get more than one comment per video. Any help would be appreciated.
Your get_video_comments function looks well designed with its for loop for comment in response_video_comments['items']:, but on every iteration you overwrite comment_text and comment_likes, losing the previous comment. Your get_video_comments function should instead return arrays of comment_text and comment_likes values, appending the current comment's values on each iteration.
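Here is a minimal sketch of that fix, assuming the same API_KEY, requests import, and response shape as your original function:

def get_video_comments(video_id):
    url_video_comments = 'https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId=' + video_id + '&maxResults=100&key=' + API_KEY
    response_video_comments = requests.get(url_video_comments).json()
    # collect every top-level comment instead of overwriting two scalars
    comment_texts = []
    comment_likes = []
    for comment in response_video_comments['items']:
        if comment['kind'] == 'youtube#commentThread':
            snippet = comment['snippet']['topLevelComment']['snippet']
            comment_texts.append(snippet['textOriginal'])
            comment_likes.append(snippet['likeCount'])
    return comment_texts, comment_likes

In get_videos you would then add one df_comments row per element of those lists (for example by zipping comment_texts and comment_likes) rather than one row per video.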
The algorithm for scraping all of the comments from a video is a common one; if you are still stuck, dig a little deeper in your search.
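For instance, if you eventually want more than the first 100 comments per video, the commentThreads endpoint pages its results: each response may carry a nextPageToken, which you pass back as pageToken on the next request. A sketch of that loop, under the same assumptions as above (get_all_video_comments is a hypothetical helper name):

def get_all_video_comments(video_id):
    # keep requesting pages until the API stops returning a nextPageToken
    comment_texts = []
    comment_likes = []
    page_token = None
    while True:
        url = ('https://www.googleapis.com/youtube/v3/commentThreads'
               '?part=snippet&videoId=' + video_id +
               '&maxResults=100&key=' + API_KEY)
        if page_token:
            url += '&pageToken=' + page_token
        response = requests.get(url).json()
        for comment in response.get('items', []):
            if comment['kind'] == 'youtube#commentThread':
                snippet = comment['snippet']['topLevelComment']['snippet']
                comment_texts.append(snippet['textOriginal'])
                comment_likes.append(snippet['likeCount'])
        page_token = response.get('nextPageToken')
        if not page_token:
            break
    return comment_texts, comment_likes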