Unable to retrieve tweets from Tweepy (no error; output shows column headers but no rows)
apikey = '2238c8h8E25gSVU1WW28ti7fS7'
apisecretkey = 'ssLG9s4rt4QwLo6PFyMSpLVRT1IoQ3f1EwrrgzTg6TRJLUTeI5e'
accesstoken = '33347844103627698178-3HuOoCCFuMWHwLTmhswKUtJSvG22et'
accesstokensecret = '2s8tAcatrjTHgh81Oo7dw6rvWGGRFZoSrPDa5eInY22Q3c'
import tweepy as tw
auth = tw.OAuthHandler(apikey, apisecretkey)  # OAuthHandler is required for authentication with Twitter
auth.set_access_token(accesstoken,accesstokensecret)
api = tw.API(auth,wait_on_rate_limit=True)
search_word = '#IndvsAus' or '#AusvsInd'
date_since = '2021-01-10'
date_until = '2021-01-11'
tweets = tw.Cursor(api.search,q = search_word+' -filter:retweets',\
lang ='en',tweet_mode='extended',since='date_since',until='date_until').items(100)
tweet_details = [[tweet.id,tweet.source,tweet.full_text,tweet.user.location,tweet.user.created_at,tweet.user.verified,tweet.created_at]for tweet in tweets]
import pandas as pd
tweet100_df = pd.DataFrame(data = tweet_details,columns=['tweet_id','source','Full_text','User_location','User_created_at','User_verified','tweet_timestamp',])
pd.set_option('max_colwidth',800)
tweet100_df.head(20)
Output: tweet_id source Full_text User_location User_created_at User_verified tweet_timestamp
The output shows no tweets, only the column headers. Where am I going wrong?
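To dump the output of Tweepy's Cursor into a Pandas DataFrame, pass pd.DataFrame a list of dictionaries, together with the fields you are interested in as column names.
Each item yielded by the Cursor's items() method exposes its data as a dictionary through its _json attribute.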
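As a small illustration of the list-of-dictionaries point, here is a sketch that runs without any Twitter access. The records in sample_dicts are invented stand-ins, not real tweets, and the variable names are only for illustration. It shows how pd.DataFrame treats column names that do or do not match the dictionary keys, and what an empty input looks like.

```python
import pandas as pd

# Invented sample records standing in for tweet dictionaries (not real data).
sample_dicts = [
    {"id": 1, "source": "web", "full_text": "first example"},
    {"id": 2, "source": "app", "full_text": "second example"},
]

# Column names that match dictionary keys are filled from the data.
matching_df = pd.DataFrame(data=sample_dicts, columns=["id", "full_text"])

# A column name that is not a key ("tweet_id") yields a column of NaN.
mismatched_df = pd.DataFrame(data=sample_dicts, columns=["tweet_id", "full_text"])

# An empty list still produces the column headers, just with zero rows.
empty_df = pd.DataFrame(data=[], columns=["id", "full_text"])

print(matching_df)
print(mismatched_df)
print(empty_df)
```

The last case matches the symptom in the question: headers with no rows usually mean the search itself returned nothing, so the query parameters are worth checking first.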
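In your case (note that since and until must be given the variables themselves, not their names in quotes, and that the loop must append each tweet's _json, not the cursor's):
tweets = tw.Cursor(api.search, q=search_word + ' -filter:retweets',
                   lang='en', tweet_mode='extended',
                   since=date_since, until=date_until).items(100)
list_of_dicts = []
for each_json_tweet in tweets:
    # _json holds the raw API payload of a single tweet as a dictionary
    list_of_dicts.append(each_json_tweet._json)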
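Then you can do:
# The raw JSON uses Twitter's own key names and nests the user fields, so flatten it and map the keys to the desired column names
columns = {'id': 'tweet_id', 'source': 'source', 'full_text': 'Full_text', 'user.location': 'User_location', 'user.created_at': 'User_created_at', 'user.verified': 'User_verified', 'created_at': 'tweet_timestamp'}
tweet100_df = pd.json_normalize(list_of_dicts).reindex(columns=list(columns)).rename(columns=columns)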
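Two details in the question's query setup are easy to miss, and both can be checked in plain Python without calling the API. The sketch below is illustrative only: combined_query, quoted and unquoted are names introduced here, not part of the original code.

```python
# `or` between two non-empty strings returns the first one,
# so the second hashtag never reaches the query.
search_word = '#IndvsAus' or '#AusvsInd'
print(search_word)

# Twitter's search syntax expects the OR operator inside the query string.
combined_query = '#IndvsAus OR #AusvsInd' + ' -filter:retweets'
print(combined_query)

date_since = '2021-01-10'

# Quoting the variable name passes the literal text 'date_since',
# not the date the variable holds.
quoted = 'date_since'
unquoted = date_since
print(quoted, unquoted)
```

Passing since='date_since' therefore sends a string that is not a date at all, which is one plausible reason the search comes back empty.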