Extract date from tweets (Tweepy, Python)
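I'm new to Python, so I'm struggling a bit with this. Basically, the code below gets the text of tweets with the hashtag #bitcoin, and I want to extract the date and the author along with the text. I have tried different things but I'm stuck right now.
Any help with this is much appreciated.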
import pandas as pd
import numpy as np
import tweepy
api_key = '*'
api_secret_key = '*'
access_token = '*'
access_token_secret = '*'
# Authenticate using the key variables defined above
authentication = tweepy.OAuthHandler(api_key, api_secret_key)
authentication.set_access_token(access_token, access_token_secret)
api = tweepy.API(authentication, wait_on_rate_limit=True)
#Get tweets about Bitcoin and filter out any retweets
search_term = '#bitcoin -filter:retweets'
tweets = tweepy.Cursor(api.search_tweets, q=search_term, lang='en', since='2018-11-01', tweet_mode='extended').items(50)
all_tweets = [tweet.full_text for tweet in tweets]
df = pd.DataFrame(all_tweets, columns=['Tweets'])
df.head()
If you use dir(tweet), you will see all the variables and functions in the tweet object:
author
contributors
coordinates
created_at
destroy
display_text_range
entities
extended_entities
favorite
favorite_count
favorited
full_text
geo
id
id_str
in_reply_to_screen_name
in_reply_to_status_id
in_reply_to_status_id_str
in_reply_to_user_id
in_reply_to_user_id_str
is_quote_status
lang
metadata
parse
parse_list
place
possibly_sensitive
retweet
retweet_count
retweeted
retweets
source
source_url
truncated
user
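A listing like the one above can be produced with a small helper. This is only a sketch: `public_attributes` is a hypothetical name, and `fake_tweet` is a stand-in object used so the snippet runs without Twitter credentials.

```python
from types import SimpleNamespace


def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


# Stand-in for a tweet object (hypothetical data, for illustration only).
fake_tweet = SimpleNamespace(full_text="hello", created_at="2022-03-26", user=None)

# Print one attribute name per line, as in the listing above.
print("\n".join(public_attributes(fake_tweet)))
```

With a real result you would call `public_attributes(tweet)` inside the loop over `tweets`.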
So there is created_at, which you can collect together with full_text:
all_tweets = []
for tweet in tweets:
    # print('\n'.join(dir(tweet)))  # uncomment to list every attribute
    all_tweets.append([tweet.full_text, tweet.created_at])

df = pd.DataFrame(all_tweets, columns=['Tweets', 'Created At'])
df.head()
Result:
Tweets Created At
0 @Ralvero Of course $KAWA ready for 100x #ETH ... 2022-03-26 13:51:06+00:00
1 Pairs:1INCHUSDT \n SELL:1.58500\n Time :3/26/2... 2022-03-26 13:51:06+00:00
2 @hotcrosscom @iSafePal First LIVE Dapp: Cylu... 2022-03-26 13:51:04+00:00
3 @Justdoitalex @Isabel_Schnabel Finally a truth... 2022-03-26 13:51:03+00:00
4 #Bitcoin has rejected for the fourth time the ... 2022-03-26 13:50:55+00:00
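The question also asks for the author. The attribute list shows both `author` and `user`; the sketch below assumes that `tweet.user.screen_name` holds the account name, which is how Tweepy's user objects are normally laid out. `tweets_to_rows` is a hypothetical helper, and the sample tweet is made-up stand-in data so the snippet runs without API access.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def tweets_to_rows(tweets):
    """Build one dict per tweet with its text, creation time and author."""
    rows = []
    for tweet in tweets:
        rows.append({
            "Tweets": tweet.full_text,
            "Created At": tweet.created_at,
            # Assumption: the user object exposes the handle as screen_name.
            "Author": tweet.user.screen_name,
        })
    return rows


# Stand-in for one search result (hypothetical data, for illustration only).
sample = [
    SimpleNamespace(
        full_text="#Bitcoin example text",
        created_at=datetime(2022, 3, 26, 13, 51, 6, tzinfo=timezone.utc),
        user=SimpleNamespace(screen_name="example_user"),
    )
]

rows = tweets_to_rows(sample)
print(rows[0]["Author"], rows[0]["Created At"])
```

With real results, `pd.DataFrame(tweets_to_rows(tweets))` should give a DataFrame with the three columns Tweets, Created At and Author.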
However, your code has a problem with since, because it appears to have been removed in version 3.8.
See: Collect tweets in a specific time period in Tweepy, until and since doesn't work
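If the since argument is not accepted, one possible workaround is to filter on created_at after the tweets have been fetched. This is a minimal sketch, not the approach from the linked question: `created_on_or_after` is a hypothetical helper, the sample tweets are stand-in data, and treating a timestamp without timezone information as UTC is an assumption.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def created_on_or_after(tweets, since):
    """Yield tweets whose created_at is not earlier than `since`.

    `since` must be a timezone-aware datetime.
    """
    for tweet in tweets:
        created = tweet.created_at
        # Assumption: timestamps without tzinfo are in UTC.
        if created.tzinfo is None:
            created = created.replace(tzinfo=timezone.utc)
        if created >= since:
            yield tweet


# Stand-ins for search results (hypothetical data, for illustration only).
sample = [
    SimpleNamespace(full_text="old", created_at=datetime(2018, 10, 31, 23, 59)),
    SimpleNamespace(
        full_text="new",
        created_at=datetime(2022, 3, 26, 13, 51, 6, tzinfo=timezone.utc),
    ),
]

cutoff = datetime(2018, 11, 1, tzinfo=timezone.utc)
recent = list(created_on_or_after(sample, cutoff))
print([tweet.full_text for tweet in recent])
```

This only narrows down what the search already returned; it cannot bring back tweets that the API did not deliver in the first place.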