Regex errors when using Tweepy in Python
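I'm having a problem with the code shown below. My original code worked when I was just pushing the tweet information through. Once I edited it to extract the URLs from the text, it started giving me problems. Nothing is printed, and I get these errors: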
Traceback (most recent call last):
  File "C:\Users\Evan\PycharmProjects\DiscordBot1\main.py", line 22, in <module>
    get_tweets(api, "cnn")
  File "C:\Users\Evan\PycharmProjects\DiscordBot1\main.py", line 18, in get_tweets
    url2 = re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+',
    text)
  File "C:\Users\Evan\AppData\Local\Programs\Python\Python39\lib\re.py", line 241, in findall
    return _compile(pattern, flags).findall(string)
TypeError: cannot use a string pattern on a bytes-like object
I didn't get any errors until I ran it, so I'm very confused about why this doesn't work. It's probably something simple, since I'm new to Tweepy and regex.
import tweepy
import re

TWITTER_APP_SECRET = 'hidden'
TWITTER_APP_KEY = 'hidden'

auth = tweepy.OAuthHandler(TWITTER_APP_KEY, TWITTER_APP_SECRET)
api = tweepy.API(auth)

def get_tweets(api, username):
    page = 1
    while True:
        tweets = api.user_timeline(username, page=page)
        for tweet in tweets:
            text = tweet.text.encode("utf-8")
            url2 = re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+',
                              text)
            print(url2)

get_tweets(api, "cnn")
The error again:
Traceback (most recent call last):
  File "C:\Users\Evan\PycharmProjects\DiscordBot1\main.py", line 22, in <module>
    get_tweets(api, "cnn")
  File "C:\Users\Evan\PycharmProjects\DiscordBot1\main.py", line 18, in get_tweets
    url2 = re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+',
    text)
  File "C:\Users\Evan\AppData\Local\Programs\Python\Python39\lib\re.py", line 241, in findall
    return _compile(pattern, flags).findall(string)
TypeError: cannot use a string pattern on a bytes-like object
Process finished with exit code 1
If you need more information to help me, please let me know. Thanks in advance for any help.
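You are getting that error because you are using a string pattern (your regex) on a string that has already been converted to a bytes object via encode().
Try running your pattern directly against tweet.text without encoding it.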
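The difference can be reproduced without Twitter credentials. The following is a minimal sketch, not the asker's full program: `extract_urls` is a hypothetical helper, and `sample_text` is a made-up stand-in for `tweet.text`. The pattern is the one from the question, written as a raw string.

```python
import re

# URL pattern from the question, as a raw string so the backslashes
# reach the regex engine unchanged.
URL_PATTERN = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'


def extract_urls(text):
    """Return every URL match in text (hypothetical helper, not part of Tweepy)."""
    return re.findall(URL_PATTERN, text)


# Made-up stand-in for tweet.text; no API call is involved.
sample_text = "Breaking news https://t.co/abc123 and more at http://example.com/story"

# A str pattern applied to a str works.
print(extract_urls(sample_text))

# The same str pattern applied to bytes (what .encode("utf-8") returns) fails.
try:
    extract_urls(sample_text.encode("utf-8"))
except TypeError as exc:
    print("TypeError:", exc)
```

In the original loop, this would correspond to passing `tweet.text` itself to `re.findall` and dropping the `.encode("utf-8")` call.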