Tweepy's Streaming API does not recognize tweets from different devices

Question

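This is a very odd and specific problem. I am developing a Twitter bot that users can tweet quotes at; the bot takes each tweet and produces an inspirational picture to accompany the quote. For example, suppose I tweet: @fake_quotes_bot "I'm gonna starve myself until they listen" - Ghandi. The bot takes the quote and the person named after the hyphen and generates an image.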

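Generalities aside, I have just written a quote filter to make sure the bot can extract quotes reliably. For example, this would be unusable: @fake_quotes_bot "Hello'this is" ' "a " quote" - person. If a user misquotes their tweet (as shown), the filter makes the bot reply automatically with instructions on how to structure the tweet properly. When I ran the bot in PyCharm on my desktop and tweeted at it from a different account, everything worked: malformed tweets received the right error message, and correctly structured tweets were approved. The problem appears when I tweet from a device other than the desktop computer the bot runs on. The logic that works fine for tweets sent from the desktop falls flat for tweets sent from an iPhone: no matter what tweet I send from there, the bot replies with the same error message.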

Here is my code:

import tweepy
import json

consumer_key, consumer_secret = "###", "###"  # credentials redacted
access_token, access_token_secret = "###", "###"  # credentials redacted

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)


def Data_Analysis(tweet, tweet_data):
    def Data_Write():
        print("DATA FOR THIS TWEET:", tweet_data, "\n")

    def Quote_Filter():

        print("INCOMING TWEET: " + "  > " + str(tweet) + " <   " + " FROM: @" +
              str(tweet_data.get('user', '').get('screen_name', '')) + "/" + tweet_data.get('user', '').get('name', ''))

        def Profanity_Filter():
            pass

        def Person_Filter():
            #WIP for now
            print("Filtering people...", end=" ")
            print("SUCCESSFUL")
            print("APPROVED TWEET: " + tweet)
            print("APPROVED TWEET DATA:", tweet_data, "\n")

        def Quotation_Marks_Filter():

            print("Filtering quotation marks...", end=" ")

            # Filters out tweets that contain quotes
            if '"' in tweet or "'" in tweet:
                double_quote_count = tweet.count('"')
                single_quote_count = tweet.count("'")

                # Double Quotes quote
                if double_quote_count > 0 and single_quote_count == 0:
                    if double_quote_count > 2:
                        api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                          " ERROR: Please refrain from using too many quotation marks.",
                                          tweet_data.get('id'))

                        print("ERROR: Please refrain from using too many quotation marks. \n")
                    elif double_quote_count == 1:
                        api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                          " ERROR: Only a singular quote was entered.",
                                          tweet_data.get('id'))

                        print("ERROR: Only a singular quote was entered. \n")
                    # Pass through to other filter
                    else:
                        print("SUCCESSFUL")
                        Person_Filter()

                # Single quotes quote
                elif double_quote_count == 0 and single_quote_count > 0:
                    if single_quote_count > 2:
                        api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                          " ERROR: Please refrain from using too many quotation marks.",
                                          tweet_data.get('id'))

                        print("ERROR: Please refrain from using too many quotation marks. \n")
                    elif single_quote_count == 1:
                        api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                          " ERROR: Only a singular quote was entered.",
                                          tweet_data.get('id'))

                        print("ERROR: Only a singular quote was entered. \n")
                    # Pass through to other filter
                    else:
                        print("SUCCESSFUL")
                        Person_Filter()

                # If a quote has two types of quotes
                else:
                    # Filter if there are too many quotes per character
                    if double_quote_count > 2:
                        api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                          " ERROR: If you are implementing a quote within a quote or are abbreviating,"
                                          "please refrain from using more than two instances of a double quote."
                                          , tweet_data.get('id'))

                        print("ERROR: If you are implementing a quote within a quote or are abbreviating,"
                              "please refrain from using more than two instances of a double quote. \n")
                    elif double_quote_count == 1:
                        api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                          " ERROR: Could not identify the quote. If you are implementing a quote "
                                          "within a quote or are abbreviating,  please use two instances of the "
                                          "double quote.",
                                          tweet_data.get('id'))

                        print("ERROR: Could not identify the quote. If you are implementing a quote "
                              "within a quote or are abbreviating,  please use two instances of the "
                              "double quote. \n")

                    # If it's correct in its number, then figure out its beginning and ending quotes to pull text
                    else:
                        quote_indexes = []
                        quote_chars = []

                        indices = [index for index, value in enumerate(tweet) if value == '"']
                        for i in indices:
                            quote_indexes.append(i)
                            quote_chars.append('"')

                        indices = [index for index, value in enumerate(tweet) if value == "'"]
                        for i in indices:
                            quote_indexes.append(i)
                            quote_chars.append("'")

                        beginning_quote = quote_indexes.index(min(quote_indexes))
                        ending_quote = quote_indexes.index(max(quote_indexes))

                        # If the starting and ending quotes are similar (I.E. " and ") then pass through to other filter
                        if quote_chars[beginning_quote] == quote_chars[ending_quote]:
                            print("SUCCESSFUL")
                            Person_Filter()

                        # Do not align
                        else:
                            api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                              " ERROR: The beginning and endings quotes do not align.",
                                              tweet_data.get('id'))

                            print("ERROR: The beginning and endings quotes do not align. \n")

            # No quote found
            else:
                grab_user = tweet_data.get('user', '').get('screen_name', '')

                if grab_user == "fake_quotes_bot":
                    # If I were to test this on my own twitter handle, it would get stuck in an auto-reply loop.
                    # Which will probably ban me.
                    print("PASSING UNDER MY OWN SCREEN NAME... \n")

                if grab_user != "fake_quotes_bot":
                    api.update_status("@" + str(tweet_data.get('user', '').get('screen_name', '')) +
                                      " ERROR: This tweet does not contain a quote. Be sure to use quotation marks.",
                                      tweet_data.get('id'))

                    print("ERROR: This tweet does not contain a quote. Be sure to use quotation marks. \n")

        def Retweet_Filter():

            print("Filtering retweets...", end=" ")

            # Filters out tweets that are retweets
            if "RT" in tweet[0:3]:
                print("RETWEET. SKIPPING... \n")
            else:
                print("SUCCESSFUL")
                Quotation_Marks_Filter()

        Retweet_Filter()

    Quote_Filter()


class StreamListener(tweepy.StreamListener):

    def on_data(self, data):
        tweet_data = json.loads(data)
        if "extended_tweet" in tweet_data:
            tweet = tweet_data['extended_tweet']['full_text']
            Data_Analysis(tweet, tweet_data)

        else:
            try:
                tweet = tweet_data['text']
                Data_Analysis(tweet, tweet_data)
            except KeyError:
                print("ERROR: Failed to retrieve tweet. \n")


print("BOT IS NOW RUNNING. SEARCHING FOR TWEETS...\n")

Listener = StreamListener()
Stream = tweepy.Stream(auth=api.auth, listener=Listener, tweet_mode='extended')
Stream.filter(track=['@fake_quotes_bot'])

Output for a tweet sent from the same desktop:

INCOMING TWEET:   > @fake_quotes_bot "hello, Whosebug!" <    FROM: @bulletinaction/BulletInAction
Filtering retweets... SUCCESSFUL
Filtering quotation marks... SUCCESSFUL
Filtering people... SUCCESSFUL
APPROVED TWEET: @fake_quotes_bot "hello, Whosebug!"
APPROVED TWEET DATA: {###data###}

Output when I send the tweet from my phone:

INCOMING TWEET:   > @fake_quotes_bot “heyyo, Whosebug” <    FROM: @bulletinaction/BulletInAction
Filtering retweets... SUCCESSFUL
Filtering quotation marks... ERROR: This tweet does not contain a quote. Be sure to use quotation marks.

Here is a YouTube video of the code running, since I realize this is a very odd problem: https://www.youtube.com/watch?v=skErnva4ePc&feature=youtu.be

Answer 1

Your phone is sending left and right double quotation marks (curly quotes) instead of the plain quotation mark:

"   U+0022 QUOTATION MARK
“   U+201C LEFT DOUBLE QUOTATION MARK
”   U+201D RIGHT DOUBLE QUOTATION MARK

(From: Are there different types of double quotes in utf-8 (PHP, str_replace)?)
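To confirm which characters a given tweet actually contains, it can help to print their code points. The following is a minimal sketch using only the standard library; the helper name `describe_quotes` and the sample strings are illustrative assumptions, not part of the original bot.

```python
import unicodedata


def describe_quotes(text):
    """List (code point, Unicode name) for ASCII quotes and any non-ASCII characters."""
    found = []
    for ch in text:
        # Report plain quotes plus anything outside ASCII (e.g. curly quotes).
        if ch in "\"'" or ord(ch) > 127:
            found.append((f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return found


# Hypothetical sample tweets: one typed on a desktop, one on a phone keyboard.
desktop_tweet = '@fake_quotes_bot "hello"'
phone_tweet = "@fake_quotes_bot \u201cheyyo\u201d"

print(describe_quotes(desktop_tweet))
print(describe_quotes(phone_tweet))
```

The desktop sample reports only U+0022, while the phone sample reports U+201C and U+201D, which is why a check such as `'"' in tweet` fails for the phone tweet.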

So simply replace those characters in the tweet text before testing it:

tweet = tweet_data['extended_tweet']['full_text']  # as you did, then:
tweet = tweet.replace("\u2018", "'").replace("\u2019", "'")  # curly single quotes -> '
tweet = tweet.replace("\u201C", '"').replace("\u201D", '"')  # curly double quotes -> "

(From: Converting MS word quotes and apostrophes)
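The same replacement can also be done in a single pass with `str.translate`, applied at the point where the listener picks the tweet text. This is only a sketch: `QUOTE_MAP`, `normalize_quotes`, and `extract_text` are hypothetical names introduced here, and the dictionary layout mirrors the `extended_tweet` / `text` fields used in the question's `on_data`.

```python
# Map typographic quotes to their ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201C": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201D": '"',   # RIGHT DOUBLE QUOTATION MARK
})


def normalize_quotes(text):
    """Return text with curly quotes replaced by plain ASCII quotes."""
    return text.translate(QUOTE_MAP)


def extract_text(tweet_data):
    """Hypothetical helper: choose the tweet text as on_data does, then normalize it."""
    if "extended_tweet" in tweet_data:
        text = tweet_data["extended_tweet"]["full_text"]
    else:
        text = tweet_data["text"]  # raises KeyError if absent, as in the question
    return normalize_quotes(text)


print(normalize_quotes("@fake_quotes_bot \u201cheyyo\u201d"))
```

With this in place, the existing quotation-mark filter would only ever see `"` and `'`, regardless of which keyboard produced the tweet.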

