Telegram bot HTML parse mode returns the raw string instead of parsing it

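I want to send a message in which the Twitter user's name is the link text and the link points to the tweet. I have been trying to use the HTML parse mode, but instead of treating my string as HTML, the bot simply returns the whole string as-is. The code I wrote is below.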

import os
# from dotenv import load_dotenv
# load_dotenv()
import requests
import json
import tweepy
from nltk.tokenize import WordPunctTokenizer
import re
from bs4 import BeautifulSoup
import config
from textblob import TextBlob, Word, Blobber
from telegram import ParseMode


token= "5236830904:AAHjxyq08dXuJzKofXmPkJ30X5V1lNOUhIc"
consumer_key= config.api_key
consumer_secret= config.api_secret_key
access_token= config.access_token
access_token_secret= config.access_token_secret

botsUrl= "https://api.telegram.org/bot{}".format(token)
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

def clean_tweets(twt):
    # twt = re.sub('#ethereum', 'ethereum', twt)
    # twt = re.sub('#Ethereum', 'Ethereum', twt)
    token = WordPunctTokenizer()  
    twt = re.sub('#[A-Za-z0-9]+ ','', twt) #removes any string with a '#' character
    twt = re.sub('\n', '', twt)
    twt = re.sub('&;','and',twt)
    twt = re.sub('@[A-Za-z0-9]+ ','', twt)
    twt = re.sub('https?:\/\/\S+','',twt) #Removes any hyperlinks
    regex_pattern = re.compile(pattern = "["
        u"\U0001F600-\U0001F64F"  # emoticons
        u"\U0001F300-\U0001F5FF"  # symbols & pictographs
        u"\U0001F680-\U0001F6FF"  # transport & map symbols
        u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
                           "]+", flags = re.UNICODE)
    twt = re.sub(regex_pattern,'',twt)
    pattern = re.compile(r'(https?://)?(www\.)?(\w+\.)?(\w+)(\.\w+)(/.+)?')
    twt = re.sub(pattern,'',twt)
    re_list = ['@[A-Za-z0-9_]+', '#']
    combined_re = re.compile( '|'.join( re_list) )
    twt = re.sub(combined_re,'',twt)
    del_amp = BeautifulSoup(twt, 'lxml')
    del_amp_text = del_amp.get_text()
    del_link_mentions = re.sub(combined_re, '', del_amp_text)
    del_emoticons = re.sub(regex_pattern, '', del_link_mentions)
    lower_case = del_emoticons.lower()
    words = token.tokenize(lower_case)
    result_words = [x for x in words if len(x) > 2]
    return (" ".join(result_words)).strip()

def subjectivity(twt):
    return TextBlob(twt).sentiment.subjectivity

#Function to get the polarity

def getPolarity(twt):
    return TextBlob(twt).sentiment.polarity

def getSentiment(score):
    if score<0:
        return 'Negative'
    elif score == 0:
        return 'Neutral'
    else:
        return 'Positive'

def giveUpdate(offset=None):
    url = botsUrl+ "/getupdates?timeout=100"
    if offset:
        url = botsUrl+ "/getupdates?offset={}&timeout=100".format(offset+1)
    resp= requests.get(url)
    return json.loads(resp.content)

def sendMessage(msg, chat_id):
    url= botsUrl+ "/sendMessage?chat_id={}&text={}".format(chat_id,msg,parse_mode = ParseMode.HTML)
    resp= requests.get(url)
    return "sent message"


def getReply(msg):
    tweets= tweepy.Cursor(api.search, q= "#{} -filter:retweets".format(msg)).items(5)
    all_tweet= []
    for tw in tweets:
        screen_name = tw.user.screen_name
        text = tw.text
        id = str(tw.id)
        hyperlink =  "<a href='https://twitter.com/twitter/statuses/"+id+"'>"+screen_name+"</a>"
        sentiment = getSentiment(getPolarity(text))
        finalTweet = hyperlink+' - '+text+' -- '+sentiment
        all_tweet.append(finalTweet)
    return all_tweet



id_=None
while True:
    update= giveUpdate(offset=id_)
    update= update['result']

    if update:
        for item in update:
            id_= item['update_id']
            msg= item['message']['text']
            chat_id= item['message']['from']['id']
            if msg:
                reply= getReply(msg)
                for tw in reply:
                    print(sendMessage(tw, chat_id))

The reply I get from the bot is the unparsed string, with the HTML tags shown literally.

Solution

For anyone who runs into the same situation, the following should help:

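Whenever we call a method to operate a Telegram bot, the bot sends a URL request to the server, and the result then shows up in the chat. Here, the HTML parse mode was passed as an argument to the string formatting call, so it never became a parameter of the request. As explained above, the parse_mode parameter can instead be placed in the URL string by hand, and the bot will receive it. The change that worked is: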

hyperlink =  "<a href='https://twitter.com/twitter/statuses/"+id+"'>"+screen_name+"</a>"+"&parse_mode=HTML"
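Appending `&parse_mode=HTML` to the hyperlink works only while that string happens to be the last thing in the URL. A less fragile variant, sketched below, builds the query string with `urllib.parse.urlencode` so that each parameter is passed explicitly and the message text is percent-encoded. The helper name `build_send_message_url`, the placeholder token, and the sample values are assumptions for illustration and are not part of the original code.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_send_message_url(bots_url, chat_id, text, parse_mode="HTML"):
    """Return a sendMessage URL with every parameter percent-encoded."""
    query = urlencode({"chat_id": chat_id, "text": text, "parse_mode": parse_mode})
    return "{}/sendMessage?{}".format(bots_url, query)


# Hypothetical values, for demonstration only.
bots_url = "https://api.telegram.org/bot<TOKEN>"
link = "<a href='https://twitter.com/twitter/statuses/1'>someuser</a>"
url = build_send_message_url(bots_url, 12345, link + " - hello & bye")

# Decode the query string again to confirm what the server would receive.
params = parse_qs(urlsplit(url).query)
print(params["parse_mode"])  # ['HTML']
```

Because the text is encoded, characters such as `&` or `#` inside a tweet no longer cut the message short or get mistaken for extra parameters.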

First things first: revoke the bot token that you posted as part of your code snippet.

Regarding your problem:

url= botsUrl+ "/sendMessage?chat_id={}&text={}".format(chat_id,msg,parse_mode = ParseMode.HTML)

The string here has only two slots that .format can fill, so the parse_mode keyword argument is silently ignored.
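This behaviour is easy to reproduce in isolation: `str.format` raises no error for unused keyword arguments, it simply drops them. The sketch below uses the literal string `"HTML"` in place of `ParseMode.HTML` and made-up chat values; adding a third slot is one way, among others, to get the parameter into the URL.

```python
template = "/sendMessage?chat_id={}&text={}"

# There is no {parse_mode} field in the template, so the keyword is ignored.
broken = template.format(42, "hi", parse_mode="HTML")
print(broken)  # /sendMessage?chat_id=42&text=hi

# With a third slot, the value actually ends up in the URL.
fixed_template = "/sendMessage?chat_id={}&text={}&parse_mode={}"
fixed = fixed_template.format(42, "hi", "HTML")
print(fixed)  # /sendMessage?chat_id=42&text=hi&parse_mode=HTML
```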

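Moreover, I notice that you import the python-telegram-bot package without really using it. You only use telegram.ParseMode.HTML, which is really just a shortcut for typing 'HTML'. The real value of the package is that you do not have to implement methods like sendMessage and giveUpdate yourself. The telegram.ext package can also fetch updates for you continuously. See here and here. Whether you use this functionality or make the requests to Telegram manually is up to you; in the latter case I would suggest not using python-telegram-bot at all.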


Disclaimer: I am currently the maintainer of python-telegram-bot.