Primitive Python IRC Chat bot for twitch

Question

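I'm currently developing an IRC bot for Twitch.tv and I was wondering how I could implement a banned words list? Here is what I have so far, but I'm stumped because of my limited knowledge of Python. Everything works so far except checking whether a message contains a banned word. This is the code in question: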

if bannedWords.split in message:
    sendMessage(s, "/ban " + user)
    break

I want to check the message against a list to see whether it contains anything from that list:

bannedWords = ["badword1", "badword1"]

But I'm not sure how to do it. Here is the full script:

import string
from Read import getUser, getMessage
from Socket import openSocket, sendMessage
from Initialize import joinRoom

s = openSocket()
joinRoom(s)
readbuffer = ""
bannedWords = ["badword1", "badword1"]
while True:
        readbuffer = readbuffer + s.recv(1024)
        temp = string.split(readbuffer, "\n")
        readbuffer = temp.pop()

        for line in temp:
            print(line)
            if "PING" in line:
                s.send(line.replace("PING", "PONG"))
                break
            user = getUser(line)
            message = getMessage(line)
            print user + " typed :" + message
            if bannedWords.split in message:
                sendMessage(s, "/ban " + user)
                break

Thanks in advance!!

Answer 1

Assuming both message and bannedWords are strings:

if any(map(message.__contains__, bannedWords.split())):
    ...

If, on the other hand, bannedWords is already a list, as in your code example, skip the split (in fact, the list type has no split method):

if any(map(message.__contains__, bannedWords)):
    ...

This checks whether any banned word occurs in any part of the string, so "The grass is greener on the other side." will match a banned word such as "ass".
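A minimal sketch of that substring behaviour, written for Python 3. The helper name contains_banned_substring and the sample word list are hypothetical and not part of the asker's bot:

```python
banned_words = ["ass", "badword1"]  # sample list, for illustration only


def contains_banned_substring(message, words):
    """Return True if any banned word occurs anywhere in message,
    including inside a longer word."""
    return any(word in message for word in words)


# "ass" is found inside "grass", so an innocent sentence is flagged.
print(contains_banned_substring("The grass is greener on the other side.", banned_words))
# No banned word occurs anywhere in this message.
print(contains_banned_substring("Nice stream today", banned_words))
```

The first call returns True and the second returns False, which is why substring matching may be too aggressive for a chat filter.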

Note that map behaves differently between the two major Python versions:

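In Python 2, map creates a list, which negates the advantage of any's short-circuiting behaviour. Use a generator expression instead: any(word in message for word in bannedWords).
In Python 3, map creates an iterator that lazily applies the function to the given iterable.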
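The short-circuiting can be made visible with a small sketch. The is_in helper and the checked list are hypothetical names that exist only to record which words were actually tested:

```python
checked = []  # records every word that was actually tested


def is_in(word, message):
    """Substring test that logs the word it was asked about."""
    checked.append(word)
    return word in message


banned_words = ["bad", "worse", "worst"]
message = "that was a bad play"

# The generator expression is consumed lazily, so any() stops at the first hit.
found = any(is_in(word, message) for word in banned_words)

print(found)    # True
print(checked)  # ['bad']
```

Only the first word is tested because it already matches. Building the whole list first, as Python 2's map does, would test all three words before any sees a single result.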

P.s.

Regarding bannedWords.split(): it is common in Python to see word lists and the like built from a multi-line string literal:

bannedWords = """
banned
words
are
bad
mmkay
""".split()

Answer 2

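If you want exact matches, use a set of words, call lower on the string, and check whether the set of bad words is disjoint from the words in the message: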

banned_set = {"badword1", "badword2"}
if banned_set.isdisjoint(message.lower().split()):
    # no bad words
    ...

If "foo" is banned but "foobar" is perfectly valid, then using in/__contains__ will wrongly filter the word, so you need to decide carefully which way to go.

If banned_set.isdisjoint(message.lower().split()) evaluates to True, it is safe to proceed:

In [3]: banned_set = {"badword1", "badword2"}

In [4]: banned_set.isdisjoint("foo bar".split())
Out[4]: True

In [5]: banned_set.isdisjoint("foo bar badword1".split())
Out[5]: False
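One caveat with plain split() is that punctuation stays attached to words, so "badword1!" would not equal "badword1". A possible workaround, sketched here for Python 3 and not taken from the answer itself, is to pull out word characters with a regular expression first. The function name has_banned_word is hypothetical:

```python
import re

banned_set = {"badword1", "badword2"}


def has_banned_word(message, banned=banned_set):
    """Return True if any whole word in message is in the banned set.

    Words are runs of letters, digits, and underscores, so trailing
    punctuation such as "!" is ignored.
    """
    words = re.findall(r"\w+", message.lower())
    return not banned.isdisjoint(words)


print(has_banned_word("foo bar BADWORD1!"))    # True
print(has_banned_word("badword1bar is fine"))  # False
```

This keeps the exact-match semantics: a banned word that is only part of a longer word does not trigger a match.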
