Multithreading for a socket connection in Python

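I'm trying to grab keywords from a very busy Twitch chat, but sometimes the socket stalls for a split second, and in that split second about 5 messages can go by. I thought about implementing some multithreading, but had no luck with the code below. It seems that either none of the threads catch the keyword, or all of them do. Any help is appreciated. The code is below: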

import os
import time
from dotenv import load_dotenv
import socket
import logging
from emoji import demojize

import threading

# loading environment variables
load_dotenv()

# variables for socket
server = "irc.chat.twitch.tv"
port = 6667
nickname = "frankied003"
token = os.getenv("TWITCH_TOKEN")
channel = "#xqcow"

# creating the socket and connecting
sock = socket.socket()
sock.connect((server, port))
sock.send(f"PASS {token}\n".encode("utf-8"))
sock.send(f"NICK {nickname}\n".encode("utf-8"))
sock.send(f"JOIN {channel}\n".encode("utf-8"))

while True:
    consoleInput = input(
        "Enter correct answer to the question (use a ',' for multiple answers):"
    )

    # if console input is stop, the code will stop ofcourse lol
    if consoleInput == "stop":
        break

    # make array of all the correct answers
    correctAnswers = consoleInput.split(",")
    correctAnswers = [answer.strip().lower() for answer in correctAnswers]

    def threadingFunction():

        correctAnswerFound = False

        # while the correct answer is not found, the chats will keep on printing
        while correctAnswerFound is not True:

            while True:
                try:
                    resp = sock.recv(2048).decode(
                        "utf-8"
                    )  # sometimes this fails, hence retry until it succeeds
                except:
                    continue
                break

            if resp.startswith("PING"):
                sock.send("PONG\n".encode("utf-8"))

            elif len(resp) > 0:
                username = resp.split(":")[1].split("!")[0]
                message = resp.split(":")[2]
                strippedMessage = " ".join(message.split())

                # once the answer is found, the chats will stop, correct answer is highlighted in green, and onto next question
                if str(strippedMessage).lower() in correctAnswers:
                    print(bcolors.OKGREEN + username + " - " + message + bcolors.ENDC)
                    correctAnswerFound = True
                else:
                    if username == nickname:
                        print(bcolors.OKCYAN + username + " - " + message + bcolors.ENDC)
                    # else:
                        # print(username + " - " + message)
    
    t1 = threading.Thread(target=threadingFunction)
    t2 = threading.Thread(target=threadingFunction)
    t3 = threading.Thread(target=threadingFunction)

    t1.start()
    time.sleep(.3)
    t2.start()
    time.sleep(.3)
    t3.start()
    time.sleep(.3)

    t1.join()
    t2.join()
    t3.join()

First of all, having 3 threads read from the same socket in parallel doesn't make much sense; it only leads to confusion and race conditions.

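But the main problem is that you assume a single recv will always read exactly one message. That is not how TCP works. TCP has no concept of messages, only a byte stream. Messages are an application-level concept. A single recv might return one message, several messages, part of a message...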

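So you actually have to parse the data you get according to the semantics defined by the application protocol, i.e.

  1. Initialize a buffer
  2. Get some data from the socket and append it to the buffer (do not decode the data yet)
  3. Extract all complete messages from the buffer, then decode and process each message separately
  4. Leave any remaining incomplete message in the buffer
  5. Continue with step 2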
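The steps above could be sketched roughly as follows. This is only an illustration, not a drop-in fix: the helper name `extract_messages` is made up here, the sample chunks stand in for what successive recv calls might return, and it assumes messages are terminated by `\r\n`, as IRC specifies.

```python
def extract_messages(buffer: bytes, delimiter: bytes = b"\r\n"):
    """Split a byte buffer into complete decoded messages and leftover bytes.

    Hypothetical helper: assumes every message ends with `delimiter`.
    """
    # Everything before the last delimiter is complete; the tail may be partial.
    *complete, remainder = buffer.split(delimiter)
    # Decode each complete message on its own, only after it has been isolated.
    messages = [raw.decode("utf-8", errors="replace") for raw in complete if raw]
    return messages, remainder


# Simulated recv() results that do not line up with message boundaries.
chunks = [
    b"PING :tmi.twitch.tv\r\n:a!a@a PRIVMSG #c :he",
    b"llo\r\n:b!b@b PRIV",
    b"MSG #c :hi\r\n",
]

buffer = b""    # step 1: initialize the buffer
received = []
for chunk in chunks:
    buffer += chunk                              # step 2: append raw bytes
    messages, buffer = extract_messages(buffer)  # steps 3 and 4
    received.extend(messages)

for message in received:
    print(message)
```

Each entry in `received` is one whole protocol line, so checks such as `startswith("PING")` or splitting on `:` operate on a single message rather than on whatever a recv call happened to return.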

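Apart from that, don't blindly discard errors raised during recv(..).decode(..). Given that you are using a blocking socket, recv will usually only fail if there is a fatal problem with the connection, in which case retrying won't help. The problem is most likely that you are calling decode on an incomplete message, which can also mean invalid utf-8 encoding. But because you simply ignore the problem, you are effectively losing messages.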
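One way a receive loop without the bare `except` might look is sketched below. The generator name `iter_messages` is hypothetical, `\r\n` is assumed as the message terminator, and `socket.socketpair()` is used only to simulate a server locally. Errors from recv are allowed to propagate, an empty result is treated as the peer closing the connection, and decoding happens per complete line, so a multi-byte character split across two reads is not lost.

```python
import socket


def iter_messages(sock, bufsize=2048):
    """Yield complete decoded lines from a blocking socket (hypothetical helper)."""
    buffer = b""
    while True:
        # No try/except here: on a blocking socket an OSError means the
        # connection is broken, and retrying would not help.
        data = sock.recv(bufsize)
        if not data:
            return  # the peer closed the connection
        buffer += data
        # Keep the trailing, possibly incomplete, part in the buffer.
        *lines, buffer = buffer.split(b"\r\n")
        for raw in lines:
            yield raw.decode("utf-8", errors="replace")


# Local simulation: the payload is sent in two pieces, cut in the middle
# of the two-byte UTF-8 encoding of "é".
left, right = socket.socketpair()
payload = ":a!a@a PRIVMSG #c :caf\u00e9\r\n".encode("utf-8")
left.sendall(payload[:-3])
left.sendall(payload[-3:])
left.close()

messages = list(iter_messages(right))
right.close()
print(messages)
```

With this shape, a single thread can own the socket and do all reading, which also avoids the race between several readers mentioned at the start.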