Robust continuous TCP connection (Python socket)

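My goal is to have a continuous and stable TCP connection between one server and one client. If one side fails, the other side should wait until it recovers.

I wrote the code below based on this post (which only asks about continuous, not robust, TCP connections and does not deal with keepalive issues), this post, and my own experience.

I have two questions:

  1. How can I make the keepalive work? If the server dies, the client only notices after it tries to send(), and that also works without the KEEPALIVE option, because it results in a connection reset. Is there some way for the socket to raise an interrupt for a dead connection, or some keepalive function that I can check on a regular basis?

  2. Is this a robust way of handling a continuous TCP connection? Having a stable, continuous TCP connection seems to be a standard problem, but I could not find tutorials covering it in detail. There must be some best practice.

Note that I could handle keepalive messages myself at the application level. However, since TCP already implements this at the transport level, it seems better to rely on the service provided by the lower level.

Server: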

from socket import *
serverPort = 12000

while True:
    # 1. Configure server socket
    serverSocket = socket(AF_INET, SOCK_STREAM)
    serverSocket.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
    serverSocket.bind(('127.0.0.1', serverPort))
    serverSocket.listen(1)
    print("waiting for client connecting...")
    connectionSocket, addr = serverSocket.accept()
    connectionSocket.setsockopt(SOL_SOCKET, SO_KEEPALIVE,1)
    print(connectionSocket.getsockopt(SOL_SOCKET,SO_KEEPALIVE))
    print("...connected.")
    serverSocket.close() # Destroy the server socket; we don't need it anymore since we are not accepting any connections beyond this point.

    # 2. communication routine
    while True:
        try:
            sentence = connectionSocket.recv(512).decode()
        except ConnectionResetError as e:
            print("Client connection closed")
            break
        if(len(sentence)==0): # close if client closed connection
            break 
        else:
            print("recv: "+str(sentence))

    # 3. proper closure
    connectionSocket.shutdown(SHUT_RDWR)
    connectionSocket.close()
    print("connection closed.")

Client:

from socket import *
import time

while True:
    # 1. configure socket dest.
    serverName = '127.0.0.1'
    serverPort = 12000
    clientSocket = socket(AF_INET, SOCK_STREAM)
    try:
        clientSocket.setsockopt(SOL_SOCKET, SO_KEEPALIVE,1)
        clientSocket.connect((serverName, serverPort))
        print(clientSocket.getsockopt(SOL_SOCKET,SO_KEEPALIVE))
    except ConnectionRefusedError as e:
        print("Server refused connection. retrying")
        time.sleep(1)
        continue

    # 2. communication routine
    while(1):
        sentence = input('input sentence: ')
        if(sentence == "close"):
            break
        try:
            clientSocket.send(sentence.encode())
        except ConnectionResetError as e:
            print("Server connection closed")
            break

    # 3. proper closure
    clientSocket.shutdown(SHUT_RDWR)
    clientSocket.close()

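I tried to keep this example minimal, but given the robustness requirement it is still fairly long.

I also tried some socket options such as TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT.

Thanks!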

How can I make the keepalive work? If the server dies, the client only recognizes it after trying to send() - which worked also without the KEEPALIVE option as this results in a connection reset.

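Keepalive is more useful on the server side, or more generally on the reading side. It is a tricky beast: the socket will not notify you of anything unless you read from or write to it. You can query its state (although I am not sure this is possible in standard Python), but that still does not solve the notification problem. Either way, you have to check the state periodically.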
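
As a rough illustration of the point above, here is a minimal sketch of turning on keepalive with explicit timers and of polling the socket's pending error. The helper names `enable_keepalive` and `pending_error` and the timer values are my own assumptions, not part of the question's code. The tuning constants are platform-dependent (for example, `TCP_KEEPIDLE` is not exposed everywhere), so each one is guarded. Even with this in place, a dead peer only shows up as an error on the next `recv()`/`send()` or in the polled error code, roughly `idle + interval * count` seconds after the last traffic.

```python
import socket

def enable_keepalive(sock, idle=10, interval=5, count=3):
    """Turn on TCP keepalive and tune its timers where the platform allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These options are not defined on every platform, so guard each one.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of silence before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

def pending_error(sock):
    """Return the socket's pending error code (0 if none) without sending data."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
```

A reader thread blocked in `recv()` is usually the simplest way to be told about the failure, since the call returns with an error once the probes give up.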

Is there some way that the socket sends an interrupt for a connection that is dead or some keepalive function that I can check on a regular basis?

Have you heard of the Two Generals' Problem? There is no reliable way to detect that one side has died. With pings and timeouts, however, we can get close.

Note, I could handle keep alive messages on my own at the application level. However, as TCP already implements this at transport level, it is better to rely on this service provided by the lower level.

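No, it is not better. If for whatever reason there is a proxy between the server and the client, the TCP feature will not help, because by design it only covers a single connection, and with a proxy you have at least two. You should not think about your connection in terms of the underlying transport (TCP). Instead, create your own protocol with a ping command that the server (or the client, or both) sends periodically, with a timeout. That way you can be sure the peer was alive within the last interval.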
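
A minimal sketch of such an application-level ping, assuming a made-up wire format in which one side sends `PING\n` and the other answers `PONG\n`. The message format and the function names are illustrative only; a real protocol would interleave these messages with its normal traffic.

```python
import socket

PING = b"PING\n"
PONG = b"PONG\n"

def recv_exact(sock, n):
    """Read exactly n bytes, or raise ConnectionError if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def is_peer_alive(sock, timeout=2.0):
    """Send one PING and wait up to `timeout` seconds for the matching PONG."""
    sock.settimeout(timeout)
    try:
        sock.sendall(PING)
        return recv_exact(sock, len(PONG)) == PONG
    except OSError:  # covers timeouts, resets, and closed connections
        return False

def answer_pings(sock):
    """Peer side: reply PONG to every PING until the connection ends."""
    try:
        while recv_exact(sock, len(PING)) == PING:
            sock.sendall(PONG)
    except OSError:
        pass
```

Calling `is_peer_alive()` on a timer and reconnecting when it returns `False` bounds how long a dead peer can go unnoticed. Because the check runs end to end, it also works through a proxy.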

Is this a robust way of handling a continuous TCP connection? Having a stable, continuous TCP connection seems to be a standard problem, however, I couldn't find tutorials covering this in detail. There must be some best-practice.

You will not find a tutorial on this because the problem has no solution. Most people use a combination of pings and timeouts to simulate "I'm still alive".

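I will try to answer both questions.

  1. ... Is there some way that the socket sends an interrupt for a connection that is dead ...

    None that I know of. TCP_KEEPALIVE only tries to keep the connection alive. That is very useful if any device on the network path has an idle timeout, because it prevents that timeout from dropping the connection. But if the connection breaks for any reason other than a timeout, TCP_KEEPALIVE can do nothing. The rationale is that there is no need to restore a broken, inactive connection until something has to be exchanged.

  2. Is this a robust way of handling a continuous TCP connection?

    Not really.

    The robust way is to be ready at any time for the connection to fail for any reason. So you should be prepared to hit an error when sending a message (your code is), and when that happens, try to reopen the connection and send the message again (your current code does not). Something like: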

    def connect(...):
        # establish and return a connection
        ...
        return clientSocket
    
    clientSocket = connect(...)
    while True:
        ...
        while True:
            try:
                clientSocket.send(message)
                break
            except OSError:
                clientSocket = connect(...)  # reconnect, then retry the send
        ...
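
The outline above could be fleshed out along these lines. This is only a sketch under my own assumptions: the `host`/`port` parameters, the exponential backoff, and the name `send_with_reconnect` are illustrative and not taken from the question. Note that a successful `sendall()` only means the data was handed to the local TCP stack, so a message sent just before a failure can still be lost, and a resent one can arrive twice unless the application protocol acknowledges messages.

```python
import socket
import time

def connect(host, port, first_delay=0.5, max_delay=30.0):
    """Keep trying until a TCP connection to (host, port) is established."""
    delay = first_delay
    while True:
        try:
            sock = socket.create_connection((host, port), timeout=5.0)
            sock.settimeout(None)  # back to blocking mode once connected
            return sock
        except OSError:
            time.sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped

def send_with_reconnect(sock, data, host, port):
    """Send `data`, reconnecting and resending on failure; return the live socket."""
    while True:
        try:
            sock.sendall(data)
            return sock
        except OSError:
            sock.close()
            sock = connect(host, port)
```

The caller keeps the returned socket for the next send, for example `clientSocket = send_with_reconnect(clientSocket, message, serverName, serverPort)`.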
    

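Unrelated: your graceful shutdown is not correct. The initiator (the side that calls shutdown) should not close the socket immediately. It should start a read loop and close the socket only once everything has been received and processed.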
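
A minimal sketch of that shutdown sequence, assuming the initiator only stops writing (`SHUT_WR`) and then drains whatever the peer still sends until end of stream. The function name `graceful_close` and the timeout value are illustrative assumptions.

```python
import socket

def graceful_close(sock, timeout=5.0):
    """Stop sending, read until the peer closes its side, then close the socket.

    Returns any bytes that arrived after the shutdown was initiated.
    """
    try:
        sock.shutdown(socket.SHUT_WR)  # tell the peer we are done sending
    except OSError:
        pass  # the connection is already gone
    sock.settimeout(timeout)  # do not wait forever for a silent peer
    leftover = bytearray()
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the peer has closed its side too
                break
            leftover += chunk
    except OSError:
        pass  # timeout or reset: give up waiting
    finally:
        sock.close()
    return bytes(leftover)
```

In the question's code this would replace the `shutdown(SHUT_RDWR)` and `close()` pair on whichever side initiates the close, with the returned bytes processed like any other received data.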