What is "backlog" in TCP connections?

Below is a Python program that acts as a server listening for connection requests on port 9999:

# server.py 
import socket                                         
import time

# create a socket object
serversocket = socket.socket(
            socket.AF_INET, socket.SOCK_STREAM) 

# get local machine name
host = socket.gethostname()                           

port = 9999                                           

# bind to the port
serversocket.bind((host, port))                                  

# queue up to 5 requests
serversocket.listen(5)                                           

while True:
    # establish a connection
    clientsocket,addr = serversocket.accept()      

    print("Got a connection from %s" % str(addr))
    currentTime = time.ctime(time.time()) + "\r\n"
    clientsocket.send(currentTime.encode('ascii'))
    clientsocket.close()

The question is what the argument to socket.listen() (here, 5) actually does.

According to tutorials around the internet:

The backlog argument specifies the maximum number of queued connections and should be at least 0; the maximum value is system-dependent (usually 5), the minimum value is forced to 0.

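But:

  1. What are these queued connections?
  2. Does it make any difference for client requests? (I mean, is a server running with socket.listen(5) different from a server running with socket.listen(1) in accepting connection requests or in receiving data?)
  3. Why is the minimum value zero? Shouldn't it be at least 1?
  4. Is there a preferred value?
  5. Is this backlog defined for TCP connections only, or does it apply to UDP and other protocols too?

Note: I have no background in Python, but the question is language-independent, so it can be answered in general terms.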

What are these queued connections?

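In simple words, the backlog parameter specifies the number of pending connections the queue will hold.

When multiple clients connect to the server, the server holds the incoming requests in a queue. The clients wait in that queue, and the server handles their requests one by one as each queue member reaches the front. Connections waiting in this way are called queued connections.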

Does it make any difference for client requests? (I mean is the server that is running with socket.listen(5) different from the server that is running with socket.listen(1) in accepting connection requests or in receiving data?)

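Yes, the two cases differ. The first allows up to 5 clients to wait in the queue, whereas with backlog=1 only 1 connection can be held in the queue, so further connection requests are dropped.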
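A minimal sketch of how this difference could be observed on a local machine, assuming a loopback listener that deliberately never calls accept(). The helper name count_completed_connections and its parameters are hypothetical, and the exact count is OS-dependent: some kernels let slightly more than backlog connections through, and some silently ignore the extra SYNs so the client only times out.

```python
import socket


def count_completed_connections(backlog, attempts=5, timeout=0.3):
    """Open a listener that never calls accept() and count how many
    client connect() calls still complete within `timeout` seconds."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    server.listen(backlog)
    address = server.getsockname()

    clients = []
    completed = 0
    try:
        for _ in range(attempts):
            client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            client.settimeout(timeout)
            clients.append(client)     # keep it open so it stays queued
            try:
                client.connect(address)
                completed += 1
            except OSError:
                # Timed out or refused: the queue had no room for us.
                pass
    finally:
        for client in clients:
            client.close()
        server.close()
    return completed
```

Calling count_completed_connections(1, attempts=8) and count_completed_connections(5, attempts=8) and comparing the two results should show the larger backlog letting more clients through before the rest start failing, though the precise numbers depend on the platform.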

Why is the minimum value zero? Shouldn't it be at least 1?

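I don't know about Python, but according to this source, in C a backlog argument of 0 may still allow the socket to accept connections, in which case the length of the listen queue may be set to an implementation-defined minimum value.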

Is there a preferred value?

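This question has no well-defined answer. I'd say it depends on the nature of your application, as well as on the hardware and software configuration. Again according to the source, the backlog is silently limited to between 1 and 5, inclusive (again, for C).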

Is this backlog defined for TCP connections only or does it apply for UDP and other protocols too?

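No. Note that unconnected datagram sockets (UDP) do not need listen() or accept() at all. This is one of the perks of using unconnected datagram sockets!

But keep in mind that there are also TCP-based datagram socket implementations (called TCPDatagramSocket), and those do have a backlog parameter.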
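To illustrate the point about unconnected datagram sockets, here is a small sketch, assuming both ends run on the loopback interface. The function name udp_roundtrip is made up for this example. The receiving socket is only bound; listen() and accept() are never called, so no backlog is involved.

```python
import socket


def udp_roundtrip(payload=b"ping"):
    """Send one datagram over loopback. The receiver never calls
    listen() or accept(); bind() alone is enough for UDP."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a port
        receiver.settimeout(2.0)          # do not block forever if lost
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(1024)
        return data
    finally:
        sender.close()
        receiver.close()
```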

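When a TCP connection is being established, the so-called three-way handshake is performed. Both sides exchange some packets, and once they have done so the connection is called complete and is ready to be used by the application.

However, this three-way handshake takes some time. During that time the connection is queued, and this is the backlog. So you can set the maximum number of incomplete parallel connections via the .listen(no) call (note that according to the POSIX standard the value is only a hint; it may be ignored entirely). If someone tries to establish a connection above the backlog limit, the other side will refuse it.

So the backlog limit is about pending connections, not established ones.

A higher backlog limit will be better in most cases. Note that the maximum limit is OS-dependent, e.g. cat /proc/sys/net/core/somaxconn gives me 128 on my Ubuntu.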
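The same limit can be read from Python. This is only a sketch: the /proc path is Linux-specific, the helper name read_somaxconn is hypothetical, and it returns None on systems where the file does not exist. The constant socket.SOMAXCONN comes from the C headers Python was built against and may differ from the live kernel setting.

```python
import socket

SOMAXCONN_PATH = "/proc/sys/net/core/somaxconn"  # Linux-specific location


def read_somaxconn(path=SOMAXCONN_PATH):
    """Return the kernel's current backlog cap as an int,
    or None if it cannot be read on this platform."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("compile-time SOMAXCONN:", socket.SOMAXCONN)
    print("runtime somaxconn:", read_somaxconn())
```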

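The function of the parameter appears to be to limit the number of incoming connection requests a server will keep in a queue, on the assumption that it can serve the current request and a small number of queued pending requests in a reasonable amount of time while under high load. Here is a good paragraph I came across that lends a little context to this argument...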

Finally, the argument to listen tells the socket library that we want it to queue up as many as 5 connect requests (the normal max) before refusing outside connections. If the rest of the code is written properly, that should be plenty.

https://docs.python.org/3/howto/sockets.html#creating-a-socket

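Text earlier in the same document suggests that clients should dip in and out of a server, so that a long queue of requests never builds up in the first place...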

When the connect completes, the socket s can be used to send in a request for the text of the page. The same socket will read the reply, and then be destroyed. That’s right, destroyed. Client sockets are normally only used for one exchange (or a small set of sequential exchanges).

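The linked HowTo guide is a must-read when getting up to speed on network programming with sockets. It really brings some big-picture themes into focus. How the server socket manages this queue, as far as implementation details go, is another story, and probably an interesting one. I suppose the motivation for this design is more telling: without it, the barrier to mounting a denial of service attack would be very, very low.

As for the reason for a minimum value of 0 rather than 1, we should keep in mind that 0 is still a valid value, meaning queue up nothing. That is essentially to say: let there be no request queue, and just reject connections outright if the server socket is currently serving a connection. The fact that a connection is currently being served should always be kept in mind here; it is the only reason a queue would be of interest in the first place.

This brings us to the next question, regarding a preferred value. This is all a design decision: do you want to queue up requests or not? If so, you may pick a value you feel is warranted based on expected traffic and known hardware resources, I suppose. I doubt there is anything formulaic in picking a value. This makes me wonder how lightweight a request is in the first place, such that you would face a penalty for queuing anything up on the server.


UPDATE

I wanted to substantiate the comments from user207421 and went to look up the Python source. Unfortunately this level of detail is not to be found in the sockets.py source but rather in socketmodule.c#L3351-L3382, as of hash 530f506.

The comments are very illuminating. I'll copy the source verbatim below and single out here the clarifying comments that help the most...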

We try to choose a default backlog high enough to avoid connection drops for common workloads, yet not too high to limit resource usage.

If backlog is specified, it must be at least 0 (if it is lower, it is set to 0); it specifies the number of unaccepted connections that the system will allow before refusing new connections. If not specified, a default reasonable value is chosen.

/* s.listen(n) method */

static PyObject *
sock_listen(PySocketSockObject *s, PyObject *args)
{
    /* We try to choose a default backlog high enough to avoid connection drops
     * for common workloads, yet not too high to limit resource usage. */
    int backlog = Py_MIN(SOMAXCONN, 128);
    int res;

    if (!PyArg_ParseTuple(args, "|i:listen", &backlog))
        return NULL;

    Py_BEGIN_ALLOW_THREADS
    /* To avoid problems on systems that don't allow a negative backlog
     * (which doesn't make sense anyway) we force a minimum value of 0. */
    if (backlog < 0)
        backlog = 0;
    res = listen(s->sock_fd, backlog);
    Py_END_ALLOW_THREADS
    if (res < 0)
        return s->errorhandler();
    Py_RETURN_NONE;
}

PyDoc_STRVAR(listen_doc,
"listen([backlog])\n\
\n\
Enable a server to accept connections.  If backlog is specified, it must be\n\
at least 0 (if it is lower, it is set to 0); it specifies the number of\n\
unaccepted connections that the system will allow before refusing new\n\
connections. If not specified, a default reasonable value is chosen.");
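The behaviour described in the docstring above can be exercised from Python. The following is only a sketch, assuming CPython 3.5 or later (where the backlog argument became optional); the function name listen_variants is invented for the example. It checks that listen() accepts no argument, an explicit 0, and a negative value (which the C code above clamps to 0).

```python
import socket


def listen_variants():
    """Call listen() with no backlog, with 0, and with a negative
    backlog, and report which calls succeeded."""
    results = {}
    for label, args in (("default", ()), ("zero", (0,)), ("negative", (-1,))):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
            s.listen(*args)            # negative values are raised to 0
            results[label] = True
        except OSError:
            results[label] = False
        finally:
            s.close()
    return results


if __name__ == "__main__":
    print(listen_variants())
```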

Going further down into the externals, I traced the following call from socketmodule...

 res = listen(s->sock_fd, backlog);

This call ends up in socket.h and socket.c, using Linux as a concrete platform backdrop for discussion purposes.

/* Maximum queue length specifiable by listen.  */
#define SOMAXCONN   128
extern int __sys_listen(int fd, int backlog);

More information can be found in the man page:

http://man7.org/linux/man-pages/man2/listen.2.html

int listen(int sockfd, int backlog);

And the corresponding description:

listen() marks the socket referred to by sockfd as a passive socket, that is, as a socket that will be used to accept incoming connection requests using accept(2).

The sockfd argument is a file descriptor that refers to a socket of type SOCK_STREAM or SOCK_SEQPACKET.

The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.

One additional source identifies the kernel as being responsible for the backlog queue.

The second argument backlog to this function specifies the maximum number of connections the kernel should queue for this socket.

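They go on to describe briefly how the unaccepted, queued connections are partitioned within the backlog (the linked source includes a useful figure).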

To understand the backlog argument, we must realize that for a given listening socket, the kernel maintains two queues:

An incomplete connection queue, which contains an entry for each SYN that has arrived from a client for which the server is awaiting completion of the TCP three-way handshake. These sockets are in the SYN_RCVD state (Figure 2.4).

A completed connection queue, which contains an entry for each client with whom the TCP three-way handshake has completed. These sockets are in the ESTABLISHED state (Figure 2.4). These two queues are depicted in the figure below:

When an entry is created on the incomplete queue, the parameters from the listen socket are copied over to the newly created connection. The connection creation mechanism is completely automatic; the server process is not involved.
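The claim that the server process is not involved in completing the handshake can be checked with a short sketch, assuming a loopback connection. The name data_before_accept is hypothetical. The client connects and sends data before the server has called accept() at all; the kernel completes the handshake and holds the connection on the completed queue until accept() picks it up.

```python
import socket


def data_before_accept(message=b"hello"):
    """Connect and send before the server calls accept(), then
    accept and read what the kernel buffered in the meantime."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a port
        server.listen(1)
        server.settimeout(2.0)

        client.settimeout(2.0)
        client.connect(server.getsockname())  # handshake done by kernel
        client.sendall(message)               # sent before any accept()

        conn, _addr = server.accept()         # dequeue completed connection
        try:
            conn.settimeout(2.0)
            return conn.recv(1024)
        finally:
            conn.close()
    finally:
        client.close()
        server.close()
```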