SocketIO hangs at 250 concurrent connections

The problem is basically that socketio hangs at 250 concurrent connections. Whether I run node app.js --nouse-idle-notification --max-old-space-size=8192 (with ulimit -n 2048 and --expose-gc) or use pm2, the result is the same.

I am running CentOS 7. I have 30 GB of RAM, and it is a decent VPS.

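I have tested on both my local machine and the production server, with and without pm2, with and without clustering. It always stops at 252/256 and says there are no more resources. I am only connecting, not sending anything else.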

This is the most basic example I have used.

import express from "express";
import http from "http";
import { Server } from "socket.io";

// Minimal Express + Socket.IO server: it only accepts connections.
const app = express();
const server = http.createServer(app);
const io = new Server(server);

io.on('connection', (socket) => {
  console.log('a user connected');
});

server.listen(3000, () => {
  console.log('listening on *:3000');
});

I have SocketIO running on an SSL domain, behind a reverse proxy and with clustered nodes. The server is Apache.

The problem is simple. Once the server reaches 256 connections, it shuts down, which is odd.

The logs, captured with pm2 logs > yourlogFile.txt &, show reason transport close.

I ran the stress test with npx artillery run my-scenario.yml. The yml file is the default one from the socket.io documentation, but set to use websocket as the only transport.

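app.js uses Redis for the cluster adapter. I am using admin-ui to monitor the connections. It shows 6 servers created, and then once it reaches 256 connections (1 connection from admin-ui and 255 from the artillery stress test), it shuts down.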

import {createServer} from "http";
const httpServer = createServer(app);
import {Server} from "socket.io";
import {createAdapter} from "@socket.io/redis-adapter";
import {setupWorker} from "@socket.io/sticky";
import { createClient } from "redis";
const pubClient = createClient({ host: 'localhost', port: 6379 });
const subClient = pubClient.duplicate();
const io = new Server(httpServer, {...})

The Apache configuration is simple:

SSLEngine on
      ProxyRequests off
      ProxyPass "/websocket/socket" balancer://nodes_ws/
      ProxyPassReverse "/websocket/socket" balancer://nodes_ws/
      ProxyTimeout 3

Header add Set-Cookie "BlazocketServer=sticky.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
      <Proxy "balancer://nodes_polling">
        BalancerMember "https://localhost:3000" route=app01
        BalancerMember "https://localhost:3001" route=app02
        BalancerMember "https://localhost:3002" route=app03
        ProxySet stickysession=BlazocketServer
      </Proxy>

      <Proxy "balancer://nodes_ws">
          BalancerMember "ws://localhost:3000" route=app01
          BalancerMember "ws://localhost:3001" route=app02
          BalancerMember "ws://localhost:3002" route=app03
          ProxySet stickysession=BlazocketServer
      </Proxy>

      RewriteEngine On
      #RewriteCond %{QUERY_STRING} transport=polling
      #RewriteRule /(.*)$ http://localhost:3000/ [P]


      RewriteCond %{HTTP:Upgrade} =websocket [NC]
      RewriteRule /(.*) balancer://nodes_ws/ [P,L]

      RewriteCond %{QUERY_STRING} transport=polling
      RewriteRule /(.*) balancer://nodes_polling/ [P,L]

The scenario is simple:

config:
  target: "myurl"
  socketio:
      path: "mypath"
      transports: ["websocket"]
      
  phases:
    - duration: 60
      arrivalRate: 10
  engines:
   socketio-v3: {}

scenarios:
  - name: My sample scenario
    engine: socketio-v3
    flow:
      # wait for the WebSocket upgrade (optional)
      - think: 1

      # basic emit
      - emit:
          channel: "hello"
          data: "world"

      # emit an object
      - emit:
          channel: "hello"
          data:
            id: 42
            status: "in progress"
            tags:
              - "tag1"
              - "tag2"


      # emit with acknowledgement
      - emit:
          channel: "ping"
        acknowledge:
          match:
            value: "pong"

      # do nothing for 30 seconds then disconnect
      - think: 30

After checking manually by connecting from a for loop at timed intervals, I got this failure: failed: Insufficient resources
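For anyone who wants to repeat that manual check, here is a rough sketch of what such a timed connection loop might look like. It is not the asker's actual script. The function name openConnections and its connect parameter are made up for illustration; connect is assumed to return a socket-like object that fires "connect" or "connect_error", which is how a socket.io-client socket behaves.

```javascript
// Open `total` connections, one every `intervalMs` milliseconds, and report
// how many succeeded and at which attempt the first failure was observed.
// `connect` is a caller-supplied factory (hypothetical) returning an object
// with a `once(event, handler)` method, e.g. a socket.io-client socket.
function openConnections(connect, total, intervalMs) {
  return new Promise((resolve) => {
    const result = { connected: 0, failed: 0, firstFailureAt: null };
    if (total <= 0) {
      resolve(result);
      return;
    }

    let started = 0;
    let settled = 0;

    const timer = setInterval(() => {
      const index = started;
      started += 1;
      if (started >= total) clearInterval(timer);

      let done = false; // count each socket once, even if it reconnects later
      const settle = (ok) => {
        if (done) return;
        done = true;
        if (ok) {
          result.connected += 1;
        } else {
          result.failed += 1;
          if (result.firstFailureAt === null) result.firstFailureAt = index + 1;
        }
        settled += 1;
        if (settled === total) resolve(result);
      };

      const socket = connect(index);
      socket.once("connect", () => settle(true));
      socket.once("connect_error", () => settle(false));
    }, intervalMs);
  });
}
```

With socket.io-client this could be called as openConnections(() => io(url, { transports: ["websocket"], reconnection: false, forceNew: true }), 300, 100), where url is a placeholder for your own endpoint. If firstFailureAt keeps landing near 256 no matter how the Node side is configured, that points at something in front of Node rather than at socket.io itself.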

Try multiplexing to reduce the number of socket connections.

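I think the problem lies with Apache and its MaxClients setting (called MaxRequestWorkers since Apache 2.4), which defaults to 256 under the prefork MPM.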

A thread on serverfault explains in full detail how to change the setting.

To understand how threads on Apache translate into the maximum number of clients that can be served, the documentation and this discussion are good references.
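As a rough illustration of that arithmetic, the sketch below estimates the connection ceiling from the MPM directives. The function names and the shape of the config object are made up for this example. It also assumes that every proxied WebSocket keeps one Apache worker busy for as long as it stays open, which may not hold for every MPM and mod_proxy version.

```javascript
// Estimate how many simultaneous connections Apache will serve.
// prefork: one process per connection, capped by ServerLimit.
// worker/event: one thread per connection, capped by ServerLimit * ThreadsPerChild.
// MaxRequestWorkers cannot usefully exceed that cap.
function maxConcurrentClients({ mpm, serverLimit, threadsPerChild = 1, maxRequestWorkers }) {
  const ceiling = mpm === "prefork" ? serverLimit : serverLimit * threadsPerChild;
  return Math.min(maxRequestWorkers, ceiling);
}

// Assumption: each long-lived WebSocket occupies one worker slot.
function fits(config, plannedConnections) {
  return plannedConnections <= maxConcurrentClients(config);
}

// Documented prefork defaults: ServerLimit 256, MaxRequestWorkers 256.
const preforkDefaults = { mpm: "prefork", serverLimit: 256, maxRequestWorkers: 256 };
console.log(maxConcurrentClients(preforkDefaults)); // 256
console.log(fits(preforkDefaults, 255 + 1));        // true, but with no slot to spare
console.log(fits(preforkDefaults, 300));            // false

// Documented worker/event defaults: ServerLimit 16, ThreadsPerChild 25.
const eventDefaults = { mpm: "event", serverLimit: 16, threadsPerChild: 25, maxRequestWorkers: 400 };
console.log(maxConcurrentClients(eventDefaults));   // 400
```

Under prefork defaults, the 255 artillery connections plus the one admin-ui connection use up all 256 slots, which matches where the test stalls. Raising MaxRequestWorkers past 256 on prefork also requires raising ServerLimit, otherwise the lower cap still applies.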