SocketIO hangs at 250 concurrent connections
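The problem is basically that Socket.IO hangs at around 250 concurrent connections. Whether I start it with node app.js --nouse-idle-notification --max-old-space-size=8192 ulimit -n 2048 --expose-gc or with pm2, the result is the same.
I am running CentOS 7. I have 30 GB of RAM, and it is a decent VPS.
I have tested locally and on the production server, with and without pm2, and with and without clustering. It always stops at 252/256 and reports that there are no more resources. I am only connecting, not sending anything else.
Here is the most basic example I have used.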
import express from "express";
import http from "http";
import { Server } from "socket.io";

const app = express();
const server = http.createServer(app);
const io = new Server(server);

// Log every new Socket.IO connection
io.on('connection', (socket) => {
  console.log('a user connected');
});

server.listen(3000, () => {
  console.log('listening on *:3000');
});
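To see where the count stalls, it may help to count open sockets at the TCP level, independently of Socket.IO. The following is only a minimal sketch and not part of the original setup: `trackConnections` is a hypothetical helper name, and it assumes a server-like object that emits "connection" events with sockets that emit "close", which is how Node's `http.Server` behaves.

```javascript
// Hypothetical helper (not from the original post): counts open TCP
// connections on any server-like object that emits "connection" events
// and whose sockets emit "close".
function trackConnections(server, onChange = () => {}) {
  const stats = { open: 0, peak: 0 };

  server.on("connection", (socket) => {
    stats.open += 1;
    stats.peak = Math.max(stats.peak, stats.open);
    onChange(stats);

    // "close" fires once per socket, whichever side ended the connection.
    socket.once("close", () => {
      stats.open -= 1;
      onChange(stats);
    });
  });

  return stats;
}
```

With the basic example above, calling `trackConnections(server, (s) => console.log(s))` right after `http.createServer(app)` would show whether Node itself ever sees more than about 256 sockets. Behind a reverse proxy, the sockets counted are the ones opened by the proxy.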
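I have Socket.IO running on an SSL domain, behind a reverse proxy and with clustered nodes. The server is Apache.
The problem is simple: as soon as the server reaches 256 connections, it shuts down, which is strange.
The logs, captured with pm2 logs > yourlogFile.txt &, show reason transport close.
I run the stress test with npx artillery run my-scenario.yml. The yml file is the default one from the socket.io documentation, except that it is set to use websocket as the only transport.
app.js uses the Redis adapter for the cluster. I am using admin-ui to monitor connections. It shows 6 servers created, and then, as soon as it reaches 256 connections (1 through admin-ui and 255 through the artillery stress test), it shuts down.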
import { createServer } from "http";
import { Server } from "socket.io";
import { createAdapter } from "@socket.io/redis-adapter";
import { setupWorker } from "@socket.io/sticky";
import { createClient } from "redis";

const httpServer = createServer(app);

// The Redis adapter needs one client for publishing and one for subscribing
const pubClient = createClient({ host: 'localhost', port: 6379 });
const subClient = pubClient.duplicate();

const io = new Server(httpServer, {...})
The Apache configuration is simple:
SSLEngine on
ProxyRequests off
ProxyPass "/websocket/socket" balancer://nodes_ws/
ProxyPassReverse "/websocket/socket" balancer://nodes_ws/
ProxyTimeout 3
Header add Set-Cookie "BlazocketServer=sticky.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
<Proxy "balancer://nodes_polling">
    BalancerMember "https://localhost:3000" route=app01
    BalancerMember "https://localhost:3001" route=app02
    BalancerMember "https://localhost:3002" route=app03
    ProxySet stickysession=BlazocketServer
</Proxy>
<Proxy "balancer://nodes_ws">
    BalancerMember "ws://localhost:3000" route=app01
    BalancerMember "ws://localhost:3001" route=app02
    BalancerMember "ws://localhost:3002" route=app03
    ProxySet stickysession=BlazocketServer
</Proxy>
RewriteEngine On
#RewriteCond %{QUERY_STRING} transport=polling
#RewriteRule /(.*)$ http://localhost:3000/ [P]
RewriteCond %{HTTP:Upgrade} =websocket [NC]
RewriteRule /(.*) balancer://nodes_ws/ [P,L]
RewriteCond %{QUERY_STRING} transport=polling
RewriteRule /(.*) balancer://nodes_polling/ [P,L]
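As a reading aid for the two active RewriteRule blocks, the routing decision can be sketched as a small function. This is an illustration only: `pickBalancer` is a hypothetical name, Apache itself performs this matching, and the sketch ignores the ProxyPass lines and the sticky-session cookie.

```javascript
// Illustration only: mirrors the two active RewriteCond/RewriteRule pairs.
// pickBalancer is a hypothetical name, not an Apache or Socket.IO API.
function pickBalancer({ upgrade = "", queryString = "" }) {
  // RewriteCond %{HTTP:Upgrade} =websocket [NC]
  // "=" is an exact string comparison, [NC] makes it case-insensitive.
  if (upgrade.toLowerCase() === "websocket") {
    return "balancer://nodes_ws/";
  }
  // RewriteCond %{QUERY_STRING} transport=polling
  // An unanchored pattern, so it matches anywhere in the query string.
  if (/transport=polling/.test(queryString)) {
    return "balancer://nodes_polling/";
  }
  // Neither condition matched; the request is left to other directives.
  return null;
}

console.log(pickBalancer({ upgrade: "WebSocket", queryString: "EIO=4&transport=websocket" }));
console.log(pickBalancer({ queryString: "EIO=4&transport=polling" }));
```

Because the first rule carries the [L] flag, a WebSocket upgrade request is sent to nodes_ws before the polling rule is evaluated. With websocket as the only client transport, every test connection should therefore take the first branch.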
The scenario is simple:
config:
  target: "myurl"
  socketio:
    path: "mypath"
    transports: ["websocket"]
  phases:
    - duration: 60
      arrivalRate: 10
  engines:
    socketio-v3: {}

scenarios:
  - name: My sample scenario
    engine: socketio-v3
    flow:
      # wait for the WebSocket upgrade (optional)
      - think: 1

      # basic emit
      - emit:
          channel: "hello"
          data: "world"

      # emit an object
      - emit:
          channel: "hello"
          data:
            id: 42
            status: "in progress"
            tags:
              - "tag1"
              - "tag2"

      # emit with acknowledgement
      - emit:
          channel: "ping"
        acknowledge:
          match:
            value: "pong"

      # do nothing for 30 seconds then disconnect
      - think: 30
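A rough back-of-the-envelope estimate shows how quickly this scenario reaches the 256 mark. The sketch below assumes a constant arrival rate and that each virtual user stays connected for about 31 seconds (the two think steps), with emits and acknowledgements taking negligible time. That session length is an assumption, not a measurement, and `concurrentUsersAt` is a hypothetical helper name.

```javascript
// Rough estimate of concurrent virtual users for a constant arrival rate.
// Assumes each user stays connected for sessionSeconds and that emits and
// acknowledgements take negligible time (an assumption, not a measurement).
function concurrentUsersAt(t, { arrivalRate, duration, sessionSeconds }) {
  // Users that have arrived by time t (arrivals stop after `duration`).
  const arrived = arrivalRate * Math.min(t, duration);
  // Users whose session has already ended by time t.
  const finished = arrivalRate * Math.min(Math.max(t - sessionSeconds, 0), duration);
  return arrived - finished;
}

// duration and arrivalRate come from the scenario; 31 = think 1 + think 30.
const scenario = { arrivalRate: 10, duration: 60, sessionSeconds: 31 };

console.log(concurrentUsersAt(25.5, scenario)); // 255
console.log(concurrentUsersAt(31, scenario));   // 310
```

Under these assumptions the test reaches 255 concurrent connections after roughly 25 seconds and would level off near 310 if nothing limited it, so any cap around 256 would be hit on every run.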
After testing manually by opening connections at timed intervals in a for loop, it failed with: failed: Insufficient resources
Tried multiplexing to reduce the number of socket connections.
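The post does not show the manual test itself. A timed for loop of that kind could look roughly like the sketch below, where `openAtIntervals` is a hypothetical helper and `connect` is whatever function the caller supplies to open one connection. It is a sketch under those assumptions, not the code that was actually run.

```javascript
// Opens `count` connections, one every `intervalMs` milliseconds.
// `connect` is supplied by the caller and may throw; `schedule` defaults
// to setTimeout and can be swapped out (for example in tests).
function openAtIntervals(count, intervalMs, connect, schedule = setTimeout) {
  const results = [];
  for (let i = 0; i < count; i++) {
    schedule(() => {
      try {
        results.push({ index: i, ok: true, handle: connect(i) });
      } catch (err) {
        // Record the failure instead of aborting the remaining attempts.
        results.push({ index: i, ok: false, error: err.message });
      }
    }, i * intervalMs);
  }
  return results; // entries are appended as the timers fire
}
```

Only synchronous failures are recorded here. With socket.io-client, connection failures are reported asynchronously through the `connect_error` event, so a real test would also need to listen for that event on each socket.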
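I think the problem lies with Apache and its MaxClients setting, which defaults to 256.
A thread on Server Fault explains in detail how to change the setting.
To understand how threads in Apache translate into the maximum number of clients that can be served, the documentation and this discussion are good references.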
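To make the relationship between these directives concrete, here is a simplified model of the cap. In Apache 2.4 MaxClients is named MaxRequestWorkers (the old name is still accepted). The default values below are taken from the Apache 2.4 MPM documentation. The function `effectiveMaxClients` is a hypothetical illustration that leaves out rounding rules and other MPM details, and which MPM the server in question actually uses is not stated in the post.

```javascript
// Simplified model of how Apache caps simultaneous connections.
// Defaults follow the Apache 2.4 documentation; rounding rules and
// other MPM details are deliberately left out.
function effectiveMaxClients({ mpm, maxRequestWorkers, serverLimit, threadsPerChild = 25 }) {
  if (mpm === "prefork") {
    // One child process per connection; ServerLimit defaults to 256.
    const limit = serverLimit ?? 256;
    return Math.min(maxRequestWorkers ?? 256, limit);
  }
  // worker / event: threads spread over processes; ServerLimit defaults to 16.
  const ceiling = (serverLimit ?? 16) * threadsPerChild;
  return Math.min(maxRequestWorkers ?? ceiling, ceiling);
}

console.log(effectiveMaxClients({ mpm: "prefork" })); // 256
console.log(effectiveMaxClients({ mpm: "prefork", maxRequestWorkers: 1000 })); // 256
console.log(effectiveMaxClients({ mpm: "prefork", maxRequestWorkers: 1000, serverLimit: 1000 })); // 1000
console.log(effectiveMaxClients({ mpm: "event" })); // 400
```

The second call shows why raising MaxRequestWorkers alone is not enough with prefork: ServerLimit has to be raised along with it. Each proxied WebSocket holds one of these slots for as long as it stays open, which would be consistent with the stall at about 256 described in the question.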