PM2 not restarting clusters on errors
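When I use PM2 to run multiple processes (i.e. cluster mode) and one of them hits an uncaught error, PM2 does not restart that process.
Why?
How do I get it to restart workers in cluster mode?
Sample code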
// index.js
let counter = 0;
setInterval(function () {
  // After five ticks (about 2.5 seconds), crash with an uncaught error.
  if (counter >= 5) {
    throw new Error('Worker crash. Why no restart?');
  }
  counter++;
  console.log('Worker alive: ' + Date.now());
}, 500);
Run on the command line
pm2 start index.js -i 4
pm2 log
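Eventually all of the workers crash and are never restarted.
What is the point of automatic restarts if they only work for a single process?
PM2 log (merged into one file)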
Worker alive: 1522937847186
Worker alive: 1522937847231
Worker alive: 1522937847276
Worker alive: 1522937847324
Worker alive: 1522937847691
Worker alive: 1522937847736
Worker alive: 1522937847781
Worker alive: 1522937847830
Worker alive: 1522937848193
Worker alive: 1522937848238
Worker alive: 1522937848283
Worker alive: 1522937848332
Worker alive: 1522937848693
Worker alive: 1522937848738
Worker alive: 1522937848783
Worker alive: 1522937848832
Worker alive: 1522937849194
Worker alive: 1522937849238
Worker alive: 1522937849284
Worker alive: 1522937849333
Error: Worker crash. Why no restart?
    at Timeout._onTimeout (/home/usrname/docs/Projects_NodeJS/project/app/index.js:49:11)
    at ontimeout (timers.js:466:11)
    at tryOnTimeout (timers.js:304:5)
    at Timer.listOnTimeout (timers.js:267:5)
Error: Worker crash. Why no restart?
    at Timeout._onTimeout (/home/usrname/docs/Projects_NodeJS/project/app/index.js:49:11)
    at ontimeout (timers.js:466:11)
    at tryOnTimeout (timers.js:304:5)
    at Timer.listOnTimeout (timers.js:267:5)
Error: Worker crash. Why no restart?
    at Timeout._onTimeout (/home/usrname/docs/Projects_NodeJS/project/app/index.js:49:11)
    at ontimeout (timers.js:466:11)
    at tryOnTimeout (timers.js:304:5)
    at Timer.listOnTimeout (timers.js:267:5)
Error: Worker crash. Why no restart?
    at Timeout._onTimeout (/home/usrname/docs/Projects_NodeJS/project/app/index.js:49:11)
    at ontimeout (timers.js:466:11)
    at tryOnTimeout (timers.js:304:5)
    at Timer.listOnTimeout (timers.js:267:5)
You can try the code below. Let me know if it helps.
const cluster = require('cluster');
const numOfCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numOfCPUs; i++) {
    cluster.fork();
  }

  // Fork a replacement whenever a worker exits.
  cluster.on('exit', (worker, code, signal) => {
    console.log('worker %d died (%s). restarting...',
      worker.process.pid, signal || code);
    cluster.fork();
  });
}
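The snippet above only covers the master side. As a rough sketch of how the pieces might fit together, the version below adds a worker branch that runs the crash loop from the question, so the restart can be observed. The helper names startPrimary and startWorker, the clusterApi parameter, and the RUN_CLUSTER_DEMO environment variable are all made up for this illustration; this manages its own workers with Node's cluster module and is not how PM2 does it internally.

```javascript
const cluster = require('cluster');
const os = require('os');

// Newer Node versions expose cluster.isPrimary; older ones (as in this thread) use isMaster.
const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;

// Primary side: fork `workerCount` workers and replace any worker that exits.
// `clusterApi` is passed in (an assumption of this sketch) so the logic can be tried with a stub.
function startPrimary(clusterApi, workerCount) {
  for (let i = 0; i < workerCount; i++) {
    clusterApi.fork();
  }
  clusterApi.on('exit', (worker, code, signal) => {
    console.log('worker %d died (%s). restarting...', worker.process.pid, signal || code);
    clusterApi.fork();
  });
}

// Worker side: the crash loop from the question.
function startWorker() {
  let counter = 0;
  setInterval(() => {
    if (counter >= 5) {
      throw new Error('Worker crash. Why no restart?');
    }
    counter++;
    console.log('Worker %d alive: %d', process.pid, Date.now());
  }, 500);
}

// Hypothetical opt-in switch so that merely loading this file has no side effects.
if (process.env.RUN_CLUSTER_DEMO === '1') {
  if (isPrimary) {
    console.log(`Master ${process.pid} is running`);
    startPrimary(cluster, os.cpus().length);
  } else {
    startWorker();
  }
}
```

Started with RUN_CLUSTER_DEMO=1 node index.js, each worker should crash after a few ticks and the exit handler should fork a replacement. Forked workers inherit the environment variable, so they take the worker branch.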
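Downgrading to Node version 8 LTS seems to have solved the problem.
I had Node version 9 installed, and the problem showed up on both Windows and Ubuntu, but after downgrading to version 8 everything worked fine.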