How do I create a monit loop to monitor multiple processes?
This example shows how to monitor a single resque queue:
check process resque_worker_QUEUE
with pidfile /data/APP_NAME/current/tmp/pids/resque_worker_QUEUE.pid
start program = "/usr/bin/env HOME=/home/user RACK_ENV=production PATH=/usr/local/bin:/usr/local/ruby/bin:/usr/bin:/bin:$PATH /bin/sh -l -c 'cd /data/APP_NAME/current; nohup bundle exec rake environment resque:work RAILS_ENV=production QUEUE=queue_name VERBOSE=1 PIDFILE=tmp/pids/resque_worker_QUEUE.pid >> log/resque_worker_QUEUE.log 2>&1'" as uid deploy and gid deploy
stop program = "/bin/sh -c 'cd /data/APP_NAME/current && kill -9 $(cat tmp/pids/resque_worker_QUEUE.pid) && rm -f tmp/pids/resque_worker_QUEUE.pid; exit 0;'"
if totalmem is greater than 300 MB for 10 cycles then restart # eating up memory?
group resque_workers
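Here QUEUE is typically the index of the queue. Does monit itself have the ability to create a loop, so that QUEUE can be an index or iterator? That way, if I have 6 workers to create, I could still keep a single block of configuration code. Or do I have to write a monit config builder that iterates and outputs a set of hard-coded worker monitors?
So instead of: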
check process resque_worker_0
with pidfile /data/APP_NAME/current/tmp/pids/resque_worker_0.pid
start program = "/usr/bin/env HOME=/home/user RACK_ENV=production PATH=/usr/local/bin:/usr/local/ruby/bin:/usr/bin:/bin:$PATH /bin/sh -l -c 'cd /data/APP_NAME/current; nohup bundle exec rake environment resque:work RAILS_ENV=production QUEUE=queue_name VERBOSE=1 PIDFILE=tmp/pids/resque_worker_0.pid >> log/resque_worker_0.log 2>&1'" as uid deploy and gid deploy
stop program = "/bin/sh -c 'cd /data/APP_NAME/current && kill -9 $(cat tmp/pids/resque_worker_0.pid) && rm -f tmp/pids/resque_worker_0.pid; exit 0;'"
if totalmem is greater than 300 MB for 10 cycles then restart # eating up memory?
group resque_workers
check process resque_worker_1
with pidfile /data/APP_NAME/current/tmp/pids/resque_worker_1.pid
start program = "/usr/bin/env HOME=/home/user RACK_ENV=production PATH=/usr/local/bin:/usr/local/ruby/bin:/usr/bin:/bin:$PATH /bin/sh -l -c 'cd /data/APP_NAME/current; nohup bundle exec rake environment resque:work RAILS_ENV=production QUEUE=queue_name VERBOSE=1 PIDFILE=tmp/pids/resque_worker_1.pid >> log/resque_worker_1.log 2>&1'" as uid deploy and gid deploy
stop program = "/bin/sh -c 'cd /data/APP_NAME/current && kill -9 $(cat tmp/pids/resque_worker_1.pid) && rm -f tmp/pids/resque_worker_1.pid; exit 0;'"
if totalmem is greater than 300 MB for 10 cycles then restart # eating up memory?
group resque_workers
could I do something like this (I know the loop is pseudocode)?
[0..1].each |QUEUE|
check process resque_worker_QUEUE
with pidfile /data/APP_NAME/current/tmp/pids/resque_worker_QUEUE.pid
start program = "/usr/bin/env HOME=/home/user RACK_ENV=production PATH=/usr/local/bin:/usr/local/ruby/bin:/usr/bin:/bin:$PATH /bin/sh -l -c 'cd /data/APP_NAME/current; nohup bundle exec rake environment resque:work RAILS_ENV=production QUEUE=queue_name VERBOSE=1 PIDFILE=tmp/pids/resque_worker_QUEUE.pid >> log/resque_worker_QUEUE.log 2>&1'" as uid deploy and gid deploy
stop program = "/bin/sh -c 'cd /data/APP_NAME/current && kill -9 $(cat tmp/pids/resque_worker_QUEUE.pid) && rm -f tmp/pids/resque_worker_QUEUE.pid; exit 0;'"
if totalmem is greater than 300 MB for 10 cycles then restart # eating up memory?
group resque_workers
end
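I could not find any evidence that monit can do this on its own, so I wrote a Ruby builder that generates the monit resque config file and plugged it into a capistrano deploy task.
In config/deploy/production.rb: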
set :resque_worker_count, 6
In lib/capistrano/tasks/monit.rake:
def build_entry(process_name,worker_pid_file,worker_config_file,start_command,stop_command)
  <<-END_OF_ENTRY
  check process #{process_name}
    with pidfile #{worker_pid_file}
    start program = \"#{start_command}\" with timeout 90 seconds
    stop program = \"#{stop_command}\" with timeout 90 seconds
    if totalmem is greater than 500 MB for 4 cycles then restart # eating up memory?
    group resque
  END_OF_ENTRY
end

namespace :monit do
  desc "Build monit configuration file for monitoring resque workers"
  task :build_resque_configuration_file do
    on roles(:app) do |host|
      # Setup the reusable variables across all worker entries
      rails_env = fetch(:rails_env)
      app_name = fetch(:application)
      monit_resque_config_file_path = "#{shared_path}/config/monit/resque"
      resque_control_script = "#{shared_path}/bin/resque-control"
      monit_wrapper_script = "/usr/local/sbin/monit-wrapper"
      config_file_content = []
      (0..((fetch(:resque_worker_count)).to_i - 1)).each do |worker|
        # Setup the variables for the worker entry
        process_name = "resque_#{worker}"
        worker_config_file = "resque_#{worker}.conf"
        worker_pid_file = "/var/run/resque/#{app_name}/resque_#{worker}.pid"
        start_command = "#{monit_wrapper_script} #{resque_control_script} #{app_name} start #{rails_env} #{worker_config_file}"
        stop_command = "#{monit_wrapper_script} #{resque_control_script} #{app_name} stop #{rails_env} #{worker_config_file}"
        # Build the config file entry for the worker
        config_file_content << build_entry(process_name,worker_pid_file,worker_config_file,start_command,stop_command)
      end
      # Save the file locally for inspection (debugging)
      temp_file = "/tmp/#{app_name}_#{rails_env}_resque"
      File.delete(temp_file) if File.exist?(temp_file)
      File.open(temp_file,'w+') {|f| f.write config_file_content.join("\n") }
      # Upload the results to the server
      upload! temp_file, monit_resque_config_file_path
    end
  end
end
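For anyone not using capistrano, the same generate-then-include idea can be sketched as a standalone Ruby script built on ERB from the standard library. This is only a minimal sketch, not the task above: the template constant, the `monit_entry` and `monit_config` helpers, the pid directory, and the control script path and its `start N` / `stop N` arguments are all hypothetical names chosen for illustration.

```ruby
require "erb"

# Hypothetical template for one worker entry. The monit statements mirror the
# ones in the question; the start/stop command arguments are an assumption.
ENTRY_TEMPLATE = ERB.new(<<~'MONIT')
  check process resque_worker_<%= index %>
    with pidfile <%= pid_dir %>/resque_worker_<%= index %>.pid
    start program = "<%= control_script %> start <%= index %>" with timeout 90 seconds
    stop program = "<%= control_script %> stop <%= index %>" with timeout 90 seconds
    if totalmem is greater than 300 MB for 10 cycles then restart
    group resque_workers
MONIT

# Render the entry for a single worker index.
def monit_entry(index, pid_dir:, control_script:)
  ENTRY_TEMPLATE.result_with_hash(
    index: index, pid_dir: pid_dir, control_script: control_script
  )
end

# Render one entry per worker (indices 0 up to worker_count - 1),
# separated by blank lines.
def monit_config(worker_count, pid_dir:, control_script:)
  (0...worker_count).map do |index|
    monit_entry(index, pid_dir: pid_dir, control_script: control_script)
  end.join("\n")
end

if __FILE__ == $PROGRAM_NAME
  # Placeholder paths; adjust to the real deployment layout.
  puts monit_config(6,
                    pid_dir: "/var/run/resque/APP_NAME",
                    control_script: "/usr/local/bin/resque-control")
end
```

Redirecting the output to a file and pulling that file in with monit's `include` statement from the main control file should give the same result as the hard-coded entries, with the worker count living in a single place.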