God monitoring: Delay start after process exit
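I am monitoring a Ruby program with god. When the Ruby program exits, I want god to wait 10 seconds before starting it again. When I use grace, the process is started again immediately after it exits, and god then waits out the 10-second grace period before it looks at the process again. If the process is killed before the grace period is over, god does not pick it up again and the process is never restarted.
I want god to wait 10 seconds after the exit before it runs the start command. How can I do that?
I tried using a transition on the watch's :process_exits condition, but I could not find a way to put the wait time in the right place.
Edit: After looking at the god source, I suspect a possible solution is to add a custom behavior that waits in its before_start method. Does that sound reasonable? (See below.) (End of edit)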
More details:
When I use the grace setting in the watch, I see the following behavior:
INFO: Loading simple.god
INFO: Syslog enabled.
INFO: Using pid file directory: /Users/fsc/.god/pids
INFO: Started on drbunix:///tmp/god.17165.sock
INFO: simple_god move 'unmonitored' to 'init'
DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x007fe134dee140> in 0 seconds
INFO: simple_god moved 'unmonitored' to 'init'
INFO: simple_god [trigger] process is not running (ProcessRunning)
DEBUG: simple_god ProcessRunning [false] {true=>:up, false=>:start}
INFO: simple_god move 'init' to 'start'
INFO: simple_god start: ruby .../simple.rb
DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x007fe134dedb00> in 0 seconds
INFO: simple_god moved 'init' to 'start'
INFO: simple_god [trigger] process is running (ProcessRunning)
DEBUG: simple_god ProcessRunning [true] {true=>:up}
INFO: simple_god move 'start' to 'up'
INFO: simple_god registered 'proc_exit' event for pid 42498
INFO: simple_god moved 'start' to 'up'
Here I kill the process:
INFO: simple_god [trigger] process 42498 exited (ProcessExits)
DEBUG: simple_god ProcessExits [true] {true=>:start}
INFO: simple_god move 'up' to 'start'
INFO: simple_god deregistered 'proc_exit' event for pid 42498
INFO: simple_god start: ruby .../simple.rb
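At this point the grace period begins. The process has already been started, but the god watch waits until the grace period is over before it looks at the process again.
The next log line appears 10 seconds (the grace period) after the last log line above: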
DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x007fe134dedb00> in 0 seconds
INFO: simple_god moved 'up' to 'start'
INFO: simple_god [trigger] process is running (ProcessRunning)
DEBUG: simple_god ProcessRunning [true] {true=>:up}
INFO: simple_god move 'start' to 'up'
INFO: simple_god registered 'proc_exit' event for pid 42501
INFO: simple_god moved 'start' to 'up'
Edit:
Custom behavior:
module God
  module Behaviors
    class WaitBehavior < Behavior
      attr_accessor :delay

      def initialize
        super
        self.delay = 10
      end

      def valid?
        valid = true
        valid
      end

      def before_start
        if delay > 0
          sleep delay
        end
      end

      def test
        true
      end
    end
  end
end
Using the behavior in the .god config:
w.behavior(:wait_behavior)
I think that should work, and the WaitBehavior class can be shorter:
module God
  module Behaviors
    class WaitBehavior < Behavior
      attr_accessor :delay

      def before_start
        sleep delay.to_i if delay.to_i > 0
      end
    end
  end
end
In the .god config:
# .god
w.behavior(:wait_behavior) do |b|
  b.delay = 10
end
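The `delay.to_i > 0` guard is what lets the shorter class drop its initializer: an unset `delay` is `nil`, `nil.to_i` is 0, and so nothing sleeps. Below is a minimal sketch of that logic that runs without god installed. `StubBehavior`, `WaitBehaviorSketch`, and the return value are illustrative stand-ins, not part of god.

```ruby
# Hypothetical stand-in for God::Behavior so this runs without the god gem.
class StubBehavior; end

class WaitBehaviorSketch < StubBehavior
  attr_accessor :delay

  # Sleeps for `delay` seconds when it is a positive number and returns the
  # number of seconds waited (the return value is added only for this sketch).
  def before_start
    seconds = [delay.to_i, 0].max
    sleep seconds if seconds > 0
    seconds
  end
end

b = WaitBehaviorSketch.new
puts b.before_start   # delay is nil, so there is no sleep
b.delay = "1"         # to_i also coerces numeric strings
puts b.before_start   # sleeps for one second
```

Coercing with `to_i` means a missing or malformed delay degrades to "start immediately" rather than raising inside the hook.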
Another way
Similar to WaitBehavior, we can define a StateFileBehavior that touches a file in after_stop:
require 'fileutils'

module God
  module Behaviors
    class StateFileBehavior < Behavior
      attr_accessor :file

      def after_stop
        FileUtils.touch file
      end
    end
  end
end
And in the .god config:
# .god
stop_timestamp_file = '/path/to/file'

w.behavior(:state_file_behavior) do |b|
  b.file = stop_timestamp_file
end

w.start_if do |on|
  on.condition(:file_mtime) do |c|
    c.interval = 2
    c.path = stop_timestamp_file
    c.max_age = 10
  end
end
Note: the second approach cannot be used together with w.keepalive.
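As far as I understand it, the `file_mtime` condition is an age check: it fires once the timestamp file has gone untouched for more than `max_age` seconds. The sketch below reproduces that check in plain Ruby so the timing can be tried without god. `older_than?` is a hypothetical helper, not a god API, and the comparison is my assumption about how the condition behaves.

```ruby
require 'tempfile'

# Hypothetical helper: true when `path` was last modified more than
# `max_age` seconds before `now`.
def older_than?(path, max_age, now: Time.now)
  (now - File.mtime(path)) > max_age
end

Tempfile.create('stop_timestamp') do |f|
  now = Time.now
  File.utime(now, now - 3, f.path)         # pretend the stop happened 3 seconds ago
  puts older_than?(f.path, 10, now: now)   # too recent, keep waiting
  File.utime(now, now - 20, f.path)        # pretend the stop happened 20 seconds ago
  puts older_than?(f.path, 10, now: now)   # old enough to start again
end
```

With `c.interval = 2` the check is repeated every two seconds, so the actual restart would land somewhere between 10 and roughly 12 seconds after the stop.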
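I ran into the same problem. Both solutions work, but you can also solve this without modifying god by using a lambda condition. I just created a new :down state: when the process exits I transition to :down, and then use a lambda to delay the start for a while.
To create a new state, you only need to add it to valid_states.
This is what my god file looks like: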
God.watch do |w|
  w.name = "brain"
  w.start = "ruby yourthing.rb"
  w.stop_grace = 30
  w.valid_states = [:init, :up, :start, :restart, :down]
  ...
  # down if process not running
  w.transition(:up, :down) do |on|
    on.condition(:process_exits)
  end

  # delay when down and move to start
  w.transition(:down, :start) do |on|
    on.condition(:lambda) do |c|
      c.lambda = lambda do
        puts "process exited, sleep 30 seconds"
        sleep 30
        true
      end
    end
  end
end
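The lambda in this config blocks inside `sleep`. If that blocking is a concern, one alternative is a lambda that remembers when it was first called and returns true only once the delay has elapsed. This is a sketch of that idea in plain Ruby, not something taken from god. `delayed_gate` and its `clock` parameter are hypothetical names, and it assumes the condition is polled repeatedly while the watch sits in :down.

```ruby
# Hypothetical helper: builds a lambda that returns false until `delay`
# seconds have passed since its first call, then returns true once and
# re-arms itself for the next exit.
def delayed_gate(delay, clock: -> { Time.now })
  first_seen = nil
  lambda do
    now = clock.call
    first_seen ||= now
    if now - first_seen >= delay
      first_seen = nil   # re-arm for the next time the process exits
      true
    else
      false
    end
  end
end

# Drive the gate with a fake clock instead of waiting in real time.
fake_now = Time.at(0)
gate = delayed_gate(30, clock: -> { fake_now })
puts gate.call   # first poll starts the countdown
fake_now += 29
puts gate.call   # still inside the delay
fake_now += 1
puts gate.call   # 30 seconds have elapsed
```

The result of `delayed_gate(30)` could then be assigned to `c.lambda` in place of the sleeping lambda, under the polling assumption stated above.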