God monitoring: Delay start after process exit

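I am monitoring a Ruby program with god. When the Ruby program exits, I want to wait 10 seconds before it is started again. When I use grace, the process is started again immediately after it exits, but god waits for the 10-second grace period before it looks at the process again. If the process is killed before the grace period has ended, god does not pick it up again and the process is never restarted.

I want god to wait 10 seconds after the exit before it runs the start command. How can I do that?

I tried using a transition on :process_exits in the watch, but I am having a hard time finding a way to set the wait time in the right place.

Edit: After looking through god's source, I suspect that a possible solution would be to add a custom behavior that waits in its before_start method. Does that sound reasonable? (See below.) (End of edit)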


More details:

When I use the grace feature in the watch, I get the following behavior:

 INFO: Loading simple.god
 INFO: Syslog enabled.
 INFO: Using pid file directory: /Users/fsc/.god/pids
 INFO: Started on drbunix:///tmp/god.17165.sock
 INFO: simple_god move 'unmonitored' to 'init'
DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x007fe134dee140> in 0 seconds
 INFO: simple_god moved 'unmonitored' to 'init'
 INFO: simple_god [trigger] process is not running (ProcessRunning)
DEBUG: simple_god ProcessRunning [false] {true=>:up, false=>:start}
 INFO: simple_god move 'init' to 'start'
 INFO: simple_god start: ruby .../simple.rb
DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x007fe134dedb00> in 0 seconds
 INFO: simple_god moved 'init' to 'start'
 INFO: simple_god [trigger] process is running (ProcessRunning)
DEBUG: simple_god ProcessRunning [true] {true=>:up}
 INFO: simple_god move 'start' to 'up'
 INFO: simple_god registered 'proc_exit' event for pid 42498
 INFO: simple_god moved 'start' to 'up'

Here I kill the process:

 INFO: simple_god [trigger] process 42498 exited (ProcessExits)
DEBUG: simple_god ProcessExits [true] {true=>:start}
 INFO: simple_god move 'up' to 'start'
 INFO: simple_god deregistered 'proc_exit' event for pid 42498
 INFO: simple_god start: ruby .../simple.rb

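At this point the grace period starts. The process has already been started again, but god waits until the grace period is over before it looks at the process.

The next log lines appear 10 seconds (the grace period) after the last log line above: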

DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x007fe134dedb00> in 0 seconds
 INFO: simple_god moved 'up' to 'start'
 INFO: simple_god [trigger] process is running (ProcessRunning)
DEBUG: simple_god ProcessRunning [true] {true=>:up}
 INFO: simple_god move 'start' to 'up'
 INFO: simple_god registered 'proc_exit' event for pid 42501
 INFO: simple_god moved 'start' to 'up'

Edit:

Custom behavior:

module God
  module Behaviors

    class WaitBehavior < Behavior
      attr_accessor :delay

      def initialize
        super
        self.delay = 10
      end

      def valid?
        valid = true
        valid
      end

      def before_start
        if delay>0 then
          sleep delay
        end
      end

      def test
        true
      end
    end
  end
end

Using the behavior in the .god config:

w.behavior(:wait_behavior)
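
The symbol passed to w.behavior refers to the behavior class by its snake_case name (:wait_behavior for WaitBehavior here, and :state_file_behavior for StateFileBehavior further down). A minimal sketch of that naming convention in plain Ruby follows; `behavior_class_name` is a hypothetical helper for illustration and is not part of god's API, and how god performs the lookup internally is not shown here.

```ruby
# Hypothetical helper, not god's API: converts a snake_case symbol
# into the CamelCase class name it is expected to refer to.
def behavior_class_name(kind)
  kind.to_s.split('_').map(&:capitalize).join
end

wait_name  = behavior_class_name(:wait_behavior)
state_name = behavior_class_name(:state_file_behavior)

puts wait_name
puts state_name
```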

I think it should work. The WaitBehavior class can be shorter:

module God
  module Behaviors
    class WaitBehavior < Behavior
      attr_accessor :delay

      def before_start
        sleep delay.to_i if delay.to_i > 0
      end
    end
  end
end

In the .god config:

# .god
w.behavior(:wait_behavior) do |b|
  b.delay = 10
end
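
To try the hook logic without a running god daemon, here is a minimal sketch. It assumes a hypothetical stand-in base class `FakeBehavior` (not god's real Behavior class) and records the requested pause instead of actually sleeping. It only illustrates how the `delay.to_i` guard treats an unset delay.

```ruby
# Hypothetical stand-in for god's Behavior base class,
# so this sketch runs without the god gem installed.
class FakeBehavior
end

class WaitBehaviorSketch < FakeBehavior
  attr_accessor :delay
  attr_reader :slept

  def initialize
    @slept = []
  end

  # Same guard as in the answer: nil.to_i is 0, so an unset delay is skipped.
  def before_start
    pause(delay.to_i) if delay.to_i > 0
  end

  private

  # Records the requested pause instead of calling Kernel#sleep,
  # which keeps the sketch fast to run.
  def pause(seconds)
    @slept << seconds
  end
end

unset = WaitBehaviorSketch.new
unset.before_start

configured = WaitBehaviorSketch.new
configured.delay = 10
configured.before_start

puts unset.slept.inspect
puts configured.slept.inspect
```

With no delay set, nothing is recorded; with `delay = 10`, a single 10-second pause is requested.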

Another way

Similar to WaitBehavior, we can define a StateFileBehavior that touches a file in after_stop.

require 'fileutils'

module God
  module Behaviors
    class StateFileBehavior < Behavior
      attr_accessor :file

      def after_stop
        FileUtils.touch file
      end
    end
  end
end

And in the .god config:

# .god
stop_timestamp_file = '/path/to/file'

w.behavior(:state_file_behavior) do |b|
  b.file = stop_timestamp_file
end

w.start_if do |on|
  on.condition(:file_mtime) do |c|
    c.interval = 2
    c.path = stop_timestamp_file
    c.max_age = 10
  end
end

Note: the second way does not work together with w.keepalive.
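
The idea behind the second approach is that the start condition fires only once the timestamp file is older than max_age seconds. A minimal plain-Ruby sketch of that age check follows, assuming the condition compares the file's modification time with the current time. The helper `older_than?` is hypothetical and not part of god, and the check times are simulated rather than waited for.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: true once the file at `path` was last modified
# more than `max_age` seconds before `now`.
def older_than?(path, max_age, now = Time.now)
  (now - File.mtime(path)) > max_age
end

just_stopped, after_delay = Dir.mktmpdir do |dir|
  stamp = File.join(dir, 'stopped_at')
  FileUtils.touch(stamp)            # what after_stop does in StateFileBehavior
  touched_at = File.mtime(stamp)

  # Simulate checks 1 second and 11 seconds after the touch.
  [older_than?(stamp, 10, touched_at + 1),
   older_than?(stamp, 10, touched_at + 11)]
end

puts just_stopped
puts after_delay
```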

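I ran into the same problem. Both solutions work, but you can also solve this without modifying god by using a lambda condition. I just created a new :down state: when the process exits, I transition to :down and then use a lambda to delay the start for a while.

To create a new state, you just need to add it to valid_states.

This is what my god file looks like: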

God.watch do |w|
  w.name = "brain"
  w.start = "ruby yourthing.rb"
  w.stop_grace = 30
  w.valid_states = [:init, :up, :start, :restart, :down]
...

  # down if process not running
  w.transition(:up, :down) do |on|
    on.condition(:process_exits) 
  end

  # delay when down and move to start
  w.transition(:down, :start) do |on|
    on.condition(:lambda) do |c|
      c.lambda = lambda do
        puts "process exited, sleeping 30 seconds"
        sleep 30
        true
      end
    end
  end
end
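
Sleeping inside the lambda blocks the thread that evaluates the condition for the full 30 seconds. If that is a concern, one possible variation is a lambda that returns false until enough time has passed since it was first polled. This is a plain-Ruby sketch only: `delayed_trigger` and the injectable `clock` are hypothetical names, and using it as `c.lambda` in a watch is an assumption that has not been tested against god here.

```ruby
# Hypothetical helper: builds a lambda that reports true only after
# `delay` seconds have passed since it was first called, then resets.
def delayed_trigger(delay, clock: -> { Time.now })
  first_seen = nil
  lambda do
    now = clock.call
    first_seen ||= now
    if now - first_seen >= delay
      first_seen = nil   # reset so the next exit waits again
      true
    else
      false
    end
  end
end

# Simulated polling with a fake clock instead of real waiting.
fake_now = Time.at(0)
trigger = delayed_trigger(30, clock: -> { fake_now })

results = []
results << trigger.call   # first poll, t = 0
fake_now += 10
results << trigger.call   # t = 10
fake_now += 20
results << trigger.call   # t = 30

puts results.inspect
```

In the watch this would presumably be assigned as `c.lambda = delayed_trigger(30)`, with the condition polled at a short interval so the delay is checked repeatedly instead of slept through.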