Cygnus shuts itself down
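I ran some system checks over the weekend and found that Cygnus had shut itself down, yet there are no error messages in the log file.
Francisco, could you share your thoughts with us?
Many thanks.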
Starting an ordered shutdown of Cygnus
Stopping sources
Starting an ordered shutdown of Cygnus
Stopping sources
Stopping http-source (lyfecycle state=START)
16/05/29 02:58:02 INFO lifecycle.LifecycleSupervisor: Stopping component: EventDrivenSourceRunner: { source:org.apache.flume.source.http.HTTPSource{name:http-source,state:START} }
16/05/29 02:58:02 INFO mortbay.log: Stopped SocketConnector@0.0.0.0:5050
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: http-source stopped
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. source.start.time == 1464330902578
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. source.stop.time == 1464490683015
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. src.append-batch.accepted == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. src.append-batch.received == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. src.append.accepted == 0
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. src.append.received == 0
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. src.events.accepted == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. src.events.received == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: http-source. src.open-connection.count == 0
16/05/29 02:58:03 INFO http.HTTPSource: Http source http-source stopped. Metrics: SOURCE:http-source{src.events.accepted=43990, src.events.received=43990, src.append.accepted=0, src.append-batch.accepted=43990, src.open-connection.count=0, src.append-batch.received=43990, src.append.received=0}
All the channels are empty
Stopping channels
Stopping ckan-channel (lyfecycle state=START)
16/05/29 02:58:03 INFO lifecycle.LifecycleSupervisor: Stopping component: org.apache.flume.channel.MemoryChannel{name: ckan-channel}
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ckan-channel stopped
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.start.time == 1464330902110
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.stop.time == 1464490683353
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.capacity == 1000
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.current.size == 0
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.event.put.attempt == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.event.put.success == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.event.take.attempt == 74296
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: ckan-channel. channel.event.take.success == 43990
Stopping hdfs-channel (lyfecycle state=START)
16/05/29 02:58:03 INFO lifecycle.LifecycleSupervisor: Stopping component: org.apache.flume.channel.MemoryChannel{name: hdfs-channel}
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: hdfs-channel stopped
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.start.time == 1464330902110
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.stop.time == 1464490683353
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.capacity == 1000
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.current.size == 0
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.event.put.attempt == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.event.put.success == 43990
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.event.take.attempt == 67985
16/05/29 02:58:03 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: hdfs-channel. channel.event.take.success == 43990
Stopping sinks
Stopping ckan-sink (lyfecycle state=START)
16/05/29 02:58:03 INFO lifecycle.LifecycleSupervisor: Stopping component: SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@2c5d7ace counterGroup:{ name:null counters:{runner.backoffs.consecutive=1, runner.backoffs=30324} } }
Stopping hdfs-sink (lyfecycle state=START)
16/05/29 02:58:03 INFO lifecycle.LifecycleSupervisor: Stopping component: SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@2d298123 counterGroup:{ name:null counters:{runner.backoffs.consecutive=1, runner.backoffs=24009} } }
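Cygnus performs an internal check that looks for abnormal thread termination, including the Ctrl+C key combination. When that happens, it shuts down. The relevant code can be seen here.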
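A flag for enabling/disabling this feature would probably be useful, but nothing like that exists at the moment (I'll add it in the next release ;)). Alternatively, you could write a monit process that detects a Cygnus shutdown and restarts it automatically.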
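To make the detect-and-restart idea concrete, here is a minimal watchdog sketch in Python. It is not a monit configuration and not part of Cygnus; it only illustrates the loop such a monitor would run. The names `is_running` and `watch`, the 30-second interval, and the use of `pgrep -f` to look for the process are all assumptions made for illustration.

```python
import subprocess
import time
from typing import Callable, Optional


def is_running(pattern: str) -> bool:
    """Return True if `pgrep -f` finds a process whose command line matches `pattern`.

    Assumption: `pgrep` is available on the host. Choose a pattern that does not
    also match this watchdog's own command line.
    """
    result = subprocess.run(
        ["pgrep", "-f", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(
    check: Callable[[], bool],
    restart: Callable[[], None],
    interval_s: float = 30.0,
    max_checks: Optional[int] = None,
) -> int:
    """Poll `check`; whenever it reports the process as down, call `restart`.

    Runs forever when `max_checks` is None, otherwise stops after that many
    polls. Returns the number of restarts that were triggered.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if not check():
            restart()
            restarts += 1
        checks += 1
        time.sleep(interval_s)
    return restarts
```

In a real deployment, `check` could be something like `lambda: is_running("cygnus")` and `restart` a call to whatever command starts Cygnus on your host; both depend on how Cygnus was installed, so treat them as placeholders.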
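Such a monit could be combined with a high-availability (HA) architecture by means of specialized software (e.g. Pacemaker; a load balancer may also be required) in order to have an active/passive pair of Cygnus instances. This means the active Cygnus works as usual, while the passive Cygnus only starts working when a problem is detected in the active one. The specialized software then redirects all traffic to the passive Cygnus while the active Cygnus is restarted (by monit).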