Why is the watchdog not kicking?
I am trying to configure a watchdog on CoreOS.
The service unit looks like this:
[Unit]
Description=Watchdog example service
[Service]
Type=notify
Environment=NOTIFY_SOCKET=/run/%p.sock
Environment=WATCHDOG_USEC=1000000
ExecStartPre=-/usr/bin/docker kill %p
ExecStartPre=-/usr/bin/docker rm %p
ExecStart=/usr/libexec/sdnotify-proxy /run/%p.sock /usr/bin/docker run \
--env=NOTIFY_SOCKET=/run/%p.sock \
-v /run:/run \
--name %p pranav93/test_watchdogged python hello.py
ExecStop=/usr/bin/docker stop %p
WatchdogSec=1
[Install]
WantedBy=multi-user.target
The Python file hello.py looks
something like this:
# sd_notifyd() is defined elsewhere (not shown).
import os
import time

print 'Hello, in hello.py'
print 'ready sending'
x = sd_notifyd({'READY':1})
print str(x)
print 'watchdog sending'
x = sd_notifyd({'WATCHDOG':1})
print str(x)
print os.environ.get('WATCHDOG_USEC', None)
print 'lol, wait now for sometime'
for i in range(3):
    print i
    time.sleep(1)
print 'finished'
Even though I am not sending WATCHDOG=1 pings to systemd, the service is not being stopped by it and is not moved to the 'failed' state. What is the reason behind this?
The logs are:
Oct 06 09:33:19 core-01 systemd[1]: Starting Watchdog example service...
Oct 06 09:33:19 core-01 docker[2779]: watchdogged
Oct 06 09:33:19 core-01 docker[2790]: watchdogged
Oct 06 09:33:19 core-01 sdnotify-proxy[2800]: True
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: ready sending
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: <socket._socketobject object at 0x7fa3cc3c2440>
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: 1
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: watchdog sending
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: <socket._socketobject object at 0x7fa3cc3c2440>
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: 1
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: None
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: lol, wait now for someyime
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: 0
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: 1
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: 2
Oct 06 09:33:22 core-01 sdnotify-proxy[2800]: finished
Oct 06 09:33:22 core-01 docker[2851]: watchdogged
Oct 06 09:33:22 core-01 systemd[1]: Started Watchdog example service.
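I notice a couple of things. First, the "Started Watchdog example service." line arrives 3 seconds late, after the program has already exited, which indicates that READY=1 was not received. Watchdog monitoring only begins once the unit is "started".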
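The sd_notifyd helper in the question is not shown, so the following is only a minimal sketch of what such a helper usually does, assuming Python 3 and the datagram protocol used by sd_notify: one message such as READY=1 is sent to the Unix socket named in NOTIFY_SOCKET. The function name sd_notify and its environ parameter are illustrative, not taken from the question.

```python
import os
import socket


def sd_notify(state, environ=os.environ):
    """Send one notification such as "READY=1" to $NOTIFY_SOCKET.

    Returns True if a datagram was sent, False if NOTIFY_SOCKET is unset.
    (Hypothetical helper; the question's sd_notifyd may differ.)
    """
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.sendto(state.encode("ascii"), path)
    finally:
        sock.close()
    return True
```

If a call like this returns False inside the container, NOTIFY_SOCKET did not reach the process, which would match the late "Started" line in the log.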
Also, try logging with print >>sys.stderr, because output on stdout is buffered, which makes the timing hard to see.
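As a rough illustration of the logging advice, here is a small sketch, assuming Python 3 (the question's script is Python 2). The helper name log is made up for this example. It writes to stderr and flushes on every call, so the journal timestamps should reflect when each message was actually produced.

```python
import sys
import time


def log(message):
    """Write a timestamped line to stderr and flush it immediately."""
    # flush=True avoids output sitting in a buffer until the process exits.
    print("%.3f %s" % (time.time(), message), file=sys.stderr, flush=True)
```

With the Python 2 script from the question, running the interpreter with the -u option is another way to get unbuffered output.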
You should not set
Environment=NOTIFY_SOCKET=/run/%p.sock
Environment=WATCHDOG_USEC=1000000
because these are set by systemd. You should pass the proxy socket and WATCHDOG_USEC to the container via --env, because otherwise they will be "lost":
ExecStart=/usr/libexec/sdnotify-proxy /run/%p.sock /usr/bin/docker run \
--env=NOTIFY_SOCKET=/run/%p.sock --env=WATCHDOG_USEC=1000000
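Once WATCHDOG_USEC is visible inside the container, the script can derive how often to send WATCHDOG=1. The sketch below assumes Python 3; the function name watchdog_interval is hypothetical. It follows the usual recommendation of pinging at half the configured timeout.

```python
def watchdog_interval(environ):
    """Return a ping interval in seconds, or None if no watchdog is configured."""
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    # Ping at half the timeout so one slightly late ping does not trip the watchdog.
    return int(usec) / 1_000_000 / 2
```

With WatchdogSec=1 as in the unit above, this gives 0.5 seconds, so a loop that sleeps for 1 second between pings would be too slow to keep the service alive.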