Is there a deadlock in my simple loop code?

Question

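I have a microservice that needs to run a job only when a different server is up. For several weeks it worked fine: if the server was down, the microservice slept for a while without doing the job (as it should), and if the server was up, the job got done. The server is never down for more than a few minutes (for sure! The server is highly monitored), so the job is skipped 2-3 times at most.

Today I went into my Docker container and noticed in the logs that the job has not even tried to run for a few weeks now (not monitoring it was a bad choice, I know). This suggests, I assume, that some kind of deadlock happened. I also assume the problem is in my exception handling, and since I work alone I could use some advice.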

def is_server_healthy():
    url = "url" #correct url for health check path
    try:
        res = requests.get(url)
    except Exception as ex:
        LOGGER.error(f"Can't health check!{ex}")
    finally:
        pass

    return res

def init():
    while True:
        LOGGER.info(f"Sleeping for {SLEEP_TIME} Minutes")
        time.sleep(SLEEP_TIME*ONE_MINUTE)

        res = is_server_healthy()

        if res.status_code == 200:
            my_api.DoJob()
            LOGGER.info(f"Server is: {res.text}")
        else:
            LOGGER.info(f"Server is down... {res.status_code}")

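(Variable names changed to simplify the question.)

The health check is very simple: it returns "up" if everything is OK. Anything else is considered down, so unless I get status 200 and "up" back, I treat the server as down.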

Answer 1

If your server is down, you get an uncaught error:

NameError: name 'res' is not defined

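Why? When the request raises, res is never assigned, so return res fails. (In your original function, where res is assigned inside the try, Python reports this as UnboundLocalError, which is a subclass of NameError.) See: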

def is_server_healthy():
    url = "don't care"
    try:
        raise Exception()  # simulate fail
    except Exception as ex:
        print(f"Can't health check!{ex}")
    finally:
        pass

    return res   ## name is not known ;o)

res = is_server_healthy()
if res.status_code == 200:   # here, next exception bound to happen
    my_api.DoJob()
    LOGGER.info(f"Server is: {res.text}")
else:
    LOGGER.info(f"Server is down... {res.status_code}")
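
For comparison, here is a minimal sketch that keeps the assignment inside the try, as in the question. The fetch function is a hypothetical stand-in for a requests.get call that fails, and demo is only there to report which exception type comes out.

```python
def fetch():
    # Hypothetical stand-in for requests.get(url) when the server is down.
    raise ConnectionError("server is down")


def is_server_healthy():
    try:
        res = fetch()  # raises, so the assignment never happens
    except Exception as ex:
        print(f"Can't health check! {ex}")
    return res  # res is still unbound on this path


def demo():
    """Return the name of the exception that escapes is_server_healthy()."""
    try:
        is_server_healthy()
    except NameError as err:  # UnboundLocalError is a subclass of NameError
        return type(err).__name__
    return None
```

Because nothing in the loop catches this exception, it propagates out of init() and ends the process.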

Even if you declare the name beforehand (for example as None), the next step accesses an attribute that does not exist:

if res.status_code == 200:   # here - object has no attribute 'status_code'   
    my_api.DoJob()
    LOGGER.info(f"Server is: {res.text}")
else:
    LOGGER.info(f"Server is down... {res.status_code}")

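This tries to access a member that simply does not exist => exception, and the process is gone.

You would be better off using a system-specific way to call your script once every minute (a cron job, Task Scheduler) than idling in a while True: loop with sleep.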
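
One way to avoid both failures is to have the health check return a plain bool and never hand back a response that may not exist. This is a minimal sketch, not your actual code. The http_get parameter is a hypothetical addition so that the function can be tried without a network. The timeout argument is also not in the original; requests applies no timeout by default, so a call without one can block indefinitely. Stripping whitespace from the body is an assumption about what the endpoint returns.

```python
import logging

LOGGER = logging.getLogger(__name__)


def is_server_healthy(url, http_get, timeout=10):
    """Return True only for HTTP 200 with body "up"; never raise."""
    try:
        # In the real service http_get would be requests.get.
        res = http_get(url, timeout=timeout)
    except Exception as ex:  # connection refused, DNS failure, timeout, ...
        LOGGER.error("Can't health check! %s", ex)
        return False  # res is never touched on this path
    return res.status_code == 200 and res.text.strip() == "up"
```

In the loop this would be used as `if is_server_healthy(url, requests.get): my_api.DoJob()`, so the caller never needs res.status_code at all.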
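
A rough sketch of what the scheduled variant could look like: the script does one check and one job per invocation, and the scheduler provides the repetition. The name run_once, the crontab line, and the file path are all hypothetical placeholders. check and job stand for your own health check and my_api.DoJob.

```python
import logging

LOGGER = logging.getLogger(__name__)


def run_once(check, job):
    """Run job() once if check() is truthy. Return a process exit code."""
    try:
        if not check():
            LOGGER.info("Server is down, skipping this run")
            return 0
        job()
        return 0
    except Exception:
        # Log the traceback; the next scheduled run starts a fresh process.
        LOGGER.exception("Run failed")
        return 1


# Hypothetical crontab entry (path and interval are placeholders):
#   * * * * * /usr/bin/python3 /path/to/job.py
# with job.py ending in something like:
#   sys.exit(run_once(health_check, my_api.DoJob))
```

With this design a crash affects only that one run, because the next invocation starts from scratch.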

python

deadlock

while-loop

microservices

docker-container