How to debug intermittent errors from a Django app served with gunicorn (possible race condition)?
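I have a Django app served by nginx + gunicorn with 3 gunicorn worker processes. Occasionally (maybe once every 100 requests or so), one of the worker processes gets into a state where it starts failing most (but not all) of the requests it serves, and then it throws an exception when it tries to email me about the failure. The gunicorn error log looks like this: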
[2015-04-29 10:41:39 +0000] [20833] [ERROR] Error handling request
Traceback (most recent call last):
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/gunicorn/workers/sync.py", line 130, in handle
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/gunicorn/workers/sync.py", line 171, in handle_request
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 206, in __call__
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 196, in get_response
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 226, in handle_uncaught_exception
File "/usr/lib/python2.7/logging/__init__.py", line 1178, in error
File "/usr/lib/python2.7/logging/__init__.py", line 1271, in _log
File "/usr/lib/python2.7/logging/__init__.py", line 1281, in handle
File "/usr/lib/python2.7/logging/__init__.py", line 1321, in callHandlers
File "/usr/lib/python2.7/logging/__init__.py", line 749, in handle
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/utils/log.py", line 122, in emit
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/utils/log.py", line 125, in connection
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/core/mail/__init__.py", line 29, in get_connection
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/utils/module_loading.py", line 26, in import_by_path
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/utils/module_loading.py", line 21, in import_by_path
File "/home/django/virtualenvs/homestead_django/local/lib/python2.7/site-packages/django/utils/importlib.py", line 40, in import_module
ImproperlyConfigured: Error importing module django.core.mail.backends.smtp: "No module named smtp"
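So some uncaught exception is happening, and then Django is trying to email me about it. The fact that it can't import django.core.mail.backends.smtp doesn't make sense, because django.core.mail.backends.smtp should definitely be on the worker process's Python path. I can import it just fine from a manage.py shell, and I do get emails for other server errors (actual software bugs), so I know that path works. It's as if the worker process's environment gets corrupted somehow.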
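To check whether the worker's environment really differs from what a manage.py shell sees, one option is to capture the interpreter's import state at the moment the import fails. The following is only a minimal sketch: the helper name import_diagnostics is hypothetical (it is not part of Django or gunicorn), and it assumes the problem would show up in sys.path, the working directory, or the parent package entries in sys.modules.

```python
import importlib
import os
import sys


def import_diagnostics(module_name):
    """Collect interpreter state relevant to a failing import of module_name.

    Hypothetical helper for illustration; not part of Django or gunicorn.
    """
    info = {
        "pid": os.getpid(),
        "cwd": os.getcwd(),
        "sys_path": list(sys.path),
        "parents": {},
        "error": None,
    }

    # Record what each parent package currently looks like in sys.modules,
    # e.g. django, django.core, django.core.mail, django.core.mail.backends.
    parts = module_name.split(".")
    for i in range(1, len(parts)):
        parent = ".".join(parts[:i])
        mod = sys.modules.get(parent)
        info["parents"][parent] = {
            "loaded": mod is not None,
            "file": getattr(mod, "__file__", None),
            "path": list(getattr(mod, "__path__", []) or []),
        }

    # Retry the import and keep the error text instead of raising.
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        info["error"] = repr(exc)

    return info
```

Calling import_diagnostics("django.core.mail.backends.smtp") from a healthy worker and from a failing one, and writing both results to a log file, would give two snapshots to compare. If they turn out identical, the assumption behind this sketch is wrong and the cause lies elsewhere.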
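Once a worker process enters this state, it has a really hard time recovering; almost every request it serves ends up failing in the same way. If I restart gunicorn, everything is fine (until another worker process falls into this weird state again).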
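I haven't noticed any obvious pattern, so I don't think this is being triggered by a bug in my app (the URLs that error out are different, etc.). It seems like some sort of race condition.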
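Currently I'm using gunicorn's --max-requests option to mitigate the problem, but I'd like to understand what's going on here. Is this a race condition? How can I debug it?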
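For reference, the mitigation can also be written as a gunicorn configuration file, which is plain Python. This is a sketch with assumed values: only the worker count of 3 comes from my setup, while the file name and the two request limits are illustrative. max_requests_jitter is only available in gunicorn versions that support it.

```python
# gunicorn.conf.py (file name and numeric limits are assumptions)

# Three worker processes, as in the setup described above.
workers = 3

# Restart a worker after it has handled this many requests, so a worker
# stuck in the bad state is eventually replaced.
max_requests = 500

# Add a random offset per worker so the workers do not all restart together.
max_requests_jitter = 50
```

This would be loaded with gunicorn's -c option. It only limits how long a broken worker stays alive; it does not explain why the worker breaks.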