Backend docker image does not wait until db becomes available

I'm trying to docker-compose up my containers, one for the backend and another for the database (postgis). If I run docker-compose up db, I see db_1 | 2021-11-23 10:36:02.123 UTC [1] LOG: database system is ready to accept connections, so the database itself works.

But if I docker-compose up the whole project, I get

django.db.utils.OperationalError: could not connect to server: Connection refused
web_1  |        Is the server running on host "db" (172.23.0.2) and accepting
web_1  |        TCP/IP connections on port 5432?

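As far as I understand, this means my backend container does not wait until the database becomes available, so it throws an error. If this idea is correct (is it?), one solution might be to make the backend wait for the database, for example with a wait-for-it.sh script like the one commented out in my docker-compose.yaml below.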
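A script such as wait-for-it.sh (referenced in the commented-out lines of the compose file further down) essentially polls a TCP port until something accepts connections on it. A minimal Python sketch of that idea follows; the function name wait_for_port and its parameters are made up for illustration and are not part of any tool mentioned in this thread.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that the port accepts connections,
    which is the same kind of check a wait-for-it style script performs.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (for example
            # ConnectionRefusedError) while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the compose file from the question, the call would be wait_for_port("db", 5432), run before python manage.py runserver. Note that an open port only shows that something is listening; it does not prove that PostgreSQL is ready to answer queries, which is why a database-level check is usually the more robust option.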

I now have two sub-questions:

  1. Do I understand my problem correctly?
  2. How do I solve it?

Many thanks to anyone who tries to help me!

My files are below:

docker-compose.yaml

version: "3.9"

services:
  db:
    image: postgis/postgis
    volumes:
      - ./data/db:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=postgis
      - POSTGRES_USER=postgis
      - POSTGRES_PASSWORD=postgis
    ports:
      - 5432:5432
      #postgres: 5432

  web:
    build: .

    #command: /wait-for-it.sh db:5432
    #something like command: ["./wait-for-it.sh", "db:5432", "--", "./start.sh"]
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - ./:/usr/src/[projectname-backend]/
    ports:
      - "8000:8000"
    env_file:
      - ./.env.dev
    depends_on:
      - db

volumes:
  db:

Dockerfile

FROM python:3.8.3-alpine

WORKDIR /usr/src/[projectname-backend]

RUN apk update && apk upgrade \
  && apk add postgresql-dev \
    gcc \
    python3-dev \
    musl-dev \
    libffi-dev \
  && apk add --repository http://dl-cdn.alpinelinux.org/alpine/edge/testing \
    gdal-dev \
    geos-dev \
    proj-dev \
  && pip install pipenv

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

RUN pip install --upgrade pip
COPY ./requirements.txt .
RUN pip install -r requirements.txt

COPY . .

Logs

% docker-compose down && docker-compose build && docker-compose up
WARNING: Found orphan containers (lista-backend_nginx_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Removing lista-backend_web_1 ... done
Removing lista-backend_db_1  ... done
Removing network lista-backend_default
db uses an image, skipping
Building web
[+] Building 7.8s (18/18) FINISHED                                                                                         
 => [internal] load build definition from Dockerfile                                                                  0.0s
 => => transferring dockerfile: 37B                                                                                   0.0s
 => [internal] load .dockerignore                                                                                     0.0s
 => => transferring context: 2B                                                                                       0.0s
 => resolve image config for docker.io/docker/dockerfile:1                                                            2.5s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                      0.0s
 => CACHED docker-image://docker.io/docker/dockerfile:1@sha256:42399d4635eddd7a9b8a24be879d2f9a930d0ed040a61324cfdf5  0.0s
 => [internal] load .dockerignore                                                                                     0.0s
 => [internal] load build definition from Dockerfile                                                                  0.0s
 => [internal] load metadata for docker.io/library/python:3.8.3-alpine                                                1.3s
 => [auth] library/python:pull token for registry-1.docker.io                                                         0.0s
 => [1/7] FROM docker.io/library/python:3.8.3-alpine@sha256:c5623df482648cacece4f9652a0ae04b51576c93773ccd43ad459e2a  0.0s
 => [internal] load build context                                                                                     1.0s
 => => transferring context: 18.30MB                                                                                  1.0s
 => CACHED [2/7] WORKDIR /usr/src/LISTA_backend                                                                       0.0s
 => CACHED [3/7] RUN apk update && apk upgrade   && apk add postgresql-dev     gcc     python3-dev     musl-dev       0.0s
 => CACHED [4/7] RUN pip install --upgrade pip                                                                        0.0s
 => CACHED [5/7] COPY ./requirements.txt .                                                                            0.0s
 => CACHED [6/7] RUN pip install -r requirements.txt                                                                  0.0s
 => [7/7] COPY . .                                                                                                    1.6s
 => exporting to image                                                                                                0.9s
 => => exporting layers                                                                                               0.9s
 => => writing image sha256:8a5e13ac74a6184b2be21da4269554fc98c677c9a0ee4c11a8989e9027903fec                          0.0s
 => => naming to docker.io/library/lista-backend_web                                                                  0.0s
Creating network "lista-backend_default" with the default driver
WARNING: Found orphan containers (lista-backend_nginx_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Creating lista-backend_db_1 ... done
Creating lista-backend_web_1 ... done
Attaching to lista-backend_db_1, lista-backend_web_1
web_1  | Watching for file changes with StatReloader
web_1  | Performing system checks...
web_1  | 
web_1  | System check identified some issues:
web_1  | 
web_1  | WARNINGS:
web_1  | api.CustomUser: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
web_1  |        HINT: Configure the DEFAULT_AUTO_FIELD setting or the ApiConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
web_1  | listings.Realty: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
web_1  |        HINT: Configure the DEFAULT_AUTO_FIELD setting or the ListingsConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
web_1  | 
web_1  | System check identified 2 issues (0 silenced).
web_1  | Exception in thread django-main-thread:
web_1  | Traceback (most recent call last):
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
web_1  |     self.connect()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 200, in connect
web_1  |     self.connection = self.get_new_connection(conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
web_1  |     connection = Database.connect(**conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 127, in connect
web_1  |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
web_1  | psycopg2.OperationalError: could not connect to server: Connection refused
web_1  |        Is the server running on host "db" (172.27.0.2) and accepting
web_1  |        TCP/IP connections on port 5432?
web_1  | 
web_1  | 
web_1  | The above exception was the direct cause of the following exception:
web_1  | 
web_1  | Traceback (most recent call last):
web_1  |   File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
web_1  |     self.run()
web_1  |   File "/usr/local/lib/python3.8/threading.py", line 870, in run
web_1  |     self._target(*self._args, **self._kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/autoreload.py", line 64, in wrapper
web_1  |     fn(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/core/management/commands/runserver.py", line 121, in inner_run
web_1  |     self.check_migrations()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 486, in check_migrations
web_1  |     executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/executor.py", line 18, in __init__
web_1  |     self.loader = MigrationLoader(self.connection)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 53, in __init__
web_1  |     self.build_graph()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 220, in build_graph
web_1  |     self.applied_migrations = recorder.applied_migrations()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 77, in applied_migrations
web_1  |     if self.has_table():
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 55, in has_table
web_1  |     with self.connection.cursor() as cursor:
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 259, in cursor
web_1  |     return self._cursor()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 235, in _cursor
web_1  |     self.ensure_connection()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
web_1  |     self.connect()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
web_1  |     raise dj_exc_value.with_traceback(traceback) from exc_value
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
web_1  |     self.connect()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 200, in connect
web_1  |     self.connection = self.get_new_connection(conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
web_1  |     connection = Database.connect(**conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 127, in connect
web_1  |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
web_1  | django.db.utils.OperationalError: could not connect to server: Connection refused
web_1  |        Is the server running on host "db" (172.27.0.2) and accepting
web_1  |        TCP/IP connections on port 5432?
web_1  | 
db_1   | 
db_1   | PostgreSQL Database directory appears to contain a database; Skipping initialization
db_1   | 
db_1   | 2021-11-24 23:19:28.324 UTC [1] LOG:  starting PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
db_1   | 2021-11-24 23:19:28.328 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
db_1   | 2021-11-24 23:19:28.329 UTC [1] LOG:  listening on IPv6 address "::", port 5432
db_1   | 2021-11-24 23:19:28.336 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db_1   | 2021-11-24 23:19:28.364 UTC [66] LOG:  database system was shut down at 2021-11-24 13:55:35 UTC
db_1   | 2021-11-24 23:19:28.400 UTC [1] LOG:  database system is ready to accept connections

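My problem was this: when I add a healthcheck to my database container, I must also remember to add a condition to my depends_on, like this:

    depends_on:
      db:
        condition: service_healthy

as described in the discussion. It works now.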
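For context: the healthcheck itself is defined on the db service (for PostgreSQL images it is commonly a pg_isready call), and condition: service_healthy makes Compose hold back the dependent service until that check passes. The waiting can be pictured with the rough Python sketch below, which polls the status reported by docker inspect. The helper names are hypothetical, and this illustrates the idea rather than how Compose is implemented.

```python
import subprocess
import time


def docker_health_status(container):
    """Ask the docker CLI for a container's health status string.

    Returns "starting", "healthy" or "unhealthy". The call fails with
    CalledProcessError if the container defines no healthcheck.
    """
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_until_healthy(container, get_status=docker_health_status,
                       timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll until the container is 'healthy'; False on 'unhealthy' or timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(container)
        if status == "healthy":
            return True
        if status == "unhealthy":
            return False
        sleep(interval)  # still "starting": check again later
    return False
```

With the container name from the logs above, this would be called as wait_until_healthy("lista-backend_db_1"). In practice the depends_on condition makes such a script unnecessary; the sketch only shows what is being waited for.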

I had the same problem and solved it with a custom management command.

Add this management command to your app:

<django_project>/<app_name>/management/commands/wait_for_db.py

"""
Django management command wait_for_db
"""
import sys
from time import sleep, time

from django.core.management.base import BaseCommand, CommandError
from django.db import DEFAULT_DB_ALIAS, connections
from django.db.utils import OperationalError


def wait_for_database(**opts):
    """
    The main loop waiting for the database connection to come up.
    """
    wait_for_db_seconds = opts['wait_when_down']
    alive_check_delay = opts['wait_when_alive']
    stable_for_seconds = opts['stable']
    timeout_seconds = opts['timeout']
    db_alias = opts['database']

    conn_alive_start = None
    connection = connections[db_alias]
    start = time()

    while True:
        # loop until we have a database connection or we run into a timeout
        while True:
            try:
                connection.cursor().execute('SELECT 1')
                if not conn_alive_start:
                    conn_alive_start = time()
                break
            except OperationalError as err:
                conn_alive_start = None

                elapsed_time = int(time() - start)
                if elapsed_time >= timeout_seconds:
                    raise TimeoutError(
                        'Could not establish database connection.'
                    ) from err

                err_message = str(err).strip()
                print(f'Waiting for database (cause: {err_message}) ... '
                      f'{elapsed_time}s',
                      file=sys.stderr, flush=True)
                sleep(wait_for_db_seconds)

        uptime = int(time() - conn_alive_start)
        print(f'Connection alive for > {uptime}s', flush=True)

        if uptime >= stable_for_seconds:
            break

        sleep(alive_check_delay)


class Command(BaseCommand):
    """
    A readiness probe you can use for Kubernetes.
    If the database is ready, i.e. willing to accept connections
    and handling requests, then this call will exit successfully. Otherwise
    the command exits with an error status after reaching a timeout.
    """
    help = 'Probes for database availability'
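    # Django 3.2+ expects a list or tuple of check tags here (an empty list
    # skips all system checks); the boolean form is deprecated.
    requires_system_checks = []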

    def add_arguments(self, parser):
        parser.add_argument('--timeout', '-t', type=int, default=180,
                            metavar='SECONDS', action='store',
                            help='how long to wait for the database before '
                                 'timing out (seconds), default: 180')
        parser.add_argument('--stable', '-s', type=int, default=5,
                            metavar='SECONDS', action='store',
                            help='how long to observe whether connection '
                                 'is stable (seconds), default: 5')
        parser.add_argument('--wait-when-down', '-d', type=int, default=2,
                            metavar='SECONDS', action='store',
                            help='delay between checks when database is '
                                 'down (seconds), default: 2')
        parser.add_argument('--wait-when-alive', '-a', type=int, default=1,
                            metavar='SECONDS', action='store',
                            help='delay between checks when database is '
                                 'up (seconds), default: 1')
        parser.add_argument('--database', default=DEFAULT_DB_ALIAS,
                            action='store', dest='database',
                            help='which database of `settings.DATABASES` '
                                 'to wait for. Defaults to the "default" '
                                 'database.')

    def handle(self, *args, **options):
        """
        Wait for a database connection to come up. Exit with error
        status when a timeout threshold is surpassed.
        """
        try:
            wait_for_database(**options)
        except TimeoutError as err:
            raise CommandError(err) from err

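In your docker-compose.yml, run wait_for_db before migrating. Chain the steps with && so that each step only runs if the previous one succeeded (on Alpine-based images, which do not ship bash by default, use sh -c instead):

command: bash -c "python manage.py wait_for_db && python manage.py migrate && <your server command>"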

Example docker-compose.yml

version: "3.8"
   
services:
  db:
    image: postgres

  django:
    build: django
    command: bash -c "python manage.py wait_for_db && python manage.py migrate && daphne -b 0.0.0.0 -p 8001 demo.asgi:application"
    volumes:
      - ./django:/workdir
    expose:
      - 29000 # UWSGI application
      - 8001  # ASGI application
    depends_on:
      - db
    stdin_open: true
    tty: true
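The command in this compose file runs three steps in order: wait for the database, apply migrations, start the server. As an alternative to a long shell string, the same sequence could live in a small Python entrypoint that stops at the first failing step, so the server never starts if the database did not come up. This is only a sketch: run_steps, STARTUP_STEPS and the idea of an entrypoint.py file are assumptions for illustration, not part of the answer above.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order; return the exit code of the first failure.

    Returns 0 if every step succeeded. Steps after a failing one are skipped.
    """
    for step in steps:
        code = subprocess.run(step).returncode
        if code != 0:
            return code
    return 0


# Startup steps mirroring the compose command above.
STARTUP_STEPS = [
    [sys.executable, "manage.py", "wait_for_db"],
    [sys.executable, "manage.py", "migrate"],
    ["daphne", "-b", "0.0.0.0", "-p", "8001", "demo.asgi:application"],
]
```

A hypothetical entrypoint.py would end with sys.exit(run_steps(STARTUP_STEPS)), and the service's command would then be python entrypoint.py.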