Install pydrill in Docker image

I have an Alpine-based Dockerfile that installs several packages with conda. At the end it installs pydrill with pip, because there is no conda package for it.

from jcrist/alpine-dask

RUN /opt/conda/bin/conda update -n base -c defaults conda -y
RUN /opt/conda/bin/conda update dask
RUN /opt/conda/bin/conda install -c conda-forge dask-ml
RUN /opt/conda/bin/conda install scikit-learn -y
RUN /opt/conda/bin/conda install flask -y
RUN /opt/conda/bin/conda install waitress -y
RUN /opt/conda/bin/conda install gunicorn -y
RUN /opt/conda/bin/conda install pytest -y
RUN /opt/conda/bin/conda install apscheduler -y
RUN /opt/conda/bin/conda install matplotlib -y
RUN /opt/conda/bin/conda install pyodbc -y

USER root
RUN apk update
RUN apk add py-pip
RUN pip install pydrill

When I build the Docker image, everything works. But when I run the container, the command line starts gunicorn, which fails with the following message:

  File "/code/app/service/cm/exec/run_drill.py", line 1, in <module>
    from pydrill.client import PyDrill
ModuleNotFoundError: No module named 'pydrill'

Is this pip install correct? Here is the docker-compose file:

version: "3.0"
services:

  web:
    image: img-dask
    volumes:
      - vol_py_code:/code
      - vol_dask_data:/data
      - vol_dask_model:/model
    ports:
      - "5000:5000"
    working_dir: /code
    environment:
      - app.config=/code/conf/py.app.json
      - common.config=/code/conf/py.common.json     
    entrypoint:
      - /opt/conda/bin/gunicorn
    command:
      - -b 0.0.0.0:5000
      - --reload
      - app.frontend.app:app


  scheduler:
    image: img-dask
    ports:
      - "8787:8787"
      - "8786:8786"
    entrypoint:
      - /opt/conda/bin/dask-scheduler

  worker:
    image: img-dask
    depends_on:
      - scheduler
    environment:
      - PYTHONPATH=/code
      - MODEL_PATH=/model/rfc_model.pkl
      - PREPROCESSING_PATH=/model/data_columns.pkl
      - SCHEDULER_ADDRESS=scheduler
      - SCHEDULER_PORT=8786
    volumes:
      - vol_py_code:/code
      - vol_dask_data:/data
      - vol_dask_model:/model
    entrypoint:
      - /opt/conda/bin/dask-worker
    command:
      - scheduler:8786
      
volumes:
  vol_py_code:
     name: vol_py_code
  vol_dask_data:
     name: vol_dask_data
  vol_dask_model:
     name: vol_dask_model
  

Update

If I open a shell inside the container, I can see that pydrill is installed, but my code does not see the library.

/code/conf # pip3 list
Package    Version  
---------- ---------
certifi    2020.12.5
chardet    4.0.0    
idna       2.10     
pip        18.1     
pydrill    0.3.4    
requests   2.25.1   
setuptools 40.6.2   
urllib3    1.26.4   
You are using pip version 18.1, however version 21.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Could you try installing pip with conda install pip instead of apk?

Something like:

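from jcrist/alpine-dask
# WORKDIR does not affect command lookup, so put conda's bin directory on PATH instead
ENV PATH="/opt/conda/bin:${PATH}"

RUN conda update -n base -c defaults conda -y
RUN conda update dask -y
RUN conda install -c conda-forge dask-ml -y
RUN conda install scikit-learn flask waitress gunicorn \
    pytest apscheduler matplotlib pyodbc pip -y
# pip now resolves to conda's pip, so pydrill lands next to gunicorn
RUN pip install pydrill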

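The problem is that pydrill and all the other conda packages live in different environments. pip from apk installs into Alpine's system Python, while gunicorn runs under conda's Python. When the server starts, it sees only the conda packages, not pydrill.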
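To check this from inside the container, one option is to ask each interpreter where it would load a module from. The following is a minimal sketch and not part of the original setup; the function name `locate_module` and the file name `where_is.py` are made up for illustration.

```python
import importlib.util
import sys


def locate_module(name):
    """Return where this interpreter would load a top-level module from.

    Returns None when the interpreter running this script cannot find it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Namespace packages have no single origin file.
    return spec.origin or "<namespace package>"


if __name__ == "__main__":
    # sys.executable shows which Python is actually running this script.
    print("interpreter:", sys.executable)
    for name in ("pydrill", "flask"):
        print(name, "->", locate_module(name) or "NOT FOUND")
```

Running it once as `/opt/conda/bin/python where_is.py` and once as `python3 where_is.py` should, if the diagnosis above holds, show that only the apk-installed interpreter finds pydrill, while only conda's interpreter finds flask.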

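To fix this, install pip itself into the conda environment. The Dockerfile below creates a separate environment at /pyenv, so the executables end up in /pyenv/bin and the compose entrypoints would need to point there instead of /opt/conda/bin: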

from jcrist/alpine-dask

USER root
RUN /opt/conda/bin/conda create -p /pyenv -y
RUN /opt/conda/bin/conda install -p /pyenv dask scikit-learn flask waitress gunicorn \
    pytest apscheduler matplotlib pyodbc -y
RUN /opt/conda/bin/conda install -p /pyenv -c conda-forge dask-ml -y
RUN /opt/conda/bin/conda install -p /pyenv pip -y
RUN /pyenv/bin/pip install pydrill
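A mismatch like this can also be caught at build time rather than when the container starts. Below is a rough sketch of an import smoke test; the helper names `missing_modules` and `check` are hypothetical, and the list of modules to verify is up to you.

```python
import importlib


def missing_modules(names):
    """Return the names that the current interpreter fails to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def check(names):
    """Exit with a non-zero status if any of the modules cannot be imported."""
    missing = missing_modules(names)
    if missing:
        # A string argument makes the process exit with status 1.
        raise SystemExit("missing modules: " + ", ".join(missing))
    print("all modules importable:", ", ".join(names))
```

Calling `check(["pydrill", "flask", "dask"])` at the end of such a script and running it with the same interpreter that gunicorn uses (here presumably `/pyenv/bin/python`) in a `RUN` step would make the build fail as soon as a package ends up in the wrong environment.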

I have packaged pydrill for conda-forge, so you can simply run conda install -c conda-forge pydrill.