Why does pip (using --find-links) still collect some dependencies from the internet in a multi-stage Docker build, even though wheels exist for them?

I have a multi-stage Dockerfile: I build wheels for the project's dependencies in a python image, then copy the wheels into an alpine image and run pip install -r ./wheels/requirements.txt --find-links ./wheels

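It seems to install most packages from the wheels, except for certain deps such as numpy, spacy, and gensim, for which it goes out to the internet to collect their zips/tars. Why can't pip find them through the links? The wheels are there. Normally I wouldn't mind, but installing these dependencies on alpine takes a very long time.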

Here is my Dockerfile:

FROM python:3.6.10 as builder
ENV PYTHONUNBUFFERED 1
WORKDIR /wheels

COPY ./requirements.txt /wheels/
RUN pip install -U pip \
    && pip wheel -r ./requirements.txt

FROM python:3.6.10-alpine
ENV PYTHONUNBUFFERED=1

RUN apk update
RUN apk add --no-cache \
            --upgrade \
            --repository http://dl-cdn.alpinelinux.org/alpine/edge/main \
        make \
        automake \
        gcc \
        g++ \
        subversion \
        python3-dev \
        gettext \
        libpq \
        postgresql-client \
    && rm -rf /var/cache/apk/*

COPY --from=builder /wheels /wheels
RUN pip install -U pip \
   && pip install -r ./wheels/requirements.txt --find-links ./wheels \
   && rm -rf /wheels \
   && rm -rf /root/.cache/pip/*


WORKDIR /app
COPY ./ /app

COPY ./docker/entrypoint.sh /
ENTRYPOINT [ "/entrypoint.sh" ] 

And my requirements.txt:

asgiref==3.2.3
Django==3.0.2
luhn==0.2.0
nltk==3.4.5
numpy==1.18.1
psycopg2==2.8.4
pytest==5.3.5
pytz==2019.3
spacy==2.2.3
sqlparse==0.3.0
yapf==0.29.0
gensim==3.8.1

Here is a sample of the log output:

Looking in links: ./wheels
Processing /wheels/asgiref-3.2.3-py2.py3-none-any.whl
Processing /wheels/Django-3.0.2-py3-none-any.whl
Processing /wheels/luhn-0.2.0-py3-none-any.whl
Processing /wheels/nltk-3.4.5-py3-none-any.whl
Collecting numpy==1.18.1
  Downloading numpy-1.18.1.zip (5.4 MB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Processing /wheels/psycopg2-2.8.4-cp36-cp36m-linux_x86_64.whl
Processing /wheels/pytest-5.3.5-py3-none-any.whl
Processing /wheels/pytz-2019.3-py2.py3-none-any.whl
Collecting spacy==2.2.3
  Downloading spacy-2.2.3.tar.gz (5.9 MB)

Basically, I want numpy and spacy to be processed straight from the wheels, just like Django and the other deps.

numpy, spacy, and gensim are all packages that combine Python and Cython and interface with C/C++. On "plain" (glibc-based) Linux distributions, these packages are precompiled as binary wheels, which pip downloads and installs directly.

On Alpine Linux, however, these packages are not available in binary form, so they must be compiled from source on the Alpine target during installation.

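The reason is that the binary wheel platform standard (manylinux, PEP 513) does not support Alpine Linux. Linux wheel binaries are built for the manylinux target, which is not compatible with musl-libc-based targets such as Alpine and Void Linux. Alpine has no wheel tag of its own, so there is no prebuilt wheel that pip recognizes as an Alpine binary, and pip on Alpine rejects the manylinux wheels collected in the glibc-based builder stage even though they sit in ./wheels. Instead, pip has to fetch the package source and build it on the target.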

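The other packages are pure Python, so their wheels are used as-is. (psycopg2 is the one exception in your log: its wheel carries the generic linux_x86_64 tag rather than a manylinux tag, which pip on Alpine also accepts.)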
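To make the tag mismatch concrete, here is a minimal sketch of the idea, not pip's actual logic (pip also checks the Python and ABI tags and the CPU architecture). The helper names `parse_wheel_filename` and `platform_ok_on_alpine` are made up for illustration, and the numpy wheel filename is an assumption about what the builder stage put in /wheels, since it does not appear in your log.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python: str
    abi: str
    platform: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split 'name-version[-build]-python-abi-platform.whl' into its parts."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # the build tag is optional
        raise ValueError(f"unexpected wheel filename: {filename}")
    python, abi, platform = parts[-3:]
    return WheelTags(parts[0], parts[1], python, abi, platform)


def platform_ok_on_alpine(filename: str) -> bool:
    """Simplified check of the platform tag only (hypothetical helper).

    Assumes a pip without any musl-specific tag: 'any' and the generic
    'linux_*' tags pass, while 'manylinux*' tags (glibc-only) do not.
    """
    platforms = parse_wheel_filename(filename).platform.split(".")
    return any(p == "any" or p.startswith("linux_") for p in platforms)


if __name__ == "__main__":
    for wheel in (
        "Django-3.0.2-py3-none-any.whl",               # from the log
        "psycopg2-2.8.4-cp36-cp36m-linux_x86_64.whl",  # from the log
        "numpy-1.18.1-cp36-cp36m-manylinux1_x86_64.whl",  # assumed filename
    ):
        print(wheel, platform_ok_on_alpine(wheel))
```

If you want to see the real list instead of this approximation, running `pip debug --verbose` inside the alpine image should print the tags that pip considers compatible there.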

You can find more details on this issue, along with its current status, in the following PyPA GitHub thread:
https://github.com/pypa/manylinux/issues/37