Numpy from alpine package repo fails to import c-extensions

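I'm building a Docker image that needs pandas and numpy, but installing them through pip takes around 20 minutes, which is too long for my use case. I then opted to install pandas and numpy from the Alpine package repo, but numpy doesn't seem to import correctly.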

Here is my Dockerfile:

# syntax=docker/dockerfile:experimental
FROM python:3.9.5-alpine as base

FROM base as builder
RUN apk add build-base gcc musl-dev

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --target="/install" django

FROM base
RUN apk add py3-pandas py3-numpy

COPY --from=builder /install /usr/local/lib/python3.9/site-packages

ENV PYTHONPATH "${PYTHONPATH}:/usr/lib/python3.9/site-packages"

CMD ["python"]

When I try to import pandas, which depends on numpy, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/site-packages/pandas/__init__.py", line 16, in <module>
    raise ImportError(
ImportError: Unable to import required dependencies:
numpy: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.9 from "/usr/local/bin/python"
  * The NumPy version is: "1.20.3"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: No module named 'numpy.core._multiarray_umath'

And this is the error when importing numpy directly:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/numpy/core/__init__.py", line 22, in <module>
    from . import multiarray
  File "/usr/lib/python3.9/site-packages/numpy/core/multiarray.py", line 12, in <module>
    from . import overrides
  File "/usr/lib/python3.9/site-packages/numpy/core/overrides.py", line 7, in <module>
    from numpy.core._multiarray_umath import (
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/site-packages/numpy/__init__.py", line 145, in <module>
    from . import core
  File "/usr/lib/python3.9/site-packages/numpy/core/__init__.py", line 48, in <module>
    raise ImportError(msg)
ImportError: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.9 from "/usr/local/bin/python"
  * The NumPy version is: "1.20.3"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: No module named 'numpy.core._multiarray_umath'

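I'm at my wits' end trying to figure out what I've missed and what I'm doing wrong. I've tried the troubleshooting tips at the URL given in the traceback, but they don't seem to resolve the issue.

Any help is greatly appreciated.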

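I know this question has been around for a while, and you may have already found a solution or moved from Alpine to another distro. But I ran into the same problem, and this was the first thing that came up when I searched. So, after spending a couple of hours finding a solution, I think it's worth documenting here.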

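The problem is (obviously) with the numpy and pandas packages. I used the pre-built packages from the community repo and ran into the same issue as you, so apparently the build process itself introduces the problem. Specifically, if you look under numpy/core in the install location (/usr/lib/python3.9/site-packages), you'll find that all the C extensions have .cpython-39-x86_64-linux-musl in their names. So, for example, the module you're having trouble with, numpy.core._multiarray_umath, is named _multiarray_umath.cpython-39-x86_64-linux-musl.so, not just _multiarray_umath.so. Removing .cpython-39-x86_64-linux-musl from those filenames fixes the problem (edit: see the P.S. for details).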

After installing py3-pandas and py3-numpy, you can add the following line to your Dockerfile to fix it:

RUN find /usr/lib/python3.9/site-packages -iname "*.so" -exec sh -c 'x="{}"; mv "$x" "${x/cpython-39-x86_64-linux-musl./}"' \;
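
For anyone who would rather not rely on shell parameter substitution, the same rename can be sketched in Python. This is only an illustration of the idea: the function name strip_platform_tag and its dry_run parameter are hypothetical names of my own, and the tag is the one observed in the filenames described here.

```python
from pathlib import Path

# Platform tag observed in the Alpine-packaged extension filenames.
TAG = "cpython-39-x86_64-linux-musl."


def strip_platform_tag(root, tag=TAG, dry_run=False):
    """Rename every *.so under root whose name contains tag, dropping the tag.

    Returns a list of (old_path, new_path) pairs. With dry_run=True nothing
    is renamed; the pairs only show what would happen.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(Path(root).rglob("*.so")):
        if tag not in path.name:
            continue  # leave files without the tag untouched
        target = path.with_name(path.name.replace(tag, "", 1))
        if not dry_run:
            path.rename(target)
        renamed.append((path, target))
    return renamed


if __name__ == "__main__":
    # Preview the renames first, then call again without dry_run to apply.
    for old, new in strip_platform_tag(
        "/usr/lib/python3.9/site-packages", dry_run=True
    ):
        print(old.name, "->", new.name)
```

Running it with dry_run=True first shows which files would be touched before anything is changed.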

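P.S.: After looking into the problem further, I found the culprit: for some reason, the Python that is running under Alpine thinks its full platform extension suffix (available from importlib.machinery.EXTENSION_SUFFIXES) should be cpython-39-x86_64-linux-gnu.so instead of cpython-39-x86_64-linux-musl.so. I don't believe it was built against glibc, but who knows. So you can also just change musl to gnu in the names of those shared objects, and it will work too. I'm not sure why the extension suffix generated at build time differs from the one Python uses at runtime.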
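
To see the mismatch for yourself, here is a small sketch that checks whether a given filename is one the interpreter would accept for a module. It assumes, as I understand the import machinery, that an extension module is looked up as the module name plus one of the entries in importlib.machinery.EXTENSION_SUFFIXES. The helper names candidate_filenames and would_be_found are made up for illustration.

```python
import importlib.machinery


def candidate_filenames(module_name, suffixes=None):
    """List the filenames the interpreter would accept for an extension module.

    suffixes defaults to the running interpreter's EXTENSION_SUFFIXES; pass a
    list explicitly to reason about a different interpreter.
    """
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    return [module_name + suffix for suffix in suffixes]


def would_be_found(filename, module_name, suffixes=None):
    """Return True if filename is one of the accepted names for module_name."""
    return filename in candidate_filenames(module_name, suffixes)


if __name__ == "__main__":
    # Run this inside the container to see what the interpreter expects.
    print(importlib.machinery.EXTENSION_SUFFIXES)
    print(candidate_filenames("_multiarray_umath"))
    print(
        would_be_found(
            "_multiarray_umath.cpython-39-x86_64-linux-musl.so",
            "_multiarray_umath",
        )
    )
```

Note that a plain endswith check against the suffix list would be misleading here: the musl-tagged file does end in .so, but the part before that suffix is not the module name, which is why the import fails.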