Headless Chrome on Docker M1 error - unable to discover open window in chrome

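I am currently trying to run headless Chrome with Selenium in an amd64 Ubuntu container on an M1 Mac host.

Because arm Ubuntu has no google-chrome-stable package, I decided to use an amd64 Ubuntu base image.

But it does not work. Session creation fails with the following errors.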

worker_1    | [2021-10-31 03:58:23,286: DEBUG/ForkPoolWorker-10] POST http://localhost:43035/session {"capabilities": {"firstMatch": [{}], "alwaysMatch": {"browserName": "chrome", "pageLoadStrategy": "normal", "goog:chromeOptions": {"extensions": [], "args": ["--no-sandbox", "--disable-dev-shm-usage", "--disable-gpu", "--remote-debugging-port=9222", "--headless"]}}}, "desiredCapabilities": {"browserName": "chrome", "pageLoadStrategy": "normal", "goog:chromeOptions": {"extensions": [], "args": ["--no-sandbox", "--disable-dev-shm-usage", "--disable-gpu", "--remote-debugging-port=9222", "--headless"]}}}
worker_1    | [2021-10-31 03:58:23,330: DEBUG/ForkPoolWorker-10] Starting new HTTP connection (1): localhost:43035
worker_1    | [2021-10-31 03:58:41,311: DEBUG/ForkPoolWorker-12] http://localhost:47089 "POST /session HTTP/1.1" 500 717
worker_1    | [2021-10-31 03:58:41,412: DEBUG/ForkPoolWorker-12] Finished Request
worker_1    | [2021-10-31 03:58:41,825: WARNING/ForkPoolWorker-12] Error occurred while initializing chromedriver - Message: unknown error: unable to discover open window in chrome
worker_1    |   (Session info: headless chrome=95.0.4638.69)
worker_1    | Stacktrace:
worker_1    | #0 0x004000a18f93 <unknown>
worker_1    | #1 0x0040004f3908 <unknown>
worker_1    | #2 0x0040004d3cdf <unknown>
worker_1    | #3 0x00400054cabe <unknown>
worker_1    | #4 0x004000546973 <unknown>
worker_1    | #5 0x00400051cdf4 <unknown>
worker_1    | #6 0x00400051dde5 <unknown>
worker_1    | #7 0x004000a482be <unknown>
worker_1    | #8 0x004000a5dba0 <unknown>
worker_1    | #9 0x004000a49215 <unknown>
worker_1    | #10 0x004000a5efe8 <unknown>
worker_1    | #11 0x004000a3d9db <unknown>
worker_1    | #12 0x004000a7a218 <unknown>
worker_1    | #13 0x004000a7a398 <unknown>
worker_1    | #14 0x004000a956cd <unknown>
worker_1    | #15 0x004002b29609 <unknown>
worker_1    | 
worker_1    | [2021-10-31 03:58:41,826: WARNING/ForkPoolWorker-12] 
worker_1    | 
worker_1    | [2021-10-31 03:58:41,867: DEBUG/ForkPoolWorker-11] http://localhost:58147 "POST /session HTTP/1.1" 500 717
worker_1    | [2021-10-31 03:58:41,907: DEBUG/ForkPoolWorker-11] Finished Request
worker_1    | [2021-10-31 03:58:41,946: DEBUG/ForkPoolWorker-12] Using selector: EpollSelector
worker_1    | [WDM] - 
worker_1    | 
worker_1    | [2021-10-31 03:58:41,962: INFO/ForkPoolWorker-12] 
worker_1    | 
worker_1    | [WDM] - ====== WebDriver manager ======
worker_1    | [2021-10-31 03:58:41,971: INFO/ForkPoolWorker-12] ====== WebDriver manager ======
worker_1    | [2021-10-31 03:58:42,112: WARNING/ForkPoolWorker-11] Error occurred while initializing chromedriver - Message: unknown error: unable to discover open window in chrome
worker_1    |   (Session info: headless chrome=95.0.4638.69)
worker_1    | Stacktrace:
worker_1    | #0 0x004000a18f93 <unknown>
worker_1    | #1 0x0040004f3908 <unknown>
worker_1    | #2 0x0040004d3cdf <unknown>
worker_1    | #3 0x00400054cabe <unknown>
worker_1    | #4 0x004000546973 <unknown>
worker_1    | #5 0x00400051cdf4 <unknown>
worker_1    | #6 0x00400051dde5 <unknown>
worker_1    | #7 0x004000a482be <unknown>
worker_1    | #8 0x004000a5dba0 <unknown>
worker_1    | #9 0x004000a49215 <unknown>
worker_1    | #10 0x004000a5efe8 <unknown>
worker_1    | #11 0x004000a3d9db <unknown>
worker_1    | #12 0x004000a7a218 <unknown>
worker_1    | #13 0x004000a7a398 <unknown>
worker_1    | #14 0x004000a956cd <unknown>
worker_1    | #15 0x004002b29609 <unknown>
worker_1    | 
worker_1    | [2021-10-31 03:58:42,113: WARNING/ForkPoolWorker-11] 
worker_1    | 
worker_1    | [2021-10-31 03:58:42,166: DEBUG/ForkPoolWorker-11] Using selector: EpollSelector
worker_1    | [WDM] - 
worker_1    | 
worker_1    | [2021-10-31 03:58:42,169: INFO/ForkPoolWorker-11] 
worker_1    | 
worker_1    | [WDM] - ====== WebDriver manager ======
worker_1    | [2021-10-31 03:58:42,170: INFO/ForkPoolWorker-11] ====== WebDriver manager ======
worker_1    | [2021-10-31 03:58:42,702: DEBUG/ForkPoolWorker-9] http://localhost:51793 "POST /session HTTP/1.1" 500 866
worker_1    | [2021-10-31 03:58:42,719: DEBUG/ForkPoolWorker-9] Finished Request
worker_1    | [2021-10-31 03:58:42,986: WARNING/ForkPoolWorker-9] Error occurred while initializing chromedriver - Message: unknown error: Chrome failed to start: crashed.
worker_1    |   (chrome not reachable)
worker_1    |   (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
worker_1    | Stacktrace:
worker_1    | #0 0x004000a18f93 <unknown>
worker_1    | #1 0x0040004f3908 <unknown>
worker_1    | #2 0x004000516b32 <unknown>
worker_1    | #3 0x00400051265d <unknown>
worker_1    | #4 0x00400054c770 <unknown>
worker_1    | #5 0x004000546973 <unknown>
worker_1    | #6 0x00400051cdf4 <unknown>
worker_1    | #7 0x00400051dde5 <unknown>
worker_1    | #8 0x004000a482be <unknown>
worker_1    | #9 0x004000a5dba0 <unknown>
worker_1    | #10 0x004000a49215 <unknown>
worker_1    | #11 0x004000a5efe8 <unknown>
worker_1    | #12 0x004000a3d9db <unknown>
worker_1    | #13 0x004000a7a218 <unknown>
worker_1    | #14 0x004000a7a398 <unknown>
worker_1    | #15 0x004000a956cd <unknown>
worker_1    | #16 0x004002b29609 <unknown>
worker_1    | 
worker_1    | [2021-10-31 03:58:42,987: WARNING/ForkPoolWorker-9] 
worker_1    | 
worker_1    | [2021-10-31 03:58:43,045: DEBUG/ForkPoolWorker-9] Using selector: EpollSelector
worker_1    | [WDM] - 
worker_1    | 
worker_1    | [2021-10-31 03:58:43,049: INFO/ForkPoolWorker-9] 
worker_1    | 
worker_1    | [WDM] - ====== WebDriver manager ======
worker_1    | [2021-10-31 03:58:43,050: INFO/ForkPoolWorker-9] ====== WebDriver manager ======
worker_1    | [2021-10-31 03:58:43,936: DEBUG/ForkPoolWorker-10] http://localhost:43035 "POST /session HTTP/1.1" 500 866
worker_1    | [2021-10-31 03:58:43,952: DEBUG/ForkPoolWorker-10] Finished Request
worker_1    | [2021-10-31 03:58:44,163: WARNING/ForkPoolWorker-10] Error occurred while initializing chromedriver - Message: unknown error: Chrome failed to start: crashed.
worker_1    |   (chrome not reachable)
worker_1    |   (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
worker_1    | Stacktrace:
worker_1    | #0 0x004000a18f93 <unknown>
worker_1    | #1 0x0040004f3908 <unknown>
worker_1    | #2 0x004000516b32 <unknown>
worker_1    | #3 0x00400051265d <unknown>
worker_1    | #4 0x00400054c770 <unknown>
worker_1    | #5 0x004000546973 <unknown>
worker_1    | #6 0x00400051cdf4 <unknown>
worker_1    | #7 0x00400051dde5 <unknown>
worker_1    | #8 0x004000a482be <unknown>
worker_1    | #9 0x004000a5dba0 <unknown>
worker_1    | #10 0x004000a49215 <unknown>
worker_1    | #11 0x004000a5efe8 <unknown>
worker_1    | #12 0x004000a3d9db <unknown>
worker_1    | #13 0x004000a7a218 <unknown>
worker_1    | #14 0x004000a7a398 <unknown>
worker_1    | #15 0x004000a956cd <unknown>
worker_1    | #16 0x004002b29609 <unknown>
worker_1    | 
worker_1    | [2021-10-31 03:58:44,164: WARNING/ForkPoolWorker-10] 
worker_1    | 
worker_1    | [2021-10-31 03:58:44,205: DEBUG/ForkPoolWorker-10] Using selector: EpollSelector
worker_1    | [WDM] - 
worker_1    | 
worker_1    | [2021-10-31 03:58:44,215: INFO/ForkPoolWorker-10] 
worker_1    | 
worker_1    | [WDM] - ====== WebDriver manager ======
worker_1    | [2021-10-31 03:58:44,217: INFO/ForkPoolWorker-10] ====== WebDriver manager ======
worker_1    | [WDM] - Current google-chrome version is 95.0.4638
worker_1    | [2021-10-31 03:58:44,520: INFO/ForkPoolWorker-12] Current google-chrome version is 95.0.4638
worker_1    | [WDM] - Get LATEST driver version for 95.0.4638
worker_1    | [2021-10-31 03:58:44,525: INFO/ForkPoolWorker-12] Get LATEST driver version for 95.0.4638
worker_1    | [WDM] - Current google-chrome version is 95.0.4638
worker_1    | [2021-10-31 03:58:44,590: INFO/ForkPoolWorker-11] Current google-chrome version is 95.0.4638
worker_1    | [WDM] - Get LATEST driver version for 95.0.4638
worker_1    | [2021-10-31 03:58:44,593: INFO/ForkPoolWorker-11] Get LATEST driver version for 95.0.4638
worker_1    | [2021-10-31 03:58:44,599: DEBUG/ForkPoolWorker-12] Starting new HTTPS connection (1): chromedriver.storage.googleapis.com:443
worker_1    | [2021-10-31 03:58:44,826: DEBUG/ForkPoolWorker-11] Starting new HTTPS connection (1): chromedriver.storage.googleapis.com:443
worker_1    | [2021-10-31 03:58:45,205: DEBUG/ForkPoolWorker-11] https://chromedriver.storage.googleapis.com:443 "GET /LATEST_RELEASE_95.0.4638 HTTP/1.1" 200 12
worker_1    | [2021-10-31 03:58:45,213: DEBUG/ForkPoolWorker-12] https://chromedriver.storage.googleapis.com:443 "GET /LATEST_RELEASE_95.0.4638 HTTP/1.1" 200 12
worker_1    | [WDM] - Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1    | [2021-10-31 03:58:45,219: INFO/ForkPoolWorker-11] Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1    | [WDM] - Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1    | [2021-10-31 03:58:45,242: INFO/ForkPoolWorker-12] Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1    | [WDM] - Current google-chrome version is 95.0.4638
worker_1    | [2021-10-31 03:58:45,603: INFO/ForkPoolWorker-9] Current google-chrome version is 95.0.4638
worker_1    | [WDM] - Get LATEST driver version for 95.0.4638
worker_1    | [2021-10-31 03:58:45,610: INFO/ForkPoolWorker-9] Get LATEST driver version for 95.0.4638

Similar logs repeat in a loop.

When I tried to launch Chrome directly inside the Docker container, this error occurred.

ubuntu@742a62c61201:/backend$ google-chrome --no-sandbox --disable-dev-shm-usage --disable-gpu --remote-debugging-port=9222 --headless
qemu: uncaught target signal 5 (Trace/breakpoint trap) - core dumped
qemu: uncaught target signal 5 (Trace/breakpoint trap) - core dumped
[1031/041139.297323:ERROR:bus.cc(392)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[1031/041139.310612:ERROR:file_path_watcher_linux.cc(326)] inotify_init() failed: Function not implemented (38)

DevTools listening on ws://127.0.0.1:9222/devtools/browser/32b15b93-3fe0-4cb8-9c96-8aea011686a8
qemu: unknown option 'type=utility'
[1031/041139.463057:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.463227:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 1 time(s)
[1031/041139.543335:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
qemu: unknown option 'type=utility'
[1031/041139.718793:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.718877:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 2 time(s)
[1031/041139.736641:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
qemu: unknown option 'type=utility'
[1031/041139.788529:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.788615:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 3 time(s)
[1031/041139.798487:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
[1031/041139.808256:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.808372:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 4 time(s)
qemu: unknown option 'type=utility'
[1031/041139.825267:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.825354:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 5 time(s)
[1031/041139.830175:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
[1031/041139.839159:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.839345:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 6 time(s)
[1031/041139.839816:FATAL:gpu_data_manager_impl_private.cc(417)] GPU process isn't usable. Goodbye.
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
ubuntu@742a62c61201:/backend$ qemu: unknown option 'type=utility'

ubuntu@742a62c61201:/backend$ 
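
The `qemu:` lines in the output above suggest that the amd64 Chrome binary is running under QEMU emulation rather than natively. As a rough sketch, captured output can be checked for those markers before blaming Selenium or chromedriver. The helper name `looks_emulated` and the marker strings are illustrative choices copied from the log above, not part of any tool.

```python
# Hypothetical helper: flag Chrome output that contains QEMU emulation errors.
# The marker strings are taken from the terminal output in this question.
QEMU_MARKERS = (
    "qemu: uncaught target signal",
    "qemu: unknown option",
)


def looks_emulated(output: str) -> bool:
    """Return True if the captured output contains any known QEMU marker."""
    return any(marker in output for marker in QEMU_MARKERS)


sample = """\
qemu: uncaught target signal 5 (Trace/breakpoint trap) - core dumped
[1031/041139.463057:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
qemu: unknown option 'type=utility'
"""
print(looks_emulated(sample))
```

If this returns true for the output of `google-chrome`, the crash is more likely an emulation problem than a wrong Chrome flag.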

Maybe this issue is related? https://github.com/docker/for-mac/issues/5766

If so, is there no way to dockerize headless Chrome on an M1?

Celery worker Dockerfile

FROM --platform=linux/amd64 ubuntu:20.04

ENV DEBIAN_FRONTEND noninteractive

RUN apt update -y && apt install python3.9 python3-pip python-is-python3 sudo wget -y

RUN pip install --upgrade pip

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

RUN adduser --disabled-password --gecos '' ubuntu
RUN adduser ubuntu sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers

USER ubuntu

RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
RUN echo "deb http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee /etc/apt/sources.list.d/google.list
RUN sudo apt update -y && sudo apt install -y google-chrome-stable

ENV PATH="/home/ubuntu/.local/bin:$PATH"

WORKDIR /backend

COPY requirements.txt ./

RUN pip install -r requirements.txt --no-cache-dir

COPY . .

ENV DISPLAY=:99

ENTRYPOINT [ "./run-celery.sh" ]

docker-compose.yml

version: "3.3"

services:
  frontend:
    build:
      context: ./frontend
    ports:
      - "3000:3000"
    volumes:
      - ./frontend:/frontend
    depends_on:
      - backend
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 512M
    tty: true
    stdin_open: true

  backend:
    build: ./backend
    ports:
      - "8000:8000"
    volumes:
      - ./backend:/backend
    networks:
      - redis-network
    depends_on:
      - redis
      - worker
    environment:
      - is_docker=1
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 512M
    tty: true

  worker:
    build:
      context: ./backend
      dockerfile: ./celery-dockerfile/Dockerfile
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 4G
    volumes:
      - ./backend:/backend
    networks:
      - redis-network
    depends_on:
      - redis
    environment:
      - is_docker=1
    privileged: true
    tty: true
    platform: linux/amd64

  redis:
    image: redis:alpine
    command: redis-server --port 6379
    container_name: redis_server
    hostname: redis_server
    labels:
      - "name=redis"
      - "mode=standalone"
    networks:
      - redis-network
    expose:
      - "6379"
    tty: true

networks:
  redis-network:

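The full crawler code is in the AutoCrawler repository. If you want the complete crawler code, it is best to look there.

I changed the options during trial and error.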

from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--remote-debugging-port=9222')

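I think there is no way to use chrome/chromium on M1 Docker.

  • There is no Chrome binary for arm64 Linux.
  • Chrome crashes when running in an amd64 container on an M1 host - docker docs
  • Chromium can be installed using snap, but the snap service does not run on Docker (without snap you get a 127 error, because the binary from apt is empty) - issue report

What I tried

Chromium supports arm Ubuntu, so I tried using Chromium instead of Chrome.

However, chromedriver does not officially support arm64; I used an unofficial binary from the Electron releases.

Workaround

In the end, I decided to use geckodriver and Firefox inside Docker.

It works seamlessly regardless of host/container architecture.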
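
A minimal sketch of what the Firefox workaround could look like on the Selenium side. It assumes `selenium`, Firefox, and geckodriver are already installed in the container and that geckodriver is on `PATH`. The function names are hypothetical, and Selenium is imported lazily so the module still loads where it is not installed.

```python
def build_firefox_args(headless: bool = True) -> list:
    """Return the command-line arguments to pass to Firefox."""
    return ["-headless"] if headless else []


def make_firefox_driver():
    """Start Firefox through geckodriver (assumed to be on PATH)."""
    # Imported here so this module can be loaded without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for arg in build_firefox_args():
        options.add_argument(arg)
    return webdriver.Firefox(options=options)
```

Inside the container, `driver = make_firefox_driver()` followed by `driver.get(...)` and a final `driver.quit()` would be the typical usage.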

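I found the answer!

The key is to match the container's architecture to the host's. This can be done by not specifying a platform, so that Docker uses the host's native architecture (arm64 on an M1).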
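
To check which architecture a container actually runs as, `platform.machine()` can be called inside it. The sketch below maps that value to a browser package with a native build for it. The mapping and the name `native_browser` are illustrative assumptions based on the packages mentioned in this thread.

```python
import platform

# Hypothetical mapping from machine names to the browser packages
# discussed in this thread (Chrome has no arm64 Linux build).
NATIVE_BROWSER = {
    "x86_64": "google-chrome-stable",
    "amd64": "google-chrome-stable",
    "aarch64": "chromium",
    "arm64": "chromium",
}


def native_browser(machine: str):
    """Return a browser package with a native build, or None if unknown."""
    return NATIVE_BROWSER.get(machine.lower())


# Inside the container this reports the container's architecture,
# which differs from the host's when a platform is forced.
machine = platform.machine()
print(machine, "->", native_browser(machine))
```

In a container built without a platform setting on an M1, this would be expected to report an arm64 machine name, while a forced `linux/amd64` container reports `x86_64` and runs under emulation.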

I installed Chromium from the Debian package server, as mentioned at https://askubuntu.com/questions/1204571/how-to-install-chromium-without-snap (especially the approach from https://www.inx.one/blog/debian-repo-on-ubuntu).

Dockerfile:

FROM ubuntu:20.04

ENV DEBIAN_FRONTEND noninteractive

RUN apt update -y && apt install python3.9 python3-pip python-is-python3 libgl1-mesa-glx axel sudo gdebi-core -y

RUN pip install --upgrade pip

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

WORKDIR /backend

COPY requirements.txt ./

RUN pip install -r requirements.txt --no-cache-dir

RUN umask 22 && \
    echo 'Package: *\nPin: release a=eoan\nPin-Priority: 500\n\nPackage: *\nPin: origin "ftp.debian.org"\nPin-Priority: 300\n\nPackage: chromium*\nPin: origin "ftp.debian.org"\nPin-Priority: 700\n\nPackage: libwebpmux3\nPin: origin "*.debian.org"\nPin-Priority: 700' \
    > /etc/apt/preferences.d/chromium.pref && \
    echo 'deb http://deb.debian.org/debian buster main\ndeb http://deb.debian.org/debian buster-updates main\ndeb http://deb.debian.org/debian-security buster/updates main\n' \
    > /etc/apt/sources.list.d/debian.list && \
    echo 'deb [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable main\ndeb-src [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable main\n\ndeb [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian-security/ stable-security main\ndeb-src [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian-security/ stable-security main\n\ndeb [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable-updates main\ndeb-src [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable-updates main\n' \ 
    > /etc/apt/sources.list.d/debian-stable.list

RUN sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys DCC9EFBF77E11517 && \
    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 648ACFD622F3D138 && \
    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys AA8E81B4331F7F50 && \
    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 112695A0E562B32A

RUN apt install -y debian-archive-keyring && \
    apt update -y && \
    apt install chromium-sandbox chromium chromium-driver -y

COPY . .

ENTRYPOINT [ "./run-celery.sh" ]

Another solution: just use Debian.

Dockerfile:

FROM python:3.9
# the official python image is Debian-based

ENV DEBIAN_FRONTEND noninteractive

RUN pip install --upgrade pip

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

WORKDIR /backend

COPY requirements.txt ./

RUN pip install -r requirements.txt --no-cache-dir

COPY . .

RUN apt update -y && apt install libgl1-mesa-glx sudo chromium chromium-driver -y

ENTRYPOINT [ "./run-celery.sh" ]
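
With Chromium and its driver installed from apt, Selenium can be pointed at those binaries directly instead of downloading a driver. The log in the question shows webdriver-manager using a `linux64` chromedriver, which would presumably not match an arm64 container. The sketch below assumes Selenium 4 and the usual Debian install paths; both paths and the function name are assumptions, so verify them with `which chromium chromedriver` in the container.

```python
# Assumed install locations of the Debian chromium / chromium-driver packages.
CHROMIUM_BINARY = "/usr/bin/chromium"
CHROMEDRIVER_BINARY = "/usr/bin/chromedriver"

# Flags from the question, with --headless added for a display-less container.
CHROME_ARGS = [
    "--headless",
    "--no-sandbox",
    "--disable-dev-shm-usage",
    "--disable-gpu",
]


def make_chromium_driver():
    """Start the apt-installed Chromium with the apt-installed chromedriver."""
    # Imported here so this module can be loaded without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.binary_location = CHROMIUM_BINARY
    for arg in CHROME_ARGS:
        options.add_argument(arg)
    service = Service(executable_path=CHROMEDRIVER_BINARY)
    return webdriver.Chrome(service=service, options=options)
```

Because the driver and the browser come from the same package source, their versions and architecture should match, which avoids a separate driver download at startup.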

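As mentioned in the official documentation, there is now a community-driven repo that provides images for ARM64, ARM/v7, and AMD64:

https://github.com/seleniumhq-community/docker-seleniarm#experimental-mult-arch-aarch64armhfamd64-images

Switching to these images solved my problem with only minimal configuration changes.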
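
When the browser runs in a separate standalone Selenium container, the client connects to it over HTTP instead of starting a local driver. A rough sketch, assuming Selenium 4 on the client and a server listening on port 4444 with the `/wd/hub` path. The host name, port, and function names are assumptions to adapt to your compose file.

```python
def grid_url(host: str = "localhost", port: int = 4444) -> str:
    """Build the WebDriver endpoint URL (host and port are assumptions)."""
    return f"http://{host}:{port}/wd/hub"


def make_remote_driver(url: str):
    """Open a session on a remote Selenium server at the given URL."""
    # Imported here so this module can be loaded without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Remote(command_executor=url, options=options)
```

From another service on the same compose network, the call would look like `make_remote_driver(grid_url("selenium"))`, where `selenium` is a hypothetical service name.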