Stream deployment gets stuck and then fails with no errors in the logs

I am trying to deploy the following streams:

STREAM_2=:messages > filter --expression="#jsonPath(payload, '$.id')==1" | rabbit --queues=id_1 --host=rabbitmq --routing-key=id_1 --exchange=ex_1 --own-connection=true
STREAM_3=:messages > filter --expression="#jsonPath(payload, '$.id')==2" | rabbit --queues=id_2 --host=rabbitmq --routing-key=id_2 --exchange=ex_1
STREAM_4=:messages > filter --expression="#jsonPath(payload, '$.id')==3" | rabbit --queues=id_3 --host=rabbitmq --routing-key=id_3 --exchange=ex_1
STREAM_1=rabbit --queues=hello_queue --host=rabbitmq > :messages

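I listen on a queue and then route each message to a different queue based on one of its properties.

I am running a local system using this docker-compose.yml, but I switched to RabbitMQ instead of Kafka for messaging.

When I deploy the streams, it takes a few minutes for the dataflow-server container to reach its maximum memory usage, and the deployment eventually fails on a random stream (sometimes the container is killed as well).

The logs (stdout and stderr) do not show any errors.

I am running the latest versions, as follows: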

DATAFLOW_VERSION=2.0.1.RELEASE SKIPPER_VERSION=2.0.0.RELEASE docker-compose up

Another thing I noticed is that I keep getting this warning in the logs:

2019-03-27 09:35:00.485 WARN 70 --- [| adminclient-1] org.apache.kafka.clients.NetworkClient : [AdminClient clientId=adminclient-1] Connection to node -1 could not be established. Broker may not be available.

This happens even though my docker-compose.yml contains nothing related to Kafka. Any idea where it is coming from?

The relevant parts of my YAML:

version: '3'

services:
  mysql:
    image: mysql:5.7.25
    environment:
      MYSQL_DATABASE: dataflow
      MYSQL_USER: root
      MYSQL_ROOT_PASSWORD: rootpw
    expose:
      - 3306
  dataflow-server:
    image: springcloud/spring-cloud-dataflow-server:${DATAFLOW_VERSION:?DATAFLOW_VERSION is not set!}
    container_name: dataflow-server
    ports:
      - "9393:9393"
    environment:
      - spring.datasource.url=jdbc:mysql://mysql:3306/dataflow
      - spring.datasource.username=root
      - spring.datasource.password=rootpw
      - spring.datasource.driver-class-name=org.mariadb.jdbc.Driver
      - spring.cloud.skipper.client.serverUri=http://skipper-server:7577/api
      - spring.cloud.dataflow.applicationProperties.stream.spring.rabbitmq.host=rabbitmq
    depends_on:
      - rabbitmq

  rabbitmq:
    image: "rabbitmq:3-management"
    ports:
      - "5672:5672"
      - "15672:15672"
    expose:
      - "5672"

  app-import:
    ...

  skipper-server:
    image: springcloud/spring-cloud-skipper-server:${SKIPPER_VERSION:?SKIPPER_VERSION is not set!}
    container_name: skipper
    ports:
    - "7577:7577"
    - "9000-9010:9000-9010"

volumes:
  scdf-targets:

It looks like I am a victim of the OOM killer. The container crashes with exit code 137.
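Exit code 137 fits that diagnosis: exit codes above 128 conventionally mean "terminated by signal (code - 128)", and 137 - 128 = 9 is SIGKILL, the signal the kernel OOM killer sends. A minimal sketch of how this could be checked follows. The function names `signal_from_exit_code` and `was_oom_killed` are made up for illustration, and the container name `skipper` is taken from the compose file above.

```shell
# Map an exit code to the name of the signal that ended the process, if any.
# Exit codes above 128 conventionally mean "killed by signal (code - 128)".
signal_from_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    kill -l "$((code - 128))"
  else
    echo "none"
  fi
}

signal_from_exit_code 137   # 137 - 128 = 9, i.e. SIGKILL

# Illustrative helper (not called here, since it needs a Docker daemon):
# ask Docker whether the container was stopped by the kernel OOM killer.
was_oom_killed() {
  docker inspect --format '{{.State.OOMKilled}}' "$1"
}
# Example: was_oom_killed skipper
```

SIGKILL alone does not prove an out-of-memory kill, because `docker kill` sends the same signal. The `OOMKilled` field reported by `docker inspect` is the more direct evidence.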

For now, the simplest solution for me is to increase the memory available to Docker:

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
9a0e3ff0beb8        dataflow-server     0.18%               1.293GiB / 11.71GiB   11.04%              573kB / 183kB       92.1MB / 279kB      49
2a448b3583a3        scdf_kafka_1        7.00%               291.6MiB / 11.71GiB   2.43%               4.65MB / 3.64MB     40.4MB / 36.9kB     73
eb9a70ce2a0e        scdf_rabbitmq_1     2.15%               94.21MiB / 11.71GiB   0.79%               172kB / 92.5kB      41.7MB / 139kB      128
06dd2d6a1501        scdf_zookeeper_1    0.16%               81.72MiB / 11.71GiB   0.68%               77.8kB / 99.2kB     36.7MB / 45.1kB     25
1f1b782ad66d        skipper             8.64%               6.55GiB / 11.71GiB    55.93%              3.63MB / 4.73MB     213MB / 0B          324

The skipper container is now using 6.55GiB of memory. If anyone knows what could be causing this, I would be grateful.
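To see which container is consuming the memory without reading the whole table, the `docker stats` output can be reduced to name and memory percentage and then sorted. This is only a sketch: the helper name `top_memory_container` is hypothetical, and the sample input reuses the figures from the table above.

```shell
# Print the name of the container with the highest memory percentage.
# Input (stdin): lines of "NAME<TAB>MEM%", as produced by
#   docker stats --no-stream --format '{{.Name}}\t{{.MemPerc}}'
top_memory_container() {
  LC_ALL=C sort -t "$(printf '\t')" -k2,2nr | head -n 1 | cut -f1
}

# Sample run with three rows from the table above (no Docker needed):
printf 'dataflow-server\t11.04%%\nscdf_rabbitmq_1\t0.79%%\nskipper\t55.93%%\n' \
  | top_memory_container

# Against a live daemon, the same helper could be fed directly:
#   docker stats --no-stream --format '{{.Name}}\t{{.MemPerc}}' | top_memory_container
```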

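For now, I am accepting my own answer because it does provide a workaround, although I feel there may be a better solution than increasing Docker's memory limit.

Edit:

It looks like this is indeed the solution, according to this GitHub issue: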

Stream components (parts of the pipe) are deployed as applications. Those applications are deployed into the Skipper container (as well as the Skipper application itself) since skipper deploys streams. The more applications that get deployed (parts of the pipe, streams, etc) the more memory is used.
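To get a feel for how many applications that means for the four streams in the question, the definitions can be split on `|` and `>`, skipping named destinations such as `:messages`. The sketch below is deliberately naive: it assumes no `|` or `>` appears inside an expression, and `count_apps` is a made-up helper name, not part of Data Flow.

```shell
# Count the apps in a set of stream definitions (one definition per line).
# Segments are separated by '|' or '>'; segments starting with ':' are
# named destinations, not apps. Naive: assumes no '|' or '>' inside options.
count_apps() {
  awk -F'[|>]' '
    {
      for (i = 1; i <= NF; i++) {
        seg = $i
        gsub(/^[ \t]+|[ \t]+$/, "", seg)      # trim surrounding whitespace
        if (seg != "" && seg !~ /^:/) n++     # skip named destinations
      }
    }
    END { print n + 0 }'
}

# The four definitions from the question, without the STREAM_n= prefixes.
count_apps <<'EOF'
:messages > filter --expression="#jsonPath(payload, '$.id')==1" | rabbit --queues=id_1 --host=rabbitmq --routing-key=id_1 --exchange=ex_1 --own-connection=true
:messages > filter --expression="#jsonPath(payload, '$.id')==2" | rabbit --queues=id_2 --host=rabbitmq --routing-key=id_2 --exchange=ex_1
:messages > filter --expression="#jsonPath(payload, '$.id')==3" | rabbit --queues=id_3 --host=rabbitmq --routing-key=id_3 --exchange=ex_1
rabbit --queues=hello_queue --host=rabbitmq > :messages
EOF
```

This counts seven apps (three filters, three rabbit sinks and one rabbit source). Assuming each one runs as its own process next to Skipper itself, several gigabytes in the skipper container is not surprising.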