Setting up sunbird-telemetry, Kafka, Druid and Superset

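I am trying to build an analytics dashboard based on mobile events. I want to run each component in its own Docker container, deploy everything on localhost, and build the dashboard on top of it.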

  1. Sunbird telemetry: https://github.com/project-sunbird/sunbird-telemetry-service
  2. Kafka: https://github.com/wurstmeister/kafka-docker
  3. Druid: https://github.com/apache/incubator-druid/tree/master/distribution/docker
  4. Superset: https://github.com/apache/incubator-superset

What I have done

Druid

  1. I ran docker build -t apache/incubator-druid:tag -f distribution/docker/Dockerfile .

  2. I ran docker-compose -f distribution/docker/docker-compose.yml up

  3. Once everything had started, I opened http://localhost:4008/ and saw Druid running.

The build and startup took 3.5 hours to complete.

Kafka

  1. Navigated to the kafka folder
  2. Ran docker-compose up -d

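Problem
When we start Druid, a ZooKeeper instance starts running, and when we start Kafka, its compose file starts another ZooKeeper. I cannot establish a connection between Kafka and ZooKeeper. After I started Sunbird telemetry and tried to create a topic and connect to Kafka from Sunbird, it did not connect.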

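I don't understand what I am doing wrong. Can we tell Kafka to share the ZooKeeper started by Druid? I am completely new to this environment and these stacks, and I am still learning them. Can anyone point out how to correctly connect Kafka and Druid through Docker?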

Note: I am running on a Mac.

My Kafka compose file:

version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    ports:
      - "9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: **localhost ip**
      KAFKA_ZOOKEEPER_CONNECT: **localhost ip**:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

> Can we tell kafka to share the zookeeper started by DRUID

You can put all the services in the same compose file.

Druid's Kafka connection is listed here:

https://github.com/apache/incubator-druid/blob/master/distribution/docker/environment#L31

So yes, you can set KAFKA_ZOOKEEPER_CONNECT to the same address.

For example, download the file above and add Kafka to the Druid compose file:

version: "2.2"

volumes:
  metadata_data: {}
  middle_var: {}
  historical_var: {}
  broker_var: {}
  coordinator_var: {}
  overlord_var: {}
  router_var: {}

services:

  # TODO: Add sunbird

  postgres:
    container_name: postgres
    image: postgres:latest
    volumes:
      - metadata_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=FoolishPassword
      - POSTGRES_USER=druid
      - POSTGRES_DB=druid

  # Need 3.5 or later for container nodes
  zookeeper:
    container_name: zookeeper
    image: zookeeper:3.5
    environment:
      - ZOO_MY_ID=1

  druid-coordinator:
    image: apache/incubator-druid
    container_name: druid-coordinator
    volumes:
      - coordinator_var:/opt/druid/var
    depends_on: 
      - zookeeper
      - postgres
    ports:
      - "3001:8081"
    command:
      - coordinator
    env_file:
      - environment

  # renamed to druid-broker
  druid-broker: 
    image: apache/incubator-druid
    container_name: druid-broker
    volumes:
      - broker_var:/opt/druid/var
    depends_on: 
      - zookeeper
      - postgres
      - druid-coordinator
    ports:
      - "3002:8082"
    command:
      - broker
    env_file:
      - environment

  # TODO: add other Druid services

  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092"
    depends_on: 
      - zookeeper
    environment:
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181/kafka  # This is the same service that Druid is using

> Can we tell kafka to share the zookeeper started by DRUID

Yes, because a Kafka broker has a zookeeper.connect setting that specifies the ZooKeeper address Kafka will try to connect to. How you set it depends entirely on the Docker image you are using. For example, one popular image, wurstmeister/kafka-docker, does this by mapping all environment variables starting with KAFKA_ to broker settings and adding them to server.properties, so that KAFKA_ZOOKEEPER_CONNECT becomes the zookeeper.connect setting. I suggest taking a look at the official documentation to see what else you can configure.
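To make that mapping concrete, here is a minimal sketch of a compose service fragment. It assumes the image's usual convention of dropping the KAFKA_ prefix, lowercasing the rest, and turning underscores into dots. The host names zookeeper and kafka and the third setting are illustrative choices, not something the image requires.

```yaml
# Fragment of a compose service definition, not a complete file.
kafka:
  image: wurstmeister/kafka
  environment:
    # Expected to end up as zookeeper.connect=zookeeper:2181 in server.properties
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    # Expected to end up as advertised.host.name=kafka
    KAFKA_ADVERTISED_HOST_NAME: kafka
    # Illustrative extra setting: auto.create.topics.enable=false
    KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"
```

If a setting does not seem to take effect, inspecting the generated server.properties inside the running container is a quick way to check what the broker actually received.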

> and when we start kafka the docker file starts another zookeeper

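This is your problem. That docker-compose file starts ZooKeeper and Kafka, and configures Kafka to use the bundled ZooKeeper. You need to modify it by removing the bundled ZooKeeper and configuring Kafka to use a different one. Ideally, you should have a single docker-compose file that starts the whole setup.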
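If you would rather keep two compose files for now, a rough sketch of the modified Kafka file could look like the following. It assumes the Druid stack is already running, that its ZooKeeper is reachable as zookeeper on the Druid compose network, and that this network is called docker_default (Compose normally derives the name from the project directory, so check docker network ls and adjust it).

```yaml
version: '2'
services:
  # The bundled zookeeper service has been removed.
  kafka:
    build: .
    ports:
      - "9092"
    environment:
      # Other containers on the shared network reach the broker by this name
      KAFKA_ADVERTISED_HOST_NAME: kafka
      # Points at the ZooKeeper container started by the Druid compose file
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

networks:
  default:
    external:
      # Assumed network name; replace with the one shown by `docker network ls`
      name: docker_default
```

With this layout the broker advertises itself as kafka, so any client such as the Sunbird telemetry service would need to run on the same Docker network to resolve that name.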