Kafka container stopped after some time, client session timed out
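I have one ZooKeeper node and two Kafka brokers running in a Docker environment. I was able to start ZooKeeper and both Kafka brokers and run them successfully (producers and consumers could connect and send/receive data), but after some time (maybe a day) one of the brokers stopped. Below are the last logs from the Kafka server that stopped.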
[2021-10-14 16:15:23,553] INFO [GroupCoordinator 2]: Preparing to rebalance group console-consumer-95901 in state PreparingRebalance with old generation 1 (__consumer_offsets-35) (reason: removing member consumer-console-consumer-95901-1-66524e7c-561d-49f8-882e-93e5ee9732fa on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
[2021-10-14 16:15:23,553] INFO [GroupCoordinator 2]: Group console-consumer-95901 with generation 2 is now empty (__consumer_offsets-35) (kafka.coordinator.group.GroupCoordinator)
[2021-10-14 16:23:09,577] INFO [GroupMetadataManager brokerId=2] Group console-consumer-95901 transitioned to Dead in generation 2 (kafka.coordinator.group.GroupMetadataManager)
[2021-10-15 02:04:23,654] WARN Client session timed out, have not heard from server in 15654ms for sessionid 0x10005a177990003 (org.apache.zookeeper.ClientCnxn)
[2021-10-15 02:05:35,005] INFO Client session timed out, have not heard from server in 15654ms for sessionid 0x10005a177990003, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
These are the last logs from ZooKeeper:
[2021-10-15 02:04:54,812] INFO Expiring session 0x10005a177990003, timeout of 18000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2021-10-15 02:05:28,649] INFO Expiring session 0x10005a177990002, timeout of 18000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2021-10-15 02:07:27,106] WARN CancelledKeyException causing close of session 0x10005a177990002 (org.apache.zookeeper.server.NIOServerCnxn)
[2021-10-15 02:14:44,252] INFO Invalid session 0x10005a177990002 for client /172.18.0.3:36926, probably expired (org.apache.zookeeper.server.ZooKeeperServer)
I can't fully understand what happened, but it looks like the broker could no longer communicate with ZooKeeper for some reason.
Below is my docker-compose file:
version: '3.0'
services:
  zookeeper-1:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181
  kafka1:
    image: confluentinc/cp-kafka:latest
    container_name: kafka1
    depends_on:
      - zookeeper-1
    ports:
      - 29092:29092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka1:9092,PLAINTEXT_HOST://my-hostname-here:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_HEAP_OPTS: -Xmx512M -Xms512M
  kafka2:
    image: confluentinc/cp-kafka:latest
    container_name: kafka2
    depends_on:
      - zookeeper-1
    ports:
      - 29093:29093
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka2:9092,PLAINTEXT_HOST://my-hostname-here:29093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_HEAP_OPTS: -Xmx512M -Xms512M
Below is the status of the containers.
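In the docker ps output you can see exit code 137. That is the exit code of an OOM-killed container (128 + 9, i.e. the process received SIGKILL), which means the container needs more memory.
I suggest removing KAFKA_HEAP_OPTS and letting the JVM be bounded by the container's full available memory space.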
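To make the exit-code reasoning concrete, here is a minimal shell sketch that decodes an exit code such as 137 into the signal that ended the process. The function name explain_exit_code is hypothetical and only for illustration. The commented docker inspect line assumes the stopped broker's container is named kafka2, as in the compose file from the question.

```shell
#!/usr/bin/env bash
# Exit codes above 128 mean "terminated by signal (code - 128)".
# explain_exit_code is a hypothetical helper name, not part of Docker.
explain_exit_code() {
  local code="$1"
  if [ "$code" -gt 128 ]; then
    local sig=$((code - 128))
    # kill -l N prints the name of signal number N (e.g. KILL for 9).
    echo "exit code $code: killed by signal $sig (SIG$(kill -l "$sig"))"
  else
    echo "exit code $code: process exited on its own"
  fi
}

explain_exit_code 137

# SIGKILL alone does not prove an out-of-memory kill (a manual "docker kill"
# also produces 137). Docker records the OOM flag separately; assuming the
# container name kafka2, it can be read with:
#   docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' kafka2
```

If the inspect command reports true for OOMKilled, the memory explanation above applies. If it reports false, the container was killed for some other reason and the heap settings are less likely to be the cause.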