Kafka doesn't work with external NFS Volume

I am trying to run Kafka with an NFS volume mounted, but I get an exception and Kafka fails to start:

[2020-03-15 09:36:11,580] ERROR There was an error in one of the threads during logs loading: org.apache.kafka.common.KafkaException: Found directory /var/lib/kafka/data/.snapshot, '.snapshot' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
Kafka's log directories (and children) should only contain Kafka topic data. (kafka.log.LogManager)
[2020-03-15 09:36:11,582] ERROR [KafkaServer id=1] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: Found directory /var/lib/kafka/data/.snapshot, '.snapshot' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
Kafka's log directories (and children) should only contain Kafka topic data.
        at kafka.log.Log$.exception(Log.scala:2150)
        at kafka.log.Log$.parseTopicPartitionName(Log.scala:2157)
        at kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:260)
        at kafka.log.LogManager$$anonfun$loadLogs$$anonfun$$anonfun$apply$$anonfun$apply.apply$mcV$sp(LogManager.scala:345)
        at kafka.utils.CoreUtils$$anon.run(CoreUtils.scala:63)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

This is my docker-compose script:

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.3.2
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
    volumes:
      - zk-data:/var/lib/zookeeper/data:nocopy
      - zk-log:/var/lib/zookeeper/log:nocopy

  kafka:
    image: confluentinc/cp-kafka:5.3.2
    environment:
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    volumes:
      - kf-data:/var/lib/kafka/data:nocopy


volumes:
  zk-data:
    driver: local
    driver_opts:
      type: "nfs"
      o: addr=18.0.3.227 #IP of NFS
      device: ":/opt/data/zk-data"
  zk-log:
    driver: local
    driver_opts:
      type: "nfs"
      o: addr=18.0.3.227
      device: ":/opt/data/zk-log"
  kf-data:
    driver: local
    driver_opts:
      type: "nfs"
      o: addr=18.0.3.227
      device: ":/opt/data/kf-data"

If I look at the export on my NFS server:

ls -la /opt/data/kf-data/.snapshot

total 80
drwxrwxrwx 33 root   root         12288 Mar 28 00:10 .
drwx------  2 root domain^users  4096 Feb 21 19:20 ..
drwx------  2 root domain^users  4096 Feb 13 11:06 daily.2020-02-14_0010
drwx------  2 root domain^users  4096 Feb 13 11:06 daily.2020-02-15_0010
drwx------  2 root domain^users  4096 Feb 13 11:06 daily.2020-02-16_0010
drwx------  2 root domain^users  4096 Feb 13 11:06 daily.2020-02-17_0010
drwx------  2 root domain^users  4096 Feb 21 19:20 snapmirror.ka938443-8ea1-22e8-6608-00a067d1a20a_2148891236.2020-02-27_180700

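There is a hidden folder named .snapshot. It is generated automatically by NFS and cannot be removed. That is why Kafka complains: Found directory /var/lib/kafka/data/.snapshot, '.snapshot' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).

This may be a general problem with Kafka. Is there any special configuration or workaround that lets Kafka use an external NFS volume?

Any ideas would be appreciated!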

There is a hidden folder named .snapshot, this folder is generated by NFS automatically and can not be removed

Well, if there is no way around that, then Kafka will not be able to start, period.


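As far as I know, nothing in the documentation says that remotely attached storage is supported.

If you are using NetApp as your NFS platform, this may help: disabling .snapshot access in NetApp is a global vFilter function. It cannot be set per folder or per share.

If access to .snapshot cannot be turned off, there is no workaround other than switching to a different NFS platform, one that does not create a .snapshot folder in every directory.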

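As mentioned above, Kafka on NFS is a flawed solution because of the way NFS file systems work. You will run into problems with repartitioning and scaling. This has to do with how NFS handles the deletion of open files (the "silly rename" behavior). You can read about it in this blog post (Kafka on NFS).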
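The silly-rename behavior can be seen with a small shell sketch. This is not Kafka code; the function name `read_after_unlink` and the scratch file name are hypothetical. When the directory passed in is on an NFS mount, the client typically replaces the deleted-but-open file with a `.nfs…` placeholder that stays in the directory until the descriptor is closed. On a local file system the listing is simply empty.

```shell
# Hypothetical helper: delete a file while it is still open, then read it.
read_after_unlink() {
  dir="$1"
  # Create a scratch file in the target directory.
  f=$(mktemp "$dir/silly-rename-demo.XXXXXX")
  printf 'segment data\n' > "$f"
  exec 3< "$f"        # keep the file open on descriptor 3
  rm "$f"             # delete it while it is still open
  ls -A "$dir" >&2    # on NFS a .nfs... placeholder is usually listed here
  cat <&3             # the data is still readable through the descriptor
  exec 3<&-           # closing the descriptor releases the file
}
```

Run it as `read_after_unlink /path/on/nfs/mount`. A leftover placeholder of this kind is what can prevent a directory from being removed while a process still holds files in it open.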

Have you tried not using the root directory? The .snapshot directory is only accessible at the root.
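A minimal sketch of that suggestion, assuming the cp-kafka image passes `KAFKA_LOG_DIRS` through to the broker's `log.dirs` setting and that the broker user is allowed to create a subdirectory on the NFS export. The subdirectory name `logs` is hypothetical. Whether .snapshot stays hidden below the export root depends on the NFS platform, so treat this as something to try rather than a guaranteed fix.

```yaml
services:
  kafka:
    image: confluentinc/cp-kafka:5.3.2
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      # Assumption: point log.dirs at a subdirectory so Kafka does not scan
      # the volume root where .snapshot lives. "logs" is a hypothetical name.
      KAFKA_LOG_DIRS: /var/lib/kafka/data/logs
    volumes:
      - kf-data:/var/lib/kafka/data:nocopy
```

The volume definition itself stays as in the question; only the directory Kafka scans at startup changes.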