KafkaStreams: Error changing permissions for the directory /var/data/state-store

I am running Kafka Streams 3.1.0 on an AWS OCP cluster, and I get this error when the pod restarts:

10:33:18,529 [INFO ] Loaded Kafka Streams properties {topology.optimization=all, processing.guarantee=at_least_once, bootstrap.servers=PLAINTEXT://app-kafka-headless.app.svc.cluster.local:9092, state.dir=/var/data/state-store, metrics.recording.level=INFO, consumer.auto.offset.reset=earliest, cache.max.bytes.buffering=10485760, producer.compression.type=lz4, num.stream.threads=3, application.id=AppProcessor}
10:33:18,572 [ERROR] Error changing permissions for the directory /var/data/state-store
java.nio.file.FileSystemException: /var/data/state-store: Operation not permitted
    at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
    at java.base/sun.nio.fs.UnixFileAttributeViews$Posix.setMode(Unknown Source)
    at java.base/sun.nio.fs.UnixFileAttributeViews$Posix.setPermissions(Unknown Source)
    at java.base/java.nio.file.Files.setPosixFilePermissions(Unknown Source)
    at org.apache.kafka.streams.processor.internals.StateDirectory.configurePermissions(StateDirectory.java:154)
    at org.apache.kafka.streams.processor.internals.StateDirectory.<init>(StateDirectory.java:144)
    at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:867)
    at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:851)
    at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:821)
    at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:733)
    at com.xyz.app.kafka.streams.AbstractProcessing.run(AbstractProcessing.java:54)
    at com.xyz.app.kafka.streams.AppProcessor.main(AppProcessor.java:97)
10:33:18,964 [INFO ] Topologies:
   Sub-topology: 0
    Source: app-stream (topics: [app-app-stream])
      --> KSTREAM-AGGREGATE-0000000002
    Processor: KSTREAM-AGGREGATE-0000000002 (stores: [KSTREAM-AGGREGATE-STATE-STORE-0000000001])
      --> none
      <-- app-stream
10:33:18,991 [WARN ] stream-thread [main] Failed to delete state store directory of /var/data/state-store/AppProcessor for it is not empty

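On the OCP cluster, the user that runs the application is assigned by the cluster, and the state store is backed by a persistent volume (so the pod can restart in the same context). As a result, the /var/data/state-store/ folder has the following permissions, drwxrwsr-x. (u:root g:1001030000):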

1001030000@app-processor-0:/$ ls -al /var/data/state-store/
total 24
drwxrwsr-x. 4 root       1001030000  4096 Mar 21 10:43 .
drwxr-xr-x. 3 root       root          25 Mar 23 11:04 ..
drwxr-x---. 2 1001030000 1001030000  4096 Mar 23 11:04 AppProcessor
drwxrws---. 2 root       1001030000 16384 Mar 21 10:36 lost+found

1001030000@app-processor-0:/$ chmod 750 /var/data/state-store/
chmod: changing permissions of '/var/data/state-store/': Operation not permitted

The relevant part of the pod manifest is:

spec:
  containers:
  - name: app-processor
    volumeMounts:
    - mountPath: /var/data/state-store
      name: data
    securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
  securityContext:
    fsGroup: 1001030000
    runAsUser: 1001030000
    seLinuxOptions:
      level: s0:c32,c19
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-app-processor-0

How should this be handled? Should we use a subPath on the volumeMount?

Thanks for your insights.

As suggested, the fix I found was to set a subPath alongside the mountPath.

Here is the relevant part of the Helm template I used:

spec:
  containers:
  - name: app-processor
    volumeMounts:
    - name: data
      mountPath: {{ dir .Values.streams.state_dir | default "/var/data/" }}
      subPath: {{ base .Values.streams.state_dir | default "state-store" }}

Here .Values.streams.state_dir maps to the Kafka Streams property state.dir. Note that this value is required and must be set in the values file.
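To make the split concrete: the template's dir and base calls take the parent and the last element of the path. A minimal Java sketch of the same split, using the state.dir value from the logs above and assuming a Unix-like filesystem (the class and method names are made up for illustration):

```java
import java.nio.file.Paths;

// Hypothetical helper, for illustration only: mirrors what the template's
// `dir` and `base` calls produce for a given state.dir value.
public class StateDirSplit {
    // Parent directory, used as the volumeMount mountPath.
    public static String mountPath(String stateDir) {
        return Paths.get(stateDir).getParent().toString();
    }

    // Last path element, used as the subPath inside the volume.
    public static String subPath(String stateDir) {
        return Paths.get(stateDir).getFileName().toString();
    }

    public static void main(String[] args) {
        String stateDir = "/var/data/state-store";
        System.out.println("mountPath: " + mountPath(stateDir));
        System.out.println("subPath: " + subPath(stateDir));
    }
}
```

With state.dir=/var/data/state-store, the volume's state-store subdirectory ends up mounted at /var/data, so /var/data/state-store itself is no longer the mount point.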

With this setup, the state-store directory is created by the securityContext.runAsUser user instead of root, so the org.apache.kafka.streams.processor.internals.StateDirectory class can enforce the permissions.
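The underlying rule is that only the owner of a directory (or a privileged user) may change its mode, which is why the call succeeds once the directory is created by the running user. The following is a small diagnostic sketch, not Kafka's own code: it makes the same Files.setPosixFilePermissions call that appears in the stack trace. The class name is hypothetical, and the rwxr-x--- mode is an assumption taken from the AppProcessor entry in the listing above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical diagnostic, for illustration only.
public class StateDirPermissionCheck {
    // Assumed target mode, matching the drwxr-x--- seen on AppProcessor.
    static final Set<PosixFilePermission> MODE =
            PosixFilePermissions.fromString("rwxr-x---");

    // Returns true if the current user is allowed to change the directory's mode.
    public static boolean canSetPermissions(Path dir) {
        try {
            Files.setPosixFilePermissions(dir, MODE);
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            System.err.println("Cannot change permissions for " + dir + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Check the directory given as an argument, or a fresh temp directory
        // (owned by the current user) when no argument is passed.
        Path dir = args.length > 0
                ? Paths.get(args[0])
                : Files.createTempDirectory("state-store");
        System.out.println(dir + " owner=" + Files.getOwner(dir)
                + " canSetPermissions=" + canSetPermissions(dir));
    }
}
```

Running it inside the pod against the state.dir path shows whether the directory owner matches the runAsUser user before Kafka Streams starts.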