Storage class in AKS can't chown a directory

I hope you're all doing well.

I'm trying to deploy a CDAP image, built in GitLab, to AKS using Argo CD.

The deployment works in my local Kubernetes cluster with a rook-ceph storage class, but in AKS with the managed premium storage class there seems to be a permissions problem.

This is my storage class:

# The default value for fileMode and dirMode is 0777 for Kubernetes
# version 1.13.0 and above; you can modify as per your need
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile-zrs
provisioner: kubernetes.io/azure-file
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict
parameters:
  skuName: Standard_LRS

This is my StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ .Release.Name }}-sts
  labels:
    app: {{ .Release.Name }}
spec:
  revisionHistoryLimit: 2
  replicas: {{ .Values.replicas }}
  updateStrategy:
    type: RollingUpdate
  serviceName: {{ .Release.Name }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      imagePullSecrets:
        - name: regcred-secret-argo
      volumes:
        - name: nginx-proxy-config
          configMap:
            name: {{ .Release.Name }}-nginx-conf
      containers:
        - name: nginx
          image: nginx:1.17
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
            - containerPort: 8080
          volumeMounts:
            - name: nginx-proxy-config
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
        - name: cdap-sandbox
          image: {{ .Values.containerrepo }}:{{ .Values.containertag }}
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 1000m
              memory: 8Gi
            requests:
              cpu: 500m
              memory: 6000Mi
          readinessProbe:
            httpGet:
              path: /
              port: 11011
            initialDelaySeconds: 30
            periodSeconds: 20  
          volumeMounts:
            - name: {{ .Release.Name }}-data
              mountPath: /opt/cdap/sandbox/data
          ports:
            - containerPort: 11011
            - containerPort: 11015
  volumeClaimTemplates:
    - metadata:
        name: {{ .Release.Name }}-data
      spec:
        accessModes: 
          - ReadWriteMany
        resources:
          requests:
            storage: "300Gi"
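
In case it matters: I don't set a storageClassName in the volumeClaimTemplates, so the claim binds to the cluster's default StorageClass. Pinning it explicitly to the azurefile-zrs class above would look like this:

  volumeClaimTemplates:
    - metadata:
        name: {{ .Release.Name }}-data
      spec:
        storageClassName: azurefile-zrs  # explicit, instead of the cluster default
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: "300Gi"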

The problem is that the CDAP pod can't change the ownership of the directory.
Here are the logs:

Fri Oct 22 13:48:08 UTC 2021 Starting CDAP Sandbox ...LOGBACK: No context given for io.cdap.cdap.logging.framework.local.LocalLogAppender[io.cdap.cdap.logging.framework.local.LocalLogAppender]
log4j:WARN No appenders could be found for logger (DataNucleus.General).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
2021-10-22 13:48:56,030 - ERROR [main:i.c.c.StandaloneMain@446] - Failed to start Standalone CDAP
Failed to start Standalone CDAP
com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: Error applying authorization policy on hive configuration: ExitCodeException exitCode=1: chmod: changing permissions of '/opt/cdap/sandbox-6.2.3/data/explore/tmp/cdap/7f546668-0ccc-45ae-8188-9ac12af4c504': Operation not permitted
    at com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1015)
    at com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1001)
    at com.google.common.util.concurrent.AbstractService.startAndWait(AbstractService.java:220)
    at com.google.common.util.concurrent.AbstractIdleService.startAndWait(AbstractIdleService.java:106)
    at io.cdap.cdap.StandaloneMain.startUp(StandaloneMain.java:300)
    at io.cdap.cdap.StandaloneMain.doMain(StandaloneMain.java:436)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at io.cdap.cdap.StandaloneMain.main(StandaloneMain.java:418)
Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: Error applying authorization policy on hive configuration: ExitCodeException exitCode=1: chmod: changing permissions of '/opt/cdap/sandbox-6.2.3/data/explore/tmp/cdap/7f546668-0ccc-45ae-8188-9ac12af4c504': Operation not permitted
    at com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1015)
    at com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1001)
    at com.google.common.util.concurrent.AbstractService.startAndWait(AbstractService.java:220)
    at com.google.common.util.concurrent.AbstractIdleService.startAndWait(AbstractIdleService.java:106)
    at io.cdap.cdap.explore.executor.ExploreExecutorService.startUp(ExploreExecutorService.java:99)
    at com.google.common.util.concurrent.AbstractIdleService.run(AbstractIdleService.java:43)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Error applying authorization policy on hive configuration: ExitCodeException exitCode=1: chmod: changing permissions of '/opt/cdap/sandbox-6.2.3/data/explore/tmp/cdap/7f546668-0ccc-45ae-8188-9ac12af4c504': Operation not permitted
    at org.apache.hive.service.cli.CLIService.init(CLIService.java:114)
    at io.cdap.cdap.explore.service.hive.BaseHiveExploreService.startUp(BaseHiveExploreService.java:309)
    at io.cdap.cdap.explore.service.hive.Hive14ExploreService.startUp(Hive14ExploreService.java:76)
    ... 2 more
Caused by: java.lang.RuntimeException: ExitCodeException exitCode=1: chmod: changing permissions of '/opt/cdap/sandbox-6.2.3/data/explore/tmp/cdap/7f546668-0ccc-45ae-8188-9ac12af4c504': Operation not permitted
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:127)
    at org.apache.hive.service.cli.CLIService.init(CLIService.java:112)
    ... 4 more
Caused by: ExitCodeException exitCode=1: chmod: changing permissions of '/opt/cdap/sandbox-6.2.3/data/explore/tmp/cdap/7f546668-0ccc-45ae-8188-9ac12af4c504': Operation not permitted
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
    at org.apache.hadoop.util.Shell.run(Shell.java:869)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:1264)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:1246)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:771)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:515)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:555)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:533)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:313)
    at org.apache.hadoop.hive.ql.session.SessionState.createPath(SessionState.java:639)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:574)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
    ... 6 more

I don't know why it can't change the permissions.

I'd appreciate any help; I've been stuck on this and the only way out I can see is installing a new provisioner, which I'd really rather not do.

Thanks in advance, and have a great day.

I did some research, which led me to this GitHub issue: https://github.com/Azure/aks-engine/issues/1494

SMB mount options(including dir permission) could not be changed, it's by SMB proto design, while for disk(ext4, xfs) dir permission could be changed after mount close this issue, let me know if you have any question.

As far as I can tell, there is no way to chown after the mount.

However, I also found a workaround that might apply to your problem: https://docs.openshift.com/container-platform/3.11/install_config/persistent_storage/persistent_storage_azure_file.html

This is the workaround for using MySQL with Azure File on OpenShift, but I think it applies to your case as well.

# In the pod spec:
spec:
  containers:
    ...
  securityContext:
    runAsUser: <mounted_dir_uid>

# And in the PersistentVolume (or StorageClass) definition:
mountOptions:
  - dir_mode=0700
  - file_mode=0600
  - uid=<container_process_uid>
  - gid=0
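
Applied to the storage class from your question, the workaround might look something like the sketch below. The uid of 1000 is a placeholder, not something I know about the cdap-sandbox image: check which uid the CDAP process actually runs as (e.g. with id inside the container) and use that value in both places.

# Hypothetical adaptation of the OpenShift workaround; uid 1000 is a placeholder.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile-zrs
provisioner: kubernetes.io/azure-file
mountOptions:
  - dir_mode=0700
  - file_mode=0600
  - uid=1000          # must match runAsUser below
  - gid=0
  - mfsymlinks
  - cache=strict
parameters:
  skuName: Standard_LRS

# And in the StatefulSet pod template:
    spec:
      securityContext:
        runAsUser: 1000  # must match the uid mount option above

Note that this only removes the need to chown; judging by your stack trace, Hive still calls chmod on its session directories, and chmod on a CIFS mount with fixed dir_mode/file_mode will presumably keep failing, so this workaround may not be enough here.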

After a lot of testing, I ended up changing the storage class: I installed rook-ceph using this procedure.

Note: you have to change the image version in cluster.yaml from ceph/ceph:v14.2.4 to ceph/ceph:v16.
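
For reference, a CephFS StorageClass along the lines of the Rook examples looks roughly like this. It's a sketch using the Rook quickstart defaults (the rook-ceph namespace, filesystem myfs, pool myfs-data0, and the operator-created secret names), which may differ in your install. Since CephFS is a real POSIX filesystem, chown and chmod work normally after mount:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com  # <operator-namespace>.cephfs.csi.ceph.com
parameters:
  # Namespace where the Rook/Ceph cluster runs
  clusterID: rook-ceph
  # CephFS filesystem name and data pool from the CephFilesystem CR
  fsName: myfs
  pool: myfs-data0
  # Secrets created by the Rook operator for the CSI driver
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete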