MySQL database in Azure cluster using Azure Files as PV won't start

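I have an Azure Kubernetes cluster, but because of the limit on the number of default volumes that can be attached per node (8 for my node size), I had to find a different way to provision volumes.
The solution was to use Azure Files volumes. I followed this article, https://docs.microsoft.com/en-us/azure/aks/azure-files-volume#mount-options, and the volume mounts successfully.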

The problem is with the MySQL instance: it just won't start.

For testing purposes, I created a deployment with 2 simple DB containers, one using a volume from the default storage class and the other using Azure Files.

Here is my manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-db
  labels:
    prj: test-db
spec:
  selector:
    matchLabels:
      prj: test-db
  template:
    metadata:
      labels:
        prj: test-db
    spec:
      containers:
        - name: db-default
          image: mysql:5.7.37
          imagePullPolicy: IfNotPresent
          args:
            - "--ignore-db-dir=lost+found"
          ports:
            - containerPort: 3306
              name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          volumeMounts:
            - name: default-pv
              mountPath: /var/lib/mysql
              subPath: test

        - name: db-azurefiles
          image: mysql:5.7.37
          imagePullPolicy: IfNotPresent
          args:
            - "--ignore-db-dir=lost+found"
            - "--initialize-insecure"
          ports:
            - containerPort: 3306
              name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          volumeMounts:
            - name: azurefile-pv
              mountPath: /var/lib/mysql
              subPath: test
      volumes:
        - name: default-pv
          persistentVolumeClaim:
            claimName: default-pvc
        - name: azurefile-pv
          persistentVolumeClaim:
            claimName: azurefile-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: default-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 200Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azurefile-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azure-file-store
  resources:
    requests:
      storage: 200Mi
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-file-store
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=0
- gid=0
- mfsymlinks
- cache=strict
- nosharesock
parameters:
  skuName: Standard_LRS
provisioner: file.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

The container with the default PV works without any problem, but the second one, with Azure Files, throws this error:

[Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.
[Note] [Entrypoint]: Switching to dedicated user 'mysql'
[Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.
[Note] [Entrypoint]: Initializing database files
[Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
[Warning] InnoDB: New log files created, LSN=45790
[Warning] InnoDB: Creating foreign key constraint system tables.
[Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: e86bdae0-979b-11ec-abbf-f66bf9455d85.
[Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
mysqld: Can't change permissions of the file 'ca-key.pem' (Errcode: 1 - Operation not permitted)
[ERROR] Could not set file permission for ca-key.pem
[ERROR] Aborting

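Based on the error, it looks like the database can't write to the volume mount, but that's not (entirely) true. I mounted both volumes in another container so I could inspect the files. Here is the output, which shows that the database was able to write files to the volume: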

-rwxrwxrwx 1 root root       56 Feb 27 07:07 auto.cnf
-rwxrwxrwx 1 root root     1680 Feb 27 07:07 ca-key.pem
-rwxrwxrwx 1 root root      215 Feb 27 07:07 ib_buffer_pool
-rwxrwxrwx 1 root root 50331648 Feb 27 07:07 ib_logfile0
-rwxrwxrwx 1 root root 50331648 Feb 27 07:07 ib_logfile1
-rwxrwxrwx 1 root root 12582912 Feb 27 07:07 ibdata1

Obviously some files are missing, but this output disproves my theory that MySQL can't write to the folder.

My guess is that MySQL can't work properly with the file system used by Azure Files.

What I tried:

It looks like the permissions of a volume mounted this way are causing the problem.

If we modify your storage class so that uid and gid match those of the mysql user, the pod can start:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-file-store
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=999
- gid=999
- mfsymlinks
- cache=strict
- nosharesock
parameters:
  skuName: Standard_LRS
provisioner: file.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

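The mount options permanently set the owner of the files contained in the mount, which doesn't work well for anything that wants to own the files it creates. With uid=0 the files belong to root, so the mysql user (UID 999 in this image) presumably fails when it tries to change the permissions of ca-key.pem, which matches the "Operation not permitted" error in your log. Because everything is created with mode 777, anyone can read and write the directories; they just can't own them.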
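To illustrate why writing succeeds while changing permissions fails, here is a minimal Python sketch of the usual POSIX rules. It is a simplified model, not how the CIFS driver or mysqld is actually implemented: it ignores group membership and Linux capabilities, and the names `FileInfo`, `can_write`, and `can_chmod` are made up for this example. The UID 999 for the mysql user is taken from the storage class above.

```python
import stat
from dataclasses import dataclass


@dataclass
class FileInfo:
    owner_uid: int  # owner reported by the mount (the uid= mount option)
    mode: int       # permission bits, e.g. 0o777 from file_mode=0777


def can_write(info: FileInfo, process_uid: int) -> bool:
    """Simplified write check: root, then owner bits, then 'other' bits."""
    if process_uid == 0:
        return True
    if process_uid == info.owner_uid:
        return bool(info.mode & stat.S_IWUSR)
    return bool(info.mode & stat.S_IWOTH)


def can_chmod(info: FileInfo, process_uid: int) -> bool:
    """Only the file's owner (or root) may change its permission bits."""
    return process_uid == 0 or process_uid == info.owner_uid


MYSQL_UID = 999  # assumed UID of the mysql user, as used in the storage class

root_owned = FileInfo(owner_uid=0, mode=0o777)     # mounted with uid=0
mysql_owned = FileInfo(owner_uid=999, mode=0o777)  # mounted with uid=999

# Mounted with uid=0: mysql can write the files but cannot chmod them.
print(can_write(root_owned, MYSQL_UID), can_chmod(root_owned, MYSQL_UID))
# Mounted with uid=999: mysql owns the files, so both checks pass.
print(can_write(mysql_owned, MYSQL_UID), can_chmod(mysql_owned, MYSQL_UID))
```

The first case corresponds to the directory listing in the question: the files exist and are world-writable, but they are owned by root, so a permission change requested by UID 999 is refused.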