Why does EMQX persistence work on local Kubernetes but not on Azure Kubernetes?

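When I use a Kubernetes (minikube) StatefulSet on my local machine, EMQX rules persist because the same pod IP is always assigned to the EMQX node, e.g. /opt/emqx/data/mnesia/emqx@172.17.0.9. Even if I delete the pod, the new pod that starts is assigned the same IP as before. Everything works.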

But when I deploy EMQX on an AKS (Azure Kubernetes) cluster using Azure Files, the pod IP is different every time. For example, if the EMQX node is assigned /opt/emqx/data/mnesia/emqx@10.1.1.10 and I then delete the pod, the new pod may be assigned /opt/emqx/data/mnesia/emqx@10.1.1.11.

So nothing is persisted: after a restart the node looks for its data under a new directory name.

Local code

---

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage5
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

---

apiVersion: v1
kind: PersistentVolume
metadata:
  name: emqx-pv5
spec:
  capacity:
    storage: 300Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage5
  local:
    path: /opt/emqx/data/mnesia
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - minikube

---

apiVersion: v1
kind: Service
metadata:
  name: emqx-headless
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: emqx
  ports:
    - name: mqtt
      port: 1883
      protocol: TCP
      targetPort: 1883
    - name: mqttssl
      port: 8883
      protocol: TCP
      targetPort: 8883
    - name: mgmt
      port: 8081
      protocol: TCP
      targetPort: 8081
    - name: websocket
      port: 8083
      protocol: TCP
      targetPort: 8083
    - name: wss
      port: 8084
      protocol: TCP
      targetPort: 8084
    - name: dashboard
      port: 18083
      protocol: TCP
      targetPort: 18083

---

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx-statefulset
  labels:
    app: emqx
spec:
  replicas: 1
  serviceName: emqx-headless
  selector:
    matchLabels:
      app: emqx
  template:
    metadata:
      labels:
        app: emqx
    spec:
      containers:
        - name: emqx
          image: emqx/emqx:4.2.7
          ports:
            - name: emqx-dashboard
              containerPort: 18083
            - name: ssl-port
              containerPort: 8883
            - name: emqx-port
              containerPort: 1883
            - name: ssl-dashboard
              containerPort: 18084
          env:
            - name: EMQX_LOADED_PLUGINS
              value: emqx_management,emqx_recon,emqx_retainer,emqx_dashboard,emqx_rule_engine,emqx_auth_username
            - name: EMQX_CLUSTER__DISCOVERY
              value: k8s
            - name: EMQX_NAME
              value: emqx
            - name: EMQX_CLUSTER__K8S__APISERVER
              value: https://kubernetes.default:443
            - name: EMQX_CLUSTER__K8S__SERVICE_NAME
              value: emqx
            - name: EMQX_CLUSTER__K8S__ADDRESS_TYPE
              value: ip
            - name: EMQX_CLUSTER__K8S__APP_NAME
              value: emqx
            - name: EMQX_ALLOW_ANONYMOUS
              value: "false"
            - name: EMQX_LISTENER__SSL__EXTERNAL__MAX_CONNECTIONS
              value: "1024000"
            - name: EMQX_AUTH__USER__PASSWORD_HASH
              value: sha256
            - name: EMQX_AUTH__USER__1__USERNAME
              value: 
            - name: EMQX_AUTH__USER__1__PASSWORD
              value: 
            - name: EMQX_DASHBOARD__DEFAULT_USER__LOGIN
              value: 
            - name: EMQX_DASHBOARD__DEFAULT_USER__PASSWORD
              value:
            - name: EMQX_DASHBOARD__LISTENER__HTTPS
              value: "18084"
            - name: EMQX_DASHBOARD__LISTENER__HTTPS__ACCEPTORS
              value: "4"
            - name: EMQX_DASHBOARD__LISTENER__HTTPS__MAX_CLIENTS
              value: "512"
          tty: true
          volumeMounts:
            - name: emqx-mnesia
              mountPath: "/opt/emqx/data/mnesia"

  volumeClaimTemplates:
    - metadata:
        name: emqx-mnesia
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "local-storage5"
        resources:
          requests:
            storage: 300Mi

Azure Kubernetes code

apiVersion: v1
kind: ServiceAccount
metadata:
  name: emqx
  namespace: emqx-test

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: emqx
subjects:
  - kind: ServiceAccount
    name: emqx
    namespace: emqx-test
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: emqx-files
provisioner: kubernetes.io/azure-file
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict
  - actimeo=30
parameters:
  skuName: Standard_LRS
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: emqx-pvc
  namespace: emqx-test
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: emqx-files
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: emqx
  namespace: emqx-test
spec:
  ports:
    - name: emqx-dashboard
      port: 80
      targetPort: 18083
      protocol: TCP
    - name: ssl-port
      port: 8883
      targetPort: ssl-port
      protocol: TCP
    - name: emqx-port
      port: 1883
      targetPort: emqx-port
      protocol: TCP
    - name: ssl-dashboard
      port: 443
      targetPort: 18084
      protocol: TCP
  selector:
    app: emqx
  type: LoadBalancer
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: emqx
  labels:
    app: emqx
  namespace: emqx-test
spec:
  serviceName: "emqx"
  selector:
    matchLabels:
      app: emqx
  replicas: 1
  template:
    metadata:
      labels:
        app: emqx
    spec:
      containers:
        - name: emqx
          image: emqx/emqx:4.2.7
          ports:
            - name: emqx-dashboard
              containerPort: 18083
            - name: ssl-port
              containerPort: 8883
            - name: emqx-port
              containerPort: 1883
            - name: ssl-dashboard
              containerPort: 18084
          env:
            - name: EMQX_LOADED_PLUGINS
              value: emqx_management,emqx_recon,emqx_retainer,emqx_dashboard,emqx_rule_engine,emqx_auth_username
            - name: EMQX_CLUSTER__DISCOVERY
              value: k8s
            - name: EMQX_NAME
              value: emqx
            - name: EMQX_CLUSTER__K8S__APISERVER
              value: https://kubernetes.default:443
            - name: EMQX_CLUSTER__K8S__NAMESPACE
              value: emqx-test
            - name: EMQX_CLUSTER__K8S__SERVICE_NAME
              value: emqx
            - name: EMQX_CLUSTER__K8S__ADDRESS_TYPE
              value: ip
            - name: EMQX_CLUSTER__K8S__APP_NAME
              value: emqx
            - name: EMQX_ALLOW_ANONYMOUS
              value: "false"
            - name: EMQX_LISTENER__SSL__EXTERNAL__MAX_CONNECTIONS
              value: "1024000"
            - name: EMQX_AUTH__USER__PASSWORD_HASH
              value: sha256
            - name: EMQX_AUTH__USER__1__USERNAME
              value:
            - name: EMQX_AUTH__USER__1__PASSWORD
              value:
            - name: EMQX_DASHBOARD__DEFAULT_USER__LOGIN
              value:
            - name: EMQX_DASHBOARD__DEFAULT_USER__PASSWORD
              value:
            - name: EMQX_DASHBOARD__LISTENER__HTTPS
              value: "18084"
            - name: EMQX_DASHBOARD__LISTENER__HTTPS__ACCEPTORS
              value: "4"
            - name: EMQX_DASHBOARD__LISTENER__HTTPS__MAX_CLIENTS
              value: "512"
          volumeMounts:
            - name: emqx-data
              mountPath: "/opt/emqx/data/mnesia"
          tty: true
      volumes:
        - name: emqx-data
          persistentVolumeClaim:
            claimName: emqx-pvc

The Kubernetes documentation on StatefulSet Basics says:

The Pods' ordinals, hostnames, SRV records, and A record names have not changed, but the IP addresses associated with the Pods may have changed. In the cluster used for this tutorial, they have. This is why it is important not to configure other applications to connect to Pods in a StatefulSet by IP address.

So this is expected; as you can see, the documentation describes exactly this behavior.
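To make the quoted point concrete for the manifests in the question: a StatefulSet pod only gets a stable per-pod DNS record when its `serviceName` points at a headless Service. In the Azure manifest, `serviceName` points at the `LoadBalancer` Service instead. The following is a minimal sketch, assuming the `emqx-test` namespace is kept and a headless Service is added next to the existing one. The name `emqx-headless` is borrowed from the local manifest, and `cluster.local` is the default cluster domain, which may differ in your cluster.

```yaml
# Hypothetical headless Service for the AKS deployment (not in the original manifests).
apiVersion: v1
kind: Service
metadata:
  name: emqx-headless
  namespace: emqx-test
spec:
  clusterIP: None          # headless: DNS resolves straight to the pod, no virtual IP
  selector:
    app: emqx
  ports:
    - name: mqtt
      port: 1883
      targetPort: 1883
# With `serviceName: emqx-headless` on the StatefulSet named `emqx`,
# the first pod is `emqx-0` and keeps this name across restarts:
#   emqx-0.emqx-headless.emqx-test.svc.cluster.local
```

The pod's IP can still change on every restart; only the name stays the same.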

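But why do you see different behavior on minikube than on Azure? Pod IP addresses are assigned by the CNI. On minikube the default CNI is the Docker bridge, and on Azure it is Azure CNI, so it is up to the CNI which address a pod gets. On minikube you simply happened to get the same address back each time.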

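It is best to always assume that you cannot rely on pod IP addresses staying static. Use DNS names to communicate with StatefulSet pods, as well as with other pods and services, and never hard-code pod IP addresses.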
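Applied to EMQX, this means the node name (and therefore the Mnesia directory name) should be built from the pod's DNS name rather than its IP. The fragment below is a sketch under assumptions, not a verified configuration. It assumes a headless Service named `emqx-headless` governs the StatefulSet, as in the local manifest. It also assumes the image's entrypoint accepts `hostname` as a value for `EMQX_CLUSTER__K8S__ADDRESS_TYPE` and reads `EMQX_CLUSTER__K8S__SUFFIX`. Check both against the documentation for the image version you run (4.2.7 in the question) before relying on them.

```yaml
# Fragment of the StatefulSet container spec; only the changed variables are shown.
env:
  - name: EMQX_NAME
    value: emqx
  - name: EMQX_CLUSTER__DISCOVERY
    value: k8s
  - name: EMQX_CLUSTER__K8S__NAMESPACE
    value: emqx-test
  - name: EMQX_CLUSTER__K8S__SERVICE_NAME
    value: emqx-headless              # must match the headless Service name
  - name: EMQX_CLUSTER__K8S__ADDRESS_TYPE
    value: hostname                   # assumption: supported by your image version
  - name: EMQX_CLUSTER__K8S__SUFFIX
    value: svc.cluster.local          # assumption: default cluster domain
```

If those assumptions hold, the data directory would be named after the pod's stable hostname (something like `emqx@emqx-0.emqx-headless.emqx-test.svc.cluster.local`) instead of `emqx@<pod IP>`, so a restarted pod would find the data it wrote before.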