Minio OOM (Out Of Memory) when uploading a file

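I have a local Kubernetes cluster inside Minikube on a Mac. I deployed a Minio standalone server as a single container with resource limits specified. When I upload a file that is larger than the container memory limit, the container is terminated with reason OOMKilled. On Ubuntu, with the same setup, the file is uploaded with no errors.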

Minikube runs in VirtualBox and is configured to use 4 GB of memory. I also use Heapster and Metrics Server to check memory usage over time.

$ minikube config set memory 4096
$ minikube addons enable heapster
$ minikube addons enable metrics-server
$ minikube start
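
Besides the Heapster graphs, per-container memory can also be read from the command line once metrics-server is collecting data. The following is only a minimal sketch: the function name `pod_memory` is hypothetical, and it does nothing more than wrap `kubectl top pod --containers`.

```shell
# Hypothetical helper: print CPU and memory usage per container.
# Assumes the metrics-server addon is enabled and has gathered some samples.
pod_memory() {
  kubectl top pod --containers "$@"
}
```

Calling `pod_memory` with no arguments lists the pods in the current namespace; extra arguments such as a label selector (`-l key=value`) are passed through to `kubectl top`.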

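I use a slightly modified version of the Kubernetes configuration for Minio standalone setup provided in the Minio documentation. I create a PV and PVC for storage, a Deployment, and a Service for the Minio server. Container configuration: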

Resource limits are set so that the pod gets the Guaranteed QoS class. The container is limited to 256Mi of memory and 0.5 CPU.

resources:
  requests:
    cpu: '500m'
    memory: '256Mi'
  limits:
    cpu: '500m'
    memory: '256Mi'

Full videos-storage.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: videos-storage-pv
  labels:
    software: minio
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/storage-videos/

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: videos-storage-pv-claim
spec:
  storageClassName: ''
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  selector:
    matchLabels:
      software: minio

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: videos-storage-deployment
spec:
  selector:
    matchLabels:
      app: videos-storage
  template:
    metadata:
      labels:
        app: videos-storage
    spec:
      initContainers:
      - name: init-minio-buckets
        image: minio/minio
        volumeMounts:
        - name: data
          mountPath: /data/storage-videos
        command: ['mkdir', '-p', '/data/storage-videos/videos']
      containers:
      - name: minio
        image: minio/minio
        volumeMounts:
        - name: data
          mountPath: /data/storage-videos
        args:
        - server
        - /data/storage-videos
        env:
        - name: MINIO_ACCESS_KEY
          value: 'minio'
        - name: MINIO_SECRET_KEY
          value: 'minio123'
        ports:
        - containerPort: 9000
        resources:
          requests:
            cpu: '500m'
            memory: '256Mi'
          limits:
            cpu: '500m'
            memory: '256Mi'
        readinessProbe:
          httpGet:
            path: /minio/health/ready
            port: 9000
          initialDelaySeconds: 5
          periodSeconds: 20
        livenessProbe:
          httpGet:
            path: /minio/health/live
            port: 9000
          initialDelaySeconds: 5
          periodSeconds: 20
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: videos-storage-pv-claim

---

apiVersion: v1
kind: Service
metadata:
  name: videos-storage-service
spec:
  type: LoadBalancer
  ports:
    - port: 9000
      targetPort: 9000
      protocol: TCP
  selector:
    app: videos-storage

I deploy the configuration to the cluster:

$ kubectl apply -f videos-storage.yaml

Then I access the Minio dashboard through Minikube. The following command opens it in a browser:

$ minikube service videos-storage-service

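Then I select the videos bucket and upload a 1 GB file. After about 250 MB has been uploaded, I get an error in the Minio dashboard. Playing with the limits and analyzing the Heapster graphs, I can see a correlation between file size and memory usage. The container uses exactly as much memory as the size of the file, which seems strange to me. My impression was that a file is not held entirely in memory while it is being uploaded.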
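
As a quick sanity check on the numbers (not an explanation of the cause), the 256Mi limit can be converted to bytes to see that it sits right around the point where the upload fails. This is a small sketch; the helper name `mi_to_bytes` is made up for illustration.

```shell
# Hypothetical helper: convert a Kubernetes "Mi" quantity to bytes
# (1 Mi = 1024 * 1024 bytes).
mi_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

limit_bytes=$(mi_to_bytes 256)
echo "limit: ${limit_bytes} bytes"                 # prints "limit: 268435456 bytes"
echo "limit: $(( limit_bytes / 1000000 )) MB"      # prints "limit: 268 MB"
```

An upload dying after roughly 250 MB is consistent with memory usage growing with the uploaded data until it reaches that limit.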

Describe pod:

Name:           videos-storage-deployment-6cd94b697-p4v8n
Namespace:      default
Priority:       0
Node:           minikube/10.0.2.15
Start Time:     Mon, 22 Jul 2019 11:05:53 +0300
Labels:         app=videos-storage
                pod-template-hash=6cd94b697
Annotations:    <none>
Status:         Running
IP:             172.17.0.8
Controlled By:  ReplicaSet/videos-storage-deployment-6cd94b697
Init Containers:
  init-minio-buckets:
    Container ID:  docker://09d75629a39ad1dc0dbdd6fc0a6a6b7970285d0a349bccee2b0ed2851738a9c1
    Image:         minio/minio
    Image ID:      docker-pullable://minio/minio@sha256:456074355bc2148c0a95d9c18e1840bb86f57fa6eac83cc37fce0212a7dae080
    Port:          <none>
    Host Port:     <none>
    Command:
      mkdir
      -p
      /data/storage-videos/videos
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 22 Jul 2019 11:08:45 +0300
      Finished:     Mon, 22 Jul 2019 11:08:45 +0300
    Ready:          True
    Restart Count:  1
    Environment:    <none>
    Mounts:
      /data/storage-videos from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zgs9q (ro)
Containers:
  minio:
    Container ID:  docker://1706139f0cc7852119d245e3cfe31d967eb9e9537096a803e020ffcd3becdb14
    Image:         minio/minio
    Image ID:      docker-pullable://minio/minio@sha256:456074355bc2148c0a95d9c18e1840bb86f57fa6eac83cc37fce0212a7dae080
    Port:          9000/TCP
    Host Port:     0/TCP
    Args:
      server
      /data/storage-videos
    State:          Running
      Started:      Mon, 22 Jul 2019 11:08:48 +0300
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    0
      Started:      Mon, 22 Jul 2019 11:06:06 +0300
      Finished:     Mon, 22 Jul 2019 11:08:42 +0300
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     500m
      memory:  256Mi
    Requests:
      cpu:      500m
      memory:   256Mi
    Liveness:   http-get http://:9000/minio/health/live delay=5s timeout=1s period=20s #success=1 #failure=3
    Readiness:  http-get http://:9000/minio/health/ready delay=5s timeout=1s period=20s #success=1 #failure=3
    Environment:
      MINIO_ACCESS_KEY:  minio
      MINIO_SECRET_KEY:  minio123
    Mounts:
      /data/storage-videos from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zgs9q (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  videos-storage-pv-claim
    ReadOnly:   false
  default-token-zgs9q:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zgs9q
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
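
The `Last State` reason in the output above can also be read directly, which is handy when repeating the upload with different limits. This is a minimal sketch: the function name is hypothetical, and it assumes the pod carries the `app=videos-storage` label from the manifest and that the Minio container is the first entry in `containerStatuses`.

```shell
# Hypothetical helper: print why the first container of the first matching
# pod was last terminated (for example "OOMKilled").
# Assumes the label app=videos-storage from videos-storage.yaml.
last_termination_reason() {
  kubectl get pods -l app=videos-storage \
    -o jsonpath='{.items[0].status.containerStatuses[0].lastState.terminated.reason}'
}
```

If the container has never been restarted, the `lastState.terminated` field is absent and the command prints nothing.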

Minikube log in dmesg:

[  +3.529889] minio invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=-998
[  +0.000006] CPU: 1 PID: 8026 Comm: minio Tainted: G           O     4.15.0 #1
[  +0.000001] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  +0.000000] Call Trace:
[  +0.000055]  dump_stack+0x5c/0x82
[  +0.000010]  dump_header+0x66/0x281
[  +0.000006]  oom_kill_process+0x223/0x430
[  +0.000002]  out_of_memory+0x28d/0x490
[  +0.000003]  mem_cgroup_out_of_memory+0x36/0x50
[  +0.000004]  mem_cgroup_oom_synchronize+0x2d3/0x310
[  +0.000002]  ? get_mem_cgroup_from_mm+0x90/0x90
[  +0.000002]  pagefault_out_of_memory+0x1f/0x4f
[  +0.000002]  __do_page_fault+0x4a3/0x4b0
[  +0.000003]  ? page_fault+0x36/0x60
[  +0.000002]  page_fault+0x4c/0x60
[  +0.000002] RIP: 0033:0x427649
[  +0.000001] RSP: 002b:000000c0002eaae8 EFLAGS: 00010246
[  +0.000154] Memory cgroup out of memory: Kill process 7734 (pause) score 0 or sacrifice child
[  +0.000013] Killed process 7734 (pause) total-vm:1024kB, anon-rss:4kB, file-rss:0kB, shmem-rss:0kB

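Initially the problem occurred when I did not have any resource limits. When I tried to upload a big file, the Minio container would use all of the memory available on the node. Because no memory was left, Kubernetes services became unresponsive and other containers, such as the apiserver, started getting killed; the file was not uploaded either. After that I added resource limits to the Minio container and the cluster itself became stable, but the Minio container still died.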

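I would like Minio to run within the provided limits and not consume memory equal to the file size. I am not sure on which side the problem lies: Minio, Minikube, VirtualBox, Docker, or Kubernetes. I am also not familiar with how memory works in this setup. As I said, the same setup works fine on Ubuntu 18.04.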

Versions:

I also posted this issue to the Minio repository and got a response that 256 MB is too low. After increasing the memory to 512 MB, it works fine.
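
One way to apply that change without editing the manifest is `kubectl set resources`. The sketch below is an illustration only: the function name is hypothetical, the deployment and container names are taken from videos-storage.yaml, and the CPU values are kept at 500m as in the original configuration.

```shell
# Hypothetical helper: set the memory request and limit of the minio
# container to the same value, so requests stay equal to limits as in the
# original manifest. Defaults to 512Mi.
raise_minio_memory() {
  mem="${1:-512Mi}"
  kubectl set resources deployment videos-storage-deployment \
    --containers=minio \
    --requests="cpu=500m,memory=${mem}" \
    --limits="cpu=500m,memory=${mem}"
}
```

Running `raise_minio_memory` changes the pod template and therefore triggers a new rollout of the deployment. Changing both `memory` values in videos-storage.yaml and re-running `kubectl apply -f videos-storage.yaml` should have the same effect.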