How should I deploy a Persistent Volume (PV) for JupyterHub on Kubernetes?

Environment information:

Cluster: one master node and four worker nodes, all running CentOS Linux release 7.8.2003 (Core).
Kubernetes version: v1.18.0.
Zero to JupyterHub version: 0.9.0.
Helm version: v2.11.0

I recently tried to use Zero to JupyterHub to deploy an online coding environment (like Google Colab) on our new lab servers. Unfortunately, I failed to deploy a Persistent Volume (PV) for JupyterHub and got the following failure message:

Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  4s (x27 over 35m)  default-scheduler  running "VolumeBinding" filter plugin for pod "hub-7b9cbbcf59-747jl": pod has unbound immediate PersistentVolumeClaims

I followed the installation steps in the Zero to JupyterHub tutorial and installed JupyterHub on Kubernetes with Helm. The configuration file is shown below:

config.yaml

proxy:
  secretToken: "2fdeb3679d666277bdb1c93102a08f5b894774ba796e60af7957cb5677f40706"
singleuser:
  storage:
    dynamic:
      storageClass: local-storage

Here I configured local-storage for JupyterHub, based on the Kubernetes local-storage documentation: Link. Its YAML file looks like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Then I checked whether it worked with kubectl get storageclass and got the following output:

NAME            PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  64m

So I thought I had deployed storage for JupyterHub, but I was too naive. This is very frustrating, because my other (JupyterHub) pods are all running. I have been searching for a solution, but still without success.

So now, my questions are:

  1. What is the proper way to solve the PV problem? (Preferably using local storage.)

  2. Will the local-storage approach use the disks of the other nodes, not just the master's?

  3. In fact, my lab has a cloud storage service, so if the answer to Q2 is no, how can I use my lab's cloud storage service to deploy the PV?


@Arghya Sadhu's solution solved the problem above. But now I have a new problem: the pod with the hub-db-dir volume is also stuck in Pending, which leaves my service proxy-public pending as well.

The description of the hub-db-dir pod is as follows:

Name:           hub-7b9cbbcf59-jv49z
Namespace:      jhub
Priority:       0
Node:           <none>
Labels:         app=jupyterhub
                component=hub
                hub.jupyter.org/network-access-proxy-api=true
                hub.jupyter.org/network-access-proxy-http=true
                hub.jupyter.org/network-access-singleuser=true
                pod-template-hash=7b9cbbcf59
                release=jhub
Annotations:    checksum/config-map: c20a64c7c9475201046ac620b057f0fa65ad6928744f7d265bc8705c959bce2e
                checksum/secret: 1beaebb110d06103988476ec8a3117eee58d97e7dbc70c115c20048ea04e79a4
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/hub-7b9cbbcf59
Containers:
  hub:
    Image:      jupyterhub/k8s-hub:0.9.0
    Port:       8081/TCP
    Host Port:  0/TCP
    Command:
      jupyterhub
      --config
      /etc/jupyterhub/jupyterhub_config.py
      --upgrade-db
    Requests:
      cpu:      200m
      memory:   512Mi
    Readiness:  http-get http://:hub/hub/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      PYTHONUNBUFFERED:        1
      HELM_RELEASE_NAME:       jhub
      POD_NAMESPACE:           jhub (v1:metadata.namespace)
      CONFIGPROXY_AUTH_TOKEN:  <set to the key 'proxy.token' in secret 'hub-secret'>  Optional: false
    Mounts:
      /etc/jupyterhub/config/ from config (rw)
      /etc/jupyterhub/cull_idle_servers.py from config (rw,path="cull_idle_servers.py")
      /etc/jupyterhub/jupyterhub_config.py from config (rw,path="jupyterhub_config.py")
      /etc/jupyterhub/secret/ from secret (rw)
      /etc/jupyterhub/z2jh.py from config (rw,path="z2jh.py")
      /srv/jupyterhub from hub-db-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hub-token-vlgwz (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hub-config
    Optional:  false
  secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub-secret
    Optional:    false
  hub-db-dir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  hub-db-dir
    ReadOnly:   false
  hub-token-vlgwz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub-token-vlgwz
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  61s (x43 over 56m)  default-scheduler  0/5 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 4 node(s) didn't find available persistent volumes to bind.

Output of kubectl get pv,pvc,sc:

NAME                               STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
persistentvolumeclaim/hub-db-dir   Pending                                      local-storage   162m

NAME                                                  PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/local-storage (default)   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  8h
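To see why the claim stays Pending, describing it is usually informative; with a WaitForFirstConsumer class and no matching PV, the Events section explains the binding failure:

```shell
# Inspect the pending claim in the jhub namespace;
# the Events section reports why binding has not happened yet
kubectl describe pvc hub-db-dir -n jhub
```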

So, how can this be solved?

  1. I think you need to make local-storage the default storage class:

kubectl patch storageclass local-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

  2. Local storage will use the local disk of the node the pod is scheduled on.

  3. It is hard to say without more details. You can either create PVs manually or use a storage class that does dynamic volume provisioning.
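The effect of the patch command in step 1 can be verified by reading the annotation back (the class also shows a "(default)" marker in kubectl get storageclass output):

```shell
# Check that local-storage now carries the default-class annotation
kubectl get storageclass local-storage \
  -o jsonpath='{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}'
```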

In addition to @Arghya Sadhu's answer, to make this work with local storage you have to create a PersistentVolume manually.

For example:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: hub-db-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: <path_to_local_volume>
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - <name_of_the_node>
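Assuming the manifest above is saved as hub-db-pv.yaml (a filename chosen here for illustration), the backing directory has to exist on the selected node before the PV can bind:

```shell
# On the node named in nodeAffinity, create the backing directory first
# (use the same placeholder path as in the manifest):
#   mkdir -p <path_to_local_volume>

# Then, from a machine with kubectl access, create the PV and check it
kubectl apply -f hub-db-pv.yaml
kubectl get pv hub-db-pv
```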

Then you can deploy the chart:

helm upgrade --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE  \
  --version=0.9.0 \
  --values config.yaml

The config.yaml file can stay the same:

proxy:
  secretToken: "<token>"
singleuser:
  storage:
    dynamic:
      storageClass: local-storage
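After the upgrade, a quick way to confirm everything is bound and scheduled (using the same $NAMESPACE as in the helm command) is:

```shell
# The PVC should report Bound and the hub pod should reach Running
kubectl get pvc hub-db-dir -n $NAMESPACE
kubectl get pods -n $NAMESPACE
```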