Velero backup failing for GKE cluster

I am trying to create a backup of a GKE cluster using Velero. Velero installed successfully on the GKE cluster, as shown below:

   $ kubectl get deployment/velero --namespace velero
   NAME     READY   UP-TO-DATE   AVAILABLE   AGE
   velero   1/1     1            1           43h 

   $ kubectl get pods --namespace velero
   NAME                      READY   STATUS    RESTARTS    AGE
   velero-847c69f497-hwv6l   1/1     Running     0          43h  

I ran the following command to start the backup:

  $ velero backup create cluster1-backup --include-namespaces default --snapshot-volumes
  Backup request "cluster1-backup" submitted successfully.
  Run `velero backup describe cluster1-backup` or `velero backup logs cluster1-backup` for more details.

The backup appears to have failed:

  $ velero backup describe cluster1-backup
   Name:         cluster1-backup
   Namespace:    velero
   Labels:       velero.io/storage-location=default
   Annotations:  velero.io/source-cluster-k8s-gitversion=v1.15.12-gke.20
   velero.io/source-cluster-k8s-major-version=1
   velero.io/source-cluster-k8s-minor-version=15+

   Phase:  Failed (run `velero backup logs cluster1-backup` for more information)

   Errors:    0
   Warnings:  0

   Namespaces:
   Included:  default
   Excluded:  <none>

   Resources:
   Included:        *
   Excluded:        <none>
   Cluster-scoped:  auto
   Label selector:  <none>
   Storage Location:  default
   Velero-Native Snapshot PVs:  true
   TTL:  720h0m0s
   Hooks:  <none>
   Backup Format Version:  1.1.0

   Started:    2020-10-05 09:57:12 +0000 UTC
   Completed:  <n/a>

   Expiration:  2020-11-04 09:57:12 +0000 UTC
   Velero-Native Snapshots: <none included>

  $ velero get backups
  NAME              STATUS   ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
  cluster1-backup   Failed   0        0          2020-10-05 09:57:12 +0000 UTC   29d       default            <none>

The backup logs show the following:

  $ velero backup logs cluster1-backup
  An error occurred: timed out waiting for download URL

I am using a public GKE cluster on a Shared VPC with Master Authorized Networks enabled (35.235.240.0/20). Any suggestions for resolving this?

The issue is now resolved.

I found the following error in the Velero deployment logs:

  $ kubectl logs deployment/velero -n velero
  time="2020-10-05T13:41:19Z" level=error msg="Error getting backup store for this location" backupLocation=default controller=backup-sync error="backup storage location's bucket name \"gs://bucketname/\" must not contain a '/' (if using a prefix, put it in the 'Prefix' field instead)" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:110" error.function=github.com/vmware-tanzu/velero/pkg/persistence.NewObjectBackupStore logSource="pkg/controller/backup_sync_controller.go:168"
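Because `velero backup logs` only timed out, the server-side deployment logs were the place where the real cause showed up. A minimal sketch for narrowing those logs down to error-level lines, assuming the logfmt-style `level=error` field seen in the line above; the helper name `velero_errors` is hypothetical and not part of Velero:

```shell
# Hypothetical helper (not a Velero command): keep only error-level
# log lines from whatever is piped in on stdin.
velero_errors() {
  grep 'level=error'
}

# Intended usage against a live cluster (left commented out so the
# snippet has no cluster dependency):
#   kubectl logs deployment/velero -n velero | velero_errors
```

Piping the deployment logs through the filter should leave only lines such as the "Error getting backup store for this location" message.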

The bucket name had a trailing '/' when I created the environment variable.

It also appears that the environment variable should not include the "gs://" prefix:

            BUCKET=bucketname
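If the value comes from somewhere that already has the URL form, it can be reduced to the bare name before use. This is only a sketch using POSIX parameter expansion; `normalize_bucket` is a hypothetical helper name, and anything after the first '/' is discarded (a path prefix would belong in the 'Prefix' field mentioned in the error message, not in the bucket name):

```shell
# Hypothetical helper: reduce a gs:// URL or bucket path to the bare bucket name.
normalize_bucket() {
  name="$1"
  name="${name#gs://}"   # drop a leading gs:// scheme, if present
  name="${name%%/*}"     # drop the first '/' and everything after it
  printf '%s\n' "$name"
}

BUCKET="$(normalize_bucket "gs://bucketname/")"
echo "$BUCKET"
```

With the input above, `BUCKET` ends up as `bucketname`.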

If the bucket does not exist, create it as follows:

  gsutil mb gs://$BUCKET/

When installing the Velero server, do not put gs:// in front of the bucket name in the velero install command, as shown below:

 velero install --provider gcp --plugins velero/velero-plugin-for-gcp:v1.1.0 --bucket $BUCKET  --secret-file ./credentials-velero
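To catch this mistake before running the install, a small guard can reject a value that Velero would refuse later. This is a sketch that mirrors only the rule quoted in the error message (no '/' in the bucket name); `check_bucket` is a hypothetical helper, not a Velero feature:

```shell
# Hypothetical pre-flight check: fail if the bucket name is empty
# or contains a '/', which the Velero error above complains about.
check_bucket() {
  case "$1" in
    "")  echo "bucket name is empty" >&2; return 1 ;;
    */*) echo "bucket name must not contain '/': $1" >&2; return 1 ;;
  esac
}

check_bucket "bucketname" && echo "ok to install"
```

In practice it could be chained in front of the install, for example `check_bucket "$BUCKET" && velero install ...` with the same arguments as above.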


  $ velero backup describe backup-test-ns
  Name:         backup-test-ns
  Namespace:    velero
  Labels:       <none>
  Annotations:  <none>

  Phase:  New

  Errors:    0
  Warnings:  0

  Namespaces:
    Included:  backup-test
    Excluded:  <none>

  Resources:
    Included:        *
    Excluded:        <none>
    Cluster-scoped:  auto

  Label selector:  <none>

  Storage Location:

  Velero-Native Snapshot PVs:  auto

  TTL:  720h0m0s

  Hooks:  <none>

  Backup Format Version:

  Started:    <n/a>
  Completed:  <n/a>

  Expiration:  <nil>

  Velero-Native Snapshots: <none included>

Before attempting a fresh installation, you may need to remove the existing Velero installation. To uninstall Velero, use the following command:

      kubectl delete namespace velero