9

利用Velero对K8S备份还原与集群迁移实战 - 帝都攻城狮

 1 year ago
source link: https://www.cnblogs.com/wubolive/p/17345716.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

利用Velero对K8S备份还原与集群迁移实战

Velero 是一款云原生时代的灾难恢复和迁移工具,采用 Go 语言编写,并在 github 上进行了开源,利用 velero 用户可以安全的备份、恢复和迁移 Kubernetes 集群资源和持久卷。

1.1 支持的版本列表

1686997-20230423100402231-977815386.png

1.2 Velero组件

Velero 组件一共分两部分,分别是服务端和客户端。

  • 服务端:运行在你 Kubernetes 的集群中
  • 客户端:是一些运行在本地的命令行的工具,需要已配置好 kubectl 及集群 kubeconfig 的机器上

1.3 velero备份流程

  1. velero客户端调用kubernetes API Server创建backup任务
  2. Backup控制器基于watch机制通过Api Server获取到备份任务
  3. Backup控制器开始执行备份动作,会通过请求Api Server获取到需要备份的数据
  4. Backup 控制器将获取到的数据备份到指定的对象存储server端
1686997-20230423101114648-1506721209.png

1.4 Velero后端存储

Velero支持两种关于后端存储的CRD,分别是BackupStorageLocationVolumeSnapshotLocation

1.4.1 BackupStorageLocation

主要用来定义 Kubernetes 集群资源的数据存放位置,也就是集群对象数据,不是 PVC 的数据。主要支持的后端存储是 S3 兼容的存储,比如:Mino 和阿里云 OSS 等。

1.4.2 VolumeSnapshotLocation

主要用来给 PV 做快照,需要云提供商提供插件。阿里云已经提供了插件,这个需要使用 CSI 等存储机制。你也可以使用专门的备份工具 Restic,把 PV 数据备份到阿里云 OSS 中去(安装时需要自定义选项)。

Restic 是一款 GO 语言开发的数据加密备份工具,顾名思义,可以将本地数据加密后传输到指定的仓库。支持的仓库有 Local、SFTP、Aws S3、Minio、OpenStack Swift、Backblaze B2、Azure BS、Google Cloud storage、Rest Server。

二、准备存储插件

Velero支持很多种存储插件,可查看:https://velero.io/docs/v1.10/supported-providers/获取插件信息,我们这里使用minio作为S3兼容的对象存储提供程序。您也可以在任意地方部署Minio对象存储,只需要保证K8S集群可以访问到即可。

2.1 准备minio清单

---apiVersion: v1kind: Servicemetadata: name: minio namespace: velero labels: app: miniospec: selector: app: minio ports: - name: api port: 9000 protocol: TCP - name: console port: 9001 protocol: TCP---apiVersion: apps/v1kind: StatefulSetmetadata: name: minio namespace: velero labels: app: miniospec: replicas: 1 serviceName: minio selector: matchLabels: app: minio template: metadata: labels: app: minio spec: containers: - name: minio image: docker.io/bitnami/minio:2023.3.22 imagePullPolicy: IfNotPresent ports: - containerPort: 9000 name: api protocol: TCP - containerPort: 9001 name: console protocol: TCP env: - name: MINIO_ROOT_USER value: "minio" - name: MINIO_ROOT_PASSWORD value: "minio123" - name: MINIO_DEFAULT_BUCKETS value: "velero" volumeMounts: - name: data mountPath: /data volumeClaimTemplates: - metadata: name: data spec: accessModes: [ "ReadWriteOnce" ] resources: requests: storage: 50Gi

2.2 创建minio应用

# 创建velero命名空间$ kubectl create namespace velero# 创建minio资源$ kubectl apply -f minio.yaml # 查看部署状态$ kubectl get sts,pod,svc -n veleroNAME READY AGEstatefulset.apps/minio 1/1 66s NAME READY STATUS RESTARTS AGEpod/minio-0 1/1 Running 0 65s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGEservice/minio ClusterIP 10.99.223.162 <none> 9000/TCP,9001/TCP 66s # 开放NodePort端口$ kubectl patch svc minio -n velero -p '{"spec": {"type": "NodePort"}}'$ kubectl patch svc minio -n velero --type='json' -p='[{"op": "replace", "path": "/spec/ports/0/nodePort", "value":9000},{"op": "replace", "path": "/spec/ports/1/nodePort", "value":9001}]' $ kubectl get svc -n veleroNAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGEminio NodePort 10.99.223.162 <none> 9000:9000/TCP,9001:9001/TCP 140m

通过浏览器访问服务器IP:9001,并使用账号minio密码minio123登入验证。

1686997-20230423101159036-1960834087.png

三、安装velero v1.10.2

版本列表:https://github.com/vmware-tanzu/velero/releases

3.1 安装velero命令程序

$ wget https://github.com/vmware-tanzu/velero/releases/download/v1.10.2/velero-v1.10.2-linux-amd64.tar.gz$ tar zxf velero-v1.10.2-linux-amd64.tar.gz $ mv velero-v1.10.2-linux-amd64/velero /usr/bin/$ velero -h# 启用命令补全$ source <(velero completion bash)$ velero completion bash > /etc/bash_completion.d/velero

3.2 创建密钥

$ cat > credentials-velero <<EOF[default]aws_access_key_id = minioaws_secret_access_key = minio123EOF

3.3 安装velero到k8s集群

$ velero install \ --provider aws \ --image velero/velero:v1.10.2 \ --plugins velero/velero-plugin-for-aws:v1.6.0 \ --bucket velero \ --secret-file ./credentials-velero \ --use-node-agent \ --use-volume-snapshots=false \ --namespace velero \ --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio:9000 \ --wait# 执行install命令后会创建一系列清单,包括CustomResourceDefinition、Namespace、Deployment等。 # 可使用如下命令查看运行日志$ kubectl logs deployment/velero -n velero # 查看velero创建的api对象$ kubectl api-versions | grep velerovelero.io/v1 # 查看备份位置$ velero backup-location getNAME PROVIDER BUCKET/PREFIX PHASE LAST VALIDATED ACCESS MODE DEFAULTdefault aws velero Available 2023-03-28 15:45:30 +0800 CST ReadWrite true

选项说明:

  • --kubeconfig(可选):指定kubeconfig认证文件,默认使用.kube/config
  • --provider:定义插件提供方;
  • --image:定义运行velero的镜像,默认与velero客户端一致;
  • --plugins:指定使用aws s3兼容的插件镜像;
  • --bucket:指定对象存储Bucket桶名称;
  • --secret-file:指定对象存储认证文件;
  • --use-node-agent:创建Velero Node Agent守护进程,托管FSB模块;
  • --use-volume-snapshots:是否启使用快照;
  • --namespace:指定部署的namespace名称,默认为velero;
  • --backup-location-config:指定对象存储地址信息;

aws插件与velero版本对应关系:

1686997-20230423101211903-287604028.png

3.4 卸载velero

如果您想从集群中完全卸载Velero,则以下命令将删除由velero install创建的所有资源:

kubectl delete namespace/velero clusterrolebinding/velerokubectl delete crds -l component=velero

四、备份与恢复

备份命令:velero create backup NAME [flags]
backup选项:

  • --exclude-namespaces stringArray : 要从备份中排除的名称空间
  • --exclude-resources stringArray: 要从备份中排除的资源,如storageclasses.storage.k8s.io
  • --include-cluster-resources optionalBool[=true]: 包含集群资源类型
  • --include-namespaces stringArray: 要包含在备份中的名称空间(默认'*')
  • --include-resources stringArray: 备份中要包括的资源
  • --labels mapStringString: 给这个备份加上标签
  • -o, --output string: 指定输出格式,支持'table'、'json'和'yaml';
  • -l, --selector labelSelector: 对指定标签的资源进行备份
  • --snapshot-volumes optionalBool[=true]: 对 PV 创建快照
  • --storage-location string: 指定备份的位置
  • --ttl duration: 备份数据多久删掉
  • --volume-snapshot-locations strings: 指定快照的位置,也就是哪一个公有云驱动

4.1 备份

4.1.1 使用官方案例创建测试应用
$ kubectl apply -f examples/nginx-app/base.yaml namespace/nginx-example createddeployment.apps/nginx-deployment createdservice/my-nginx created # 查看资源清单$ kubectl get all -n nginx-exampleNAME READY STATUS RESTARTS AGEpod/nginx-deployment-57d5dcb68-g42mk 1/1 Running 0 41spod/nginx-deployment-57d5dcb68-pcc6t 1/1 Running 0 41s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGEservice/my-nginx LoadBalancer 10.96.0.31 <pending> 80:27370/TCP 41s NAME READY UP-TO-DATE AVAILABLE AGEdeployment.apps/nginx-deployment 2/2 2 2 41s NAME DESIRED CURRENT READY AGEreplicaset.apps/nginx-deployment-57d5dcb68 2 2 2 41s
4.1.2 备份测试应用
$ velero backup create nginx-backup --include-namespaces nginx-exampleBackup request "nginx-backup" submitted successfully.Run `velero backup describe nginx-backup` or `velero backup logs nginx-backup` for more details.
  • --include-namespaces:指定命名空间
  • --selector:标签选择器,如app=nginx
4.1.3 查看备份列表
$ velero backup getNAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTORnginx-backup Completed 0 0 2023-03-09 09:38:50 +0800 CST 29d default <none> # 查看备份详细信息$ velero backup describe nginx-backup # 查看备份日志$ velero backup logs nginx-backup

登入minio控制台查看备份内容

1686997-20230423101229069-113765014.png
4.1.4 定时备份指南
# 使用cron表达式备份$ velero schedule create nginx-daily --schedule="0 1 * * *" --include-namespaces nginx-example # 使用一些非标准的速记 cron 表达式$ velero schedule create nginx-daily --schedule="@daily" --include-namespaces nginx-example # 手动触发定时任务$ velero backup create --from-schedule nginx-daily

更多cron示例请参考:cron package’s documentation

4.2 恢复

4.2.1 模拟灾难
# 删除nginx-example命名空间和资源$ kubectl delete namespace nginx-example# 检查是否删除$ kubectl get all -n nginx-exampleNo resources found in nginx-example namespace.
4.2.2 恢复资源
$ velero backup getNAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTORnginx-backup Completed 0 0 2023-03-09 09:38:50 +0800 CST 29d default <none>$ velero restore create --from-backup nginx-backupRestore request "nginx-backup-20230309095025" submitted successfully.Run `velero restore describe nginx-backup-20230309095025` or `velero restore logs nginx-backup-20230309095025` for more details.
4.2.3 检查恢复的资源
$ velero restore getNAME BACKUP STATUS STARTED COMPLETED ERRORS WARNINGS CREATED SELECTORnginx-backup-20230309095025 nginx-backup Completed 2023-03-09 09:50:25 +0800 CST 2023-03-09 09:50:25 +0800 CST 0 1 2023-03-09 09:50:25 +0800 CST <none> # 查看详细信息$ velero restore describe nginx-backup-20230309095025 # 检查资源状态$ kubectl get all -n nginx-exampleNAME READY STATUS RESTARTS AGEpod/nginx-deployment-57d5dcb68-g42mk 1/1 Running 0 2m19spod/nginx-deployment-57d5dcb68-pcc6t 1/1 Running 0 2m19s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGEservice/my-nginx LoadBalancer 10.96.0.204 <pending> 80:31291/TCP 2m19s NAME READY UP-TO-DATE AVAILABLE AGEdeployment.apps/nginx-deployment 2/2 2 2 2m19s NAME DESIRED CURRENT READY AGEreplicaset.apps/nginx-deployment-57d5dcb68 2 2 2 2m19s

五、项目迁移实战

5.1、项目介绍

我们将使用Velero快速完成云原生应用PV数据的迁移实践过程,在本文示例中,我们将A集群中的一个MOS应用迁移到集群B中,其中数据备份采用自建Minio对象存储服务。

1686997-20230423100904685-155121710.png
5.1.1 环境要求
  1. 迁移项目最好保证两个Kubernetes集群版本一致。
  2. 为了保证PV数据成功迁移,两边需要安装好相同名字的StorageClass
  3. 可以自己部署Minio,也可以使用公有云的对象存储服务,如华为的OBS、阿里的OSS等。
  4. 本案例将集群A中app-system命名空间中的服务及PV数据迁移到集群B中。
5.1.2 项目环境
角色 集群IP 集群版本 部署软件
K8S 集群A 192.168.1.102 v1.22.10 openebs、velero、app-system
K8S 集群B 192.168.1.103 v1.22.10 openebs、velero、minio
5.1.3 项目说明

我们需要将集群A中 app-system 空间的所有资源和数据全部迁移到集群B中,该项目包括了deploymentstatefulsetserviceingressjobcronjobsecretconfigmappvpvc

# 项目清单信息$ kubectl get deployment,sts,pvc -n app-systemNAME READY UP-TO-DATE AVAILABLE AGEdeployment.apps/hasura-graphql 1/1 1 1 5h27mdeployment.apps/iot-backend 1/1 1 1 5h27mdeployment.apps/iot-gateway 1/1 1 1 5h27mdeployment.apps/iot-history 1/1 1 1 5h27mdeployment.apps/iot-observer 1/1 1 1 5h27mdeployment.apps/app-backend 1/1 1 1 5h27mdeployment.apps/app-frontend 1/1 1 1 5h27m NAME READY AGEstatefulset.apps/minio 1/1 5h27mstatefulset.apps/mongo 1/1 5h27mstatefulset.apps/postgres 1/1 5h27mstatefulset.apps/rabbitmq 1/1 5h27mstatefulset.apps/redis 1/1 5h27m NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGEpersistentvolumeclaim/data-minio-0 Bound pvc-950d15a8-20a5-4e5f-8dbf-b904295355bb 50Gi RWO openebs-hostpath 5h27mpersistentvolumeclaim/data-mongo-0 Bound pvc-e435b80e-0370-4100-b223-ca841f24bd5d 50Gi RWO openebs-hostpath 5h27mpersistentvolumeclaim/data-postgres-0 Bound pvc-359ec32a-4bfc-4bc8-8cf3-38322e8ef59b 300Gi RWO openebs-hostpath 5h27mpersistentvolumeclaim/data-redis-0 Bound pvc-da718e0c-992c-4f6e-af44-abb1c7214a9e 2Gi RWO openebs-hostpath 5h27mpersistentvolumeclaim/app-backend Bound pvc-506261a9-6be4-4d95-8807-58201e31a527 10Gi RWO openebs-hostpath 5h27m

5.2 准备对象存储

按照2.12.2的方法在集群B(192.168.1.103)中创建minio应用,用来存放备份数据。

$ kubectl get sts,svc -n veleroNAME READY AGEstatefulset.apps/minio 1/1 3h44m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGEservice/minio NodePort 10.99.223.162 <none> 9000:9000/TCP,9001:9001/TCP 3h44m

5.3 安装velero

请确保在集群A集群B中已经安装好velero客户端,请参考3.1 安装velero命令程序

5.3.1 在集群A中安装velero服务
$ cat > credentials-velero <<EOF[default]aws_access_key_id = minioaws_secret_access_key = minio123EOF $ velero install \ --provider aws \ --image velero/velero:v1.10.2 \ --plugins velero/velero-plugin-for-aws:v1.6.0 \ --bucket velero \ --secret-file ./credentials-velero \ --use-node-agent \ --use-volume-snapshots=false \ --namespace velero \ --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://192.168.1.103:9000 \ --wait

注意:其中S3的地址指向集群B(192.168.1.103)的minio对象存储。

5.3.2 在集群B种安装velero服务
$ cat > credentials-velero <<EOF[default]aws_access_key_id = minioaws_secret_access_key = minio123EOF $ velero install \ --provider aws \ --image velero/velero:v1.10.2 \ --plugins velero/velero-plugin-for-aws:v1.6.0 \ --bucket velero \ --secret-file ./credentials-velero \ --use-node-agent \ --use-volume-snapshots=false \ --namespace velero \ --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio:9000 \ --wait

注意:其中S3的地址指向本集群minio对象存储的svc地址。

5.4 备份MOS项目

$ velero backup create app-backup \ --default-volumes-to-fs-backup \ --include-namespaces app-systemBackup request "app-backup" submitted successfully.Run `velero backup describe app-backup` or `velero backup logs app-backup` for more details. # 查看备份状态$ velero backup getNAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTORapp-backup Completed 0 0 2023-03-28 16:34:56 +0800 CST 29d default <none>
  • --default-volumes-to-fs-backup:默认将所有PV卷进行备份,详情查看官方文档
  • --include-namespaces:指定要备份的命名空间

登入minio控制台上可以看到备份的文件:

1686997-20230423101253111-54562663.png

5.5 恢复到集群B

# 到集群B中查看备份资源$ velero backup getNAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTORapp-backup Completed 0 0 2023-03-28 16:41:55 +0800 CST 29d default <none> # 执行恢复命令$ velero restore create --from-backup app-backupRestore request "app-backup-20230328164601" submitted successfully.Run `velero restore describe app-backup-20230328164601` or `velero restore logs app-backup-20230328164601` for more details. # 查看恢复任务$ velero restore get NAME BACKUP STATUS STARTED COMPLETED ERRORS WARNINGS CREATED SELECTORapp-backup-20230328164601 app-backup Completed 2023-03-28 16:46:01 +0800 CST 2023-03-28 17:01:26 +0800 CST 0 30 2023-03-28 16:46:01 +0800 CST <none>

5.6 验证服务和数据

$ kubectl get deploy,sts,svc,pvc -n app-systemNAME READY UP-TO-DATE AVAILABLE AGEdeployment.apps/hasura-graphql 1/1 1 1 29mdeployment.apps/iot-backend 1/1 1 1 29mdeployment.apps/iot-gateway 1/1 1 1 29mdeployment.apps/iot-history 1/1 1 1 29mdeployment.apps/iot-observer 1/1 1 1 29mdeployment.apps/app-backend 1/1 1 1 29mdeployment.apps/app-frontend 1/1 1 1 29m NAME READY AGEstatefulset.apps/minio 1/1 29mstatefulset.apps/mongo 1/1 29mstatefulset.apps/postgres 1/1 29mstatefulset.apps/rabbitmq 1/1 29mstatefulset.apps/redis 1/1 29m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGEservice/hasura-graphql ClusterIP 10.107.231.148 <none> 8080/TCP 29mservice/iot-backend ClusterIP 10.98.78.23 <none> 3000/TCP 29mservice/iot-gateway ClusterIP 10.108.211.114 <none> 1880/TCP 29mservice/iot-history ClusterIP 10.98.217.234 <none> 3000/TCP 29mservice/iot-observer ClusterIP 10.105.75.200 <none> 3010/TCP 29mservice/minio ClusterIP 10.97.14.151 <none> 9000/TCP,9001/TCP 29mservice/mongo ClusterIP 10.97.212.84 <none> 27017/TCP 29mservice/app-backend ClusterIP 10.107.16.116 <none> 5959/TCP 29mservice/app-frontend ClusterIP 10.100.136.90 <none> 80/TCP,443/TCP 29mservice/postgres ClusterIP 10.101.235.245 <none> 5432/TCP,9187/TCP 29mservice/rabbitmq ClusterIP 10.108.29.67 <none> 15672/TCP,5672/TCP 29mservice/redis ClusterIP 10.101.105.81 <none> 6379/TCP,9121/TCP 29m NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGEpersistentvolumeclaim/data-minio-0 Bound pvc-ca5c63bf-9aaa-4755-ad8f-35718f51decf 50Gi RWO openebs-hostpath 29mpersistentvolumeclaim/data-mongo-0 Bound pvc-03203801-e339-44ee-bfb4-b196808c7cc5 50Gi RWO openebs-hostpath 29mpersistentvolumeclaim/data-postgres-0 Bound pvc-e2f2593e-5869-420c-bd39-54ce01dfa63f 300Gi RWO openebs-hostpath 29mpersistentvolumeclaim/data-redis-0 Bound pvc-a8e4445c-c6b5-483b-8b18-9d650daf35cc 2Gi RWO openebs-hostpath 29mpersistentvolumeclaim/app-backend Bound pvc-2b2f9747-a020-4a16-9975-2491457c4032 10Gi RWO openebs-hostpath 29m

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK