Refreshing Test/Dev Environments With Prod Data Using Percona Backup for MongoDB
source link: https://www.percona.com/blog/2021/05/19/refreshing-test-dev-environments-with-prod-data-using-percona-backup-for-mongodb/
This is a straightforward article intended to show how easy it is to refresh your Test/Dev environments with PROD data using Percona Backup for MongoDB (PBM). It covers every step from the PBM configuration through the restore, assuming that the PBM agents are up and running on all replica set members of both the PROD and the Dev/Test servers.
Taking the Backup on PROD
This step is quite simple and takes no more than two commands:
1. Configuring the Backup
Two things are important to note: I will send my backups to an S3 bucket, and I am defining a prefix. When a prefix is defined in the PBM storage configuration, a subdirectory is created automatically and the backup files are stored in that subdirectory instead of the root of the S3 bucket.
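As a minimal sketch of this configuration step, the helpers below write a PBM storage file for S3 with a prefix and then apply it with pbm config --file. The bucket name, region, file name, and helper names are hypothetical placeholders, and the credentials are assumed to come from the standard AWS environment variables; the pbm CLI is assumed to reach the replica set through PBM_MONGODB_URI.

```shell
# Hypothetical values; replace with your own bucket and region.
PBM_BUCKET="my-pbm-backups"
PBM_REGION="us-east-1"

# Write a PBM storage config that targets s3://$PBM_BUCKET/<prefix>/.
# $1 = output file, $2 = prefix (for example PROD)
write_pbm_config() {
  cat > "$1" <<EOF
storage:
  type: s3
  s3:
    region: ${PBM_REGION}
    bucket: ${PBM_BUCKET}
    prefix: $2
    credentials:
      access-key-id: ${AWS_ACCESS_KEY_ID}
      secret-access-key: ${AWS_SECRET_ACCESS_KEY}
EOF
}

# Load the file into PBM (assumes PBM_MONGODB_URI is exported).
# $1 = config file
apply_pbm_config() {
  pbm config --file "$1"
}

# Usage on PROD:
#   write_pbm_config pbm_config.yaml PROD && apply_pbm_config pbm_config.yaml
```

Nothing runs until the functions are called, so the file can be reviewed before it is applied.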
2. Taking the Backup
Having the PBM properly configured, it is time to take the backup. (You can skip this step if you already have PBM backups to use, of course.)
If we run the PBM status command, we will see the snapshot running; once it finishes, the status output lists it as completed.
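The two commands involved here can be sketched as follows. The wrapper name take_backup is hypothetical, and the pbm CLI is again assumed to find the replica set through PBM_MONGODB_URI.

```shell
# Start a snapshot on PROD and show its progress.
take_backup() {
  pbm backup   # starts a snapshot; PBM names it after the start timestamp
  pbm status   # shows the running snapshot, and the completed one later on
}

# Usage: take_backup
# Re-run `pbm status` until the snapshot is reported as complete.
```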
Configuring the PBM Space on a DEV/TEST Environment
All right, my PROD now has a proper backup routine configured. I will move one step forward and configure my PBM space, this time in a Dev/Test environment, referred to here as DEV.
Once the configuration is applied, PBM reports that the backup list resync from the store has started.
Note that the S3 bucket is the same one where PROD stores its backups, but with a different prefix. If I run a status command, I will see that PBM is configured but that no snapshots are available yet.
Lastly, note that the replica set name is exactly the same as in PROD. If this were a sharded cluster rather than a non-sharded replica set, all the replica set names would have to match in the target cluster. PBM is guided by the replica set name, and if my DEV environment had a different one, it would not be possible to load backup metadata from PROD into DEV.
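Since DEV uses the same bucket and differs only in the prefix, one way to build its configuration is to copy the PROD file and rewrite the prefix line. This is only a sketch: the helper name and file names are hypothetical, and it assumes the config contains a single prefix: line.

```shell
# Derive a DEV config from the PROD one by replacing only the prefix value.
# $1 = PROD config file, $2 = DEV config file to create, $3 = new prefix
make_dev_config() {
  sed "s|^\([[:space:]]*prefix:\).*|\1 $3|" "$1" > "$2"
}

# Usage on DEV (assumes PBM_MONGODB_URI points at the DEV replica set):
#   make_dev_config pbm_config.yaml pbm_config_dev.yaml DEV
#   pbm config --file pbm_config_dev.yaml
```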
Transferring the Desired Backup Files
The next step is transferring the backup files from the PROD prefix to the target prefix. I will use the AWS CLI for that, but there is one important thing to settle first: determining which files belong to a given backup set (snapshot). Recall the snapshot name shown in the PBM status output taken on PROD earlier.
The PBM snapshots are named with the timestamp from when the backup started. If we check the S3 prefix where they are stored, we will see that the file names contain that timestamp.
So it is now easy to tell which files I have to copy.
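Because every file of a snapshot carries the snapshot timestamp in its name, the copy can be limited to that snapshot with the AWS CLI include and exclude filters. The sketch below assumes both prefixes live in the same bucket; the helper name and the example snapshot name are hypothetical.

```shell
# Copy only the files of one snapshot from the source prefix to the target prefix.
# $1 = bucket, $2 = source prefix, $3 = target prefix, $4 = snapshot name
copy_snapshot() {
  aws s3 cp "s3://$1/$2/" "s3://$1/$3/" \
    --recursive --exclude "*" --include "$4*"
}

# Usage (hypothetical bucket and snapshot name):
#   copy_snapshot my-pbm-backups PROD DEV 2021-05-01T00:00:00Z
```

The exclude filter drops everything first, and the include filter then re-adds only the keys that start with the snapshot name, so other snapshots under the PROD prefix are left out.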
Checking the DEV prefix shows that the files are already there, and PBM has automatically loaded their metadata into the DEV PBM collections.
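A quick way to confirm both facts is to list the objects under the DEV prefix and then ask PBM which snapshots it knows about. The helper name is hypothetical; the manual resync mentioned in the comment is a fallback I am assuming may be useful if the snapshot has not appeared yet, not a step the procedure above requires.

```shell
# Show what is stored under the target prefix and what PBM has registered.
# $1 = bucket, $2 = target prefix
verify_dev_backups() {
  aws s3 ls "s3://$1/$2/"   # objects under the DEV prefix
  pbm list                  # snapshots PBM knows about on DEV
}

# Usage: verify_dev_backups my-pbm-backups DEV
# If the snapshot is not listed, a resync can be requested manually:
#   pbm config --force-resync
```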
Finally – Restoring It
Believe it or not, now comes the easiest part: the restore. It takes a single command, pbm restore followed by the snapshot name, and nothing else.
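A minimal sketch of the restore on DEV follows. The wrapper name and the guard against an empty argument are my additions; the snapshot name is whatever PBM reported for the backup taken on PROD.

```shell
# Restore one snapshot on DEV (assumes PBM_MONGODB_URI points at DEV).
# $1 = snapshot name, as shown by `pbm status` or `pbm list`
restore_snapshot() {
  if [ -z "$1" ]; then
    echo "usage: restore_snapshot <snapshot-name>" >&2
    return 1
  fi
  pbm restore "$1"
}

# Usage (hypothetical snapshot name):
#   restore_snapshot 2021-05-01T00:00:00Z
```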
Refreshing Dev/Test environments with PROD data is a very common and required task in corporations worldwide. I hope this article helps to clarify the practical questions regarding using PBM for it!