Moving MongoDB Cluster to a Different Environment with Percona Backup for MongoDB

Percona Backup for MongoDB (PBM) is a distributed backup and restore tool for sharded and non-sharded clusters. In 1.8.0, we added the replset-remapping functionality that allows you to restore data on a new compatible cluster topology.

The new environment can have different replset names and/or serve on different hosts and ports. PBM handles this hard work for you, making such a migration indistinguishable from a usual restore. In this blog post, I’ll show you how to migrate to a new cluster in practice.

The Problem

Changing a cluster topology usually involves lots of manual steps. PBM reduces the process.

Let’s look at a case with an initial cluster and a desired target one.

Initial cluster:

configsrv: "configsvr/conf:27019"
shards:
  - "rs0/rs00:27017,rs01:27017,rs02:27017"
  - "extra-shard/extra:27018"

The cluster consists of the configsvr replset (a single-node config server) and two shards: rs0 (a 3-node replset) and extra-shard (a single-node replset). The names, hosts, and ports are not conventional across the cluster, but we will resolve this.

Target cluster:

configsrv: "cfg/cfg0:27019"
shards:
  - "rs0/rs00:27018,rs01:27018,rs02:27018"
  - "rs1/rs10:27018,rs11:27018,rs12:27018"
  - "rs2/rs20:27018,rs21:27018,rs22:27018"

Here we have the cfg configsvr replset with a single node and three shards, rs0 through rs2, where each shard is a 3-node replset.

Think about how you can do this.

With PBM, all we need is a deployed cluster and a logical backup made with PBM 1.5.0 or later. The following simple command will do the rest:

pbm restore $BACKUP_NAME --replset-remapping "cfg=configsrv,rs1=extra-shard"

Migration in Action

Let me show you how it looks in practice. I’ll provide details at the end of the post. In the repo, you can find all configs, scripts, and output used here.

As mentioned above, we need a backup. For this, we will deploy a cluster, seed data, and then make the backup.

Deploying the initial cluster

$> initial/deploy >initial/deploy.out
$> docker compose -f "initial/compose.yaml" exec pbm-conf \
     pbm status -s cluster
Cluster:
========
configsvr:
  - configsvr/conf:27019: pbm-agent v1.8.0 OK
rs0:
  - rs0/rs00:27017: pbm-agent v1.8.0 OK
  - rs0/rs01:27017: pbm-agent v1.8.0 OK
  - rs0/rs02:27017: pbm-agent v1.8.0 OK
extra-shard:
  - extra-shard/extra:27018: pbm-agent v1.8.0 OK

links: initial/deploy, initial/deploy.out

The cluster is ready and we can add some data.

Seed data

We will insert the first 1000 numbers in a natural number sequence: 1 – 1000.

$> mongosh "mongo:27017/rsmap" --quiet --eval "
     for (let i = 1; i <= 1000; i++)
       db.coll.insertOne({ i })" >/dev/null

Getting the data state

These documents should have been partitioned across the shards at insert time. Let’s see how they were distributed. We will run the “dbHash“ command on all shards to capture the collections’ state; it will be useful for verification later.

We will also do a quick check on shards and mongos.

$> initial/dbhash >initial/dbhash.out && cat initial/dbhash.out
# rs00:27017  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs01:27017  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs02:27017  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# extra:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs00:27017  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
[ 520, false ]
# extra:27018  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
[ 480, false ]
# mongo:27017
[ 1000, true ]

links: initial/dbhash, initial/dbhash.out

All rs0 members have the same data, so the secondaries replicate from the primary correctly.

The quickcheck.js used in the initial/dbhash script summarizes our documents: it returns the number of documents and whether they form the natural number sequence.
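
For reference, the quick-check logic is essentially a one-liner; a minimal equivalent of the repo script, run through mongos (same host as above), looks like this:

$> mongosh "mongo:27017/rsmap" --quiet --eval '
     // count documents and check that the i values form 1, 2, 3, ...
     db.coll.find().sort({ i: 1 }).toArray()
       .reduce(([count = 0, seq = true, next = 1], { i }) =>
                 [count + 1, seq && next == i, i + 1], [])
       .slice(0, 2)'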

We have data for the backup. Time to make the backup.

Making a backup

$> docker compose -f initial/compose.yaml exec pbm-conf bash
pbm-conf> pbm backup --wait
Starting backup '2022-06-15T08:18:44Z'....
Waiting for '2022-06-15T08:18:44Z' backup.......... done
pbm-conf> pbm status -s backups
Backups:
========
FS  /data/pbm
  Snapshots:
    2022-06-15T08:18:44Z 28.23KB <logical> [complete: 2022-06-15T08:18:49Z]

We have a backup. It’s enough for migration to the new cluster.

Let’s destroy the initial cluster and deploy the target environment. (Destroying the initial cluster is not a requirement. I just don’t want to waste resources on it.)

Deploying the target cluster

pbm-conf> exit
$> docker compose -f initial/compose.yaml down -v >/dev/null
$> target/deploy >target/deploy.out

links: target/deploy, target/deploy.out

Let’s check the PBM status.

PBM Status

$> docker compose -f target/compose.yaml exec pbm-cfg0 bash
pbm-cfg0> pbm config --force-resync  # ensure agents sync from storage
Storage resync started
pbm-cfg0> pbm status -s backups
Backups:
========
FS  /data/pbm
  Snapshots:
    2022-06-15T08:18:44Z 28.23KB <logical> [incompatible: Backup doesn't match current cluster topology - it has different replica set names. Extra shards in the backup will cause this, for a simple example. The extra/unknown replica set names found in the backup are: extra-shard, configsvr. Backup has no data for the config server or sole replicaset] [2022-06-15T08:18:49Z]

As expected, it is incompatible with the new deployment. Let’s see how to make it work.

Resolving PBM Status

pbm-cfg0> export PBM_REPLSET_REMAPPING="cfg=configsvr,rs1=extra-shard"
pbm-cfg0> pbm status -s backups
Backups:
========
FS  /data/pbm
  Snapshots:
    2022-06-15T08:18:44Z 28.23KB <logical> [complete: 2022-06-15T08:18:49Z]

Nice. Now we can restore.

Restoring

pbm-cfg0> pbm restore '2022-06-15T08:18:44Z' --wait
Starting restore from '2022-06-15T08:18:44Z'....Started logical restore.
Waiting to finish.....Restore successfully finished!

The --wait flag blocks the shell session until the restore completes. If you don’t wait, you can check the restore status later:

pbm-cfg0> pbm list --restore
Restores history:
  2022-06-15T08:18:44Z

Everything is going well so far. Almost done.

Let’s verify the data.

Data verification

pbm-cfg0> exit
$> target/dbhash >target/dbhash.out && cat target/dbhash.out
# rs00:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs01:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs02:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs10:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs11:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs12:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs20:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
# rs21:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
# rs22:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
# rs00:27018  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
[ 520, false ]
# rs10:27018  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
[ 480, false ]
# rs20:27018  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
# mongo:27017
[ 1000, true ]

links: target/dbhash, target/dbhash.out

As you can see, the rs2 shard is empty. The other two have the same dbHash and quick-check results as in the initial cluster. The balancer can tell us something about this.

Balancer status

$> mongosh "mongo:27017" --quiet --eval "sh.balancerCollectionStatus('rsmap.coll')"
{
  balancerCompliant: false,
  firstComplianceViolation: 'chunksImbalance',
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1655281436, i: 1 }),
    signature: {
      hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0),
      keyId: Long("0")
    }
  },
  operationTime: Timestamp({ t: 1655281436, i: 1 })
}

We know what to do: start the balancer and check the status again.

$> mongosh "mongo:27017" --quiet --eval "sh.startBalancer().ok"
$> mongosh "mongo:27017" --quiet --eval "sh.balancerCollectionStatus('rsmap.coll')"
{
  balancerCompliant: true,
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1655281457, i: 1 }),
    signature: {
      hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0),
      keyId: Long("0")
    }
  },
  operationTime: Timestamp({ t: 1655281457, i: 1 })
}
$> target/dbhash >target/dbhash-2.out && cat target/dbhash-2.out
# rs00:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs01:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs02:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs10:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs11:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs12:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs20:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "6a54e10a5526e0efea0d58b5e2fbd7c5" }
# rs21:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "6a54e10a5526e0efea0d58b5e2fbd7c5" }
# rs22:27018  db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "6a54e10a5526e0efea0d58b5e2fbd7c5" }
# rs00:27018  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
[ 520, false ]
# rs10:27018  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
[ 480, false ]
# rs20:27018  db.getSiblingDB("rsmap").coll
    .find().sort({ i: 1 }).toArray()
    .reduce(([count = 0, seq = true, next = 1], { i }) =>
             [count + 1, seq && next == i, i + 1], [])
    .slice(0, 2)
[ 229, false ]
# mongo:27017
[ 1000, true ]

links: target/dbhash-2.out

Interesting. The rs2 shard now has some data. However, rs0 and rs1 haven’t changed. This is expected: the balancer moves some chunks to rs2 and updates the router config, but physical deletion of the moved chunks on the donor shard is a separate step. That’s why querying data directly on a shard is inaccurate: the data could disappear at any time, and a cursor returns all documents available on the replset at that moment, regardless of the router config.

Anyway, we shouldn’t worry about it anymore. It is the responsibility of mongos/mongod now to update the router config, query the right shards, and remove moved chunks from donor shards on demand. In the end, we get valid data through mongos.
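
If you want to see the difference yourself, compare a count routed through mongos with a count taken directly on a shard. A minimal sketch using the hosts from this setup (the direct number may include orphaned documents from chunks that have already been moved away):

# routed through mongos: only documents from chunks the shard currently owns are counted
$> mongosh "mongo:27017" --quiet --eval '
     db.getSiblingDB("rsmap").coll.countDocuments({})'
# directly on a shard member: everything physically present on that replset is counted
$> mongosh "rs00:27018" --quiet --eval '
     db.getSiblingDB("rsmap").coll.countDocuments({})'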

That’s it.

But wait, we didn’t make a backup! Never forget to make another solid backup.

Making a new backup

It’s better to switch the storage so that backups of the new deployment go to a different place and we don’t keep seeing errors about incompatible backups from the initial cluster.
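
What goes into the new config depends on your storage. A minimal sketch for the filesystem storage used in this demo, with a hypothetical new path, could look like this:

$> cat >"$NEW_PBM_CONFIG" <<EOF
storage:
  type: filesystem
  filesystem:
    path: /data/pbm-target
EOF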

$> pbm config --file "$NEW_PBM_CONFIG" >/dev/null
$> pbm config --force-resync >/dev/null
$> pbm backup -w >/dev/null
pbm-cfg0> pbm status -s backups
Backups:
========
FS  /data/pbm
  Snapshots:
    2022-06-15T08:25:44Z 165.34KB <logical> [complete: 2022-06-15T08:25:49Z]

Now we’re done and can sleep better.

One More Thing: Possible Misconfiguration

Let’s review another imaginary case to explain all possible errors.

Initial cluster: cfg, rs0, rs1, rs2, rs3, rs4, rs5

Target cluster: cfg, rs0, rs1, rs2, rs3, rs4, rs6

If we apply the remapping “rs0=rs0,rs1=rs2,rs2=rs1,rs3=rs4“, we will get an error like “missed replsets: rs3, rs5“, and nothing about rs6.

The missing rs5 should be obvious: the backup topology has an rs5 replset, but there is no matching replset on the target, and the target rs6 has no data to restore from. Adding rs6=rs5 fixes this.

But the missing rs3 could be confusing. Let’s visualize it:

init | curr
-----+-----
cfg     cfg  # unchanged
rs0 --> rs0  # mapped. unchanged
rs1 --> rs2
rs2 --> rs1
rs3 -->      # err: no shard
rs4 --> rs3
     -> rs4  # ok: no data
rs5 -->      # err: no shard
     -> rs6  # ok: no data

When we remap the backup’s rs4 to the target rs3, the target rs3 becomes reserved, so the backup’s rs3 no longer has a target replset. Remapping it to the still-available rs4 fixes this too.

This reservation avoids data duplication. That’s why we use the quick check via mongos.
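
Putting it all together for this imaginary pair of clusters, the complete remapping would be (a sketch; $BACKUP_NAME stands for whichever backup is being restored):

pbm restore "$BACKUP_NAME" --replset-remapping \
  "rs0=rs0,rs1=rs2,rs2=rs1,rs3=rs4,rs4=rs3,rs6=rs5"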

Details

Compatible topology

Simply speaking, a compatible topology is one where the target deployment has an equal or larger number of shards. In our example, we had 2 shards initially but restored to 3 shards. PBM restored data onto two shards only; MongoDB can redistribute it to the remaining shard later, once the balancer is enabled (sh.startBalancer()). The number of replset members does not matter, because PBM takes a backup from one member per replset and restores it to the primary only; the other data-bearing members replicate from the primary. So you could make a backup from a multi-member replset and then restore it to a single-member replset.

You cannot restore to a different replset type, for example from shardsvr to configsvr.

Preconfigured environment

The cluster should be deployed with all shards added, and users and permissions should be created and assigned in advance. PBM agents should be configured to use the same storage, and the storage must be accessible from the new cluster.

Note: PBM agents store backup metadata on the storage and keep a cache of it in MongoDB. pbm config --force-resync lets you refresh the cache from the storage. Do it on a new cluster right after deployment to see the backups/oplog chunks made from the initial cluster.
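
On a freshly deployed target cluster that already points at the shared storage, that boils down to what we did earlier:

pbm-cfg0> pbm config --force-resync
Storage resync started
pbm-cfg0> export PBM_REPLSET_REMAPPING="cfg=configsvr,rs1=extra-shard"
pbm-cfg0> pbm status -s backups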

Understanding replset remapping

You can remap replset names with the --replset-remapping flag or the PBM_REPLSET_REMAPPING environment variable. If both are set, the flag takes precedence.
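
Both forms take the same comma-separated list of <target replset>=<backup replset> pairs that we used above, for example:

# per-command flag (takes precedence if both are set)
pbm restore "$BACKUP_NAME" --replset-remapping "cfg=configsvr,rs1=extra-shard"

# environment variable, picked up by subsequent PBM CLI calls in the session
export PBM_REPLSET_REMAPPING="cfg=configsvr,rs1=extra-shard"
pbm status -s backups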

For a full restore, point-in-time recovery, and oplog replay, the PBM CLI sends the mapping as a parameter of the command. Each command gets its own explicit mapping (or none). Remapping can be done only through the CLI; agents do not read the environment variable and do not have the flag.

pbm status and pbm list use the flag/envvar to remap replsets in the backup/oplog metadata and apply this mapping to the current deployment so they are shown properly. If the backup and current replset names do not match, pbm list will not show these backups, and pbm status prints an error with the missing replset names.

Restoring with remapping works with logical backups only.

How does PBM do this?

During a restore, PBM reviews the current topology and assigns members’ snapshots and oplog chunks to each shard/replset by name. The remapping changes this default assignment.

After the restore is done, PBM agents sync the router config to make the restored data “native” to this cluster.

Behind the scenes

The config.shards collection describes the current topology. PBM uses it to know where and what to restore; it does not modify this collection. But the restored data contains other router configuration that still refers to the initial topology.

PBM updates two collections in the restored data to replace the old shard names with the new ones:

  • config.databases – the primary shard of each non-sharded database
  • config.chunks – the shard that owns each chunk

After this, MongoDB knows where databases, collections, and chunks are in the new cluster.
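
If you are curious, you can peek at these collections through mongos after the restore to confirm that only the new shard names appear; a quick read-only check (not something PBM requires):

$> mongosh "mongo:27017" --quiet --eval '
     db.getSiblingDB("config").databases.find({}, { _id: 1, primary: 1 })'
$> mongosh "mongo:27017" --quiet --eval '
     db.getSiblingDB("config").chunks.distinct("shard")'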

Conclusion

Migrating a cluster requires a lot of attention, knowledge, and calm. The replset-remapping functionality in Percona Backup for MongoDB reduces the complexity of migrating between two different environments. I would say it is close to a routine job now.

Have a nice day 🙂

