5

PostgreSQL 101 for Non-Postgres DBAs (Simple Backup and Restore)

 2 years ago
source link: https://www.percona.com/blog/postgresql-101-simple-backup-and-restore/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

PostgreSQL 101 for Non-Postgres DBAs (Simple Backup and Restore)

It’s no surprise that PostgreSQL is becoming the de facto goto database for many. Just a few of the many reasons include advanced technology, scalability, and ways to save money. With that said, we see many experienced DBAs being tasked with migrating existing databases from Oracle, MySQL, SQL Server, and others to Postgres. Although fundamentally speaking, a good DBA should have a conceptual knowledge and understanding of database fundamentals, translating your existing way of performing daily tasks differs from one technology to the other. With that in mind, this blog is addressed to those experienced DBAs that have a well-known and proven set of routines in their old technology and want to know how to perform them in Postgres.

Postgres offers several utilities for performing both physical and logical backups and restores. We will talk about these and how to use them here.

For the purpose of this mini-tutorial, we are assuming all tasks will be performed by the user “postgres”, which has superuser privileges on the database unless otherwise noted.

I Want To …..

Logical Backups

Logical backups are processed with native tools such as pg_dump and pg_dumpall. These tools should be included in the default bin directory for postgres installation such as /usr/pgsql-11/bin. If your path is not set, you may want to include the bin directory in your path.

There are many options that can be used when running these tools to customize your data dumps. So, we will cover a few scenarios in this blog.

Physical Backups

Physical backups are processed with native tools such as pg_basebackup. Again, these tools should be included in the default bin directory for postgres installation such as /usr/pgsql-11/bin. If your path is not set, you may want to include the bin directory in your path.

You can also use system tools for physical backups such as tar or other archiving tools at your disposal.

Prerequisite for Remote Backups

The source database server has to allow a remote connection for the user performing the task. Remember, we are assuming for our examples that the user is postgres. 

  1. Create an entry in the pg_hba.conf file similar to the following under the IPv4 connections section.

host    all        postgres        0.0.0.0/0               md5

      2. Edit your postgresql.conf file or whatever file you may be loading for runtime configs and change the parameter listen_addresses to the following:

listen_addresses = ‘*’

Once the above changes are made, reload your configuration file or restart postgres. 

The above examples are pretty open. For security, you most likely will restrict the IP address in the hba.conf file to a more specific IP, Subnet.

In our example, we are allowing postgres to connect from anywhere with password authentication. Thus, the 0.0.0.0/0 and md5. You could change the 0.0.0.0/0 to the address of the other database server like 192.168.1.2/32 We also specify the user postgres with the -U option since it is the user we opened up in the pg_hba.conf file.

If the user running the commands has different credentials on source/target servers you will need to save the password to .pgpass or set the environment variable PGPASSWORD so you are not prompted for the password whenever it is needed.

I want to dump my entire database, including users and credentials to a file.

This is quite a simple task to perform if you have the correct privileges and configuration settings along with the storage needed depending on your database size.

Performing the Data Dump Locally

If you have only one instance of postgres running on your server and have minimal / default configuration for the pg_hba.conf file and your path includes the postgres bin directory, all you need to do as user postgres is ….

pg_dumpall > savedfile.sql

The above works well for small databases where you have space on the local server and just want a quick and simple dump of your database.

If you are running multiple instances on the local server and want to dump a specific instance all you do is …

pg_dumpall -p port > savedfile.sql

Replace the port above with the port number the instance you wish to dump is running on.

Performing the data dump remotely.

Although this is pretty much the same thing as on a local server, there are a few things you need to have configured in order to execute this data dump remotely. Plus, your prerequisites need to be addressed.

Now from our remote client or server, we can run the following commands as long as the postgres tools are installed.

pg_dumpall -h host -p port -U postgres > savedfile.sql

Replace the host above with the address of the source DB and port with the port number it is running on.

There are other flags and options you can use. Have a look here for the usage options

I want to dump a specific database only.

Performing the data dump locally.

Similar to the other commands with a slight variation

pg_dump -d dname > savedfile.sql

Like in other scenarios, the above works well for small databases where you have space on the local server and just want a quick and simple dump of your database.

If you are running multiple instances on the local server and want to dump from a specific instance all you do is …

pg_dump -p port -d dbname > savedfile.sql

I want to dump a specific database and specific table or tables only.

On a local server

Similar to the other commands with a slight variation

pg_dump -d dname -t tablename > savedfile.sql

Like in other scenarios, the above works well for small databases where you have space on the local server and just want a quick and simple dump of your database.

If you are running multiple instances on the local server and want to dump from a specific instance all you do is …

pg_dump -p port -d dbname -t tablename > savedfile.sql

If you want more than one table, list their names or patterns like so …

pg_dump -d dname -t table1 -t table2 -t table3 > savedfile.sql

From a remote server

Just like in previous examples, specify the connection options with -h host -p port

I only want to dump the users and credentials to restore them somewhere else.

This is just as simple as the above data dumps. However, keep in mind that this will not get you what you need if your instance is an RDS instance. Amazon really locks down what you can do as a privileged user on an RDS instance. Even as Postgres.

From a local server

pg_dumpall -g > users.sql

From a remote server or client. ( saves file locally )

pg_dumpall -g -h host -p port -U postgres > users.sql

You can edit the above dump file and remove any user you do not wish to apply when you restore the file to a different server.

Restoring a Logical Dump

Restoring the newly created backup is a simple task. There are several ways to accomplish this and we will go over a few of these just to get you going.  Keep in mind there is a pg_restore utility as well which we will not be addressing in this blog. Pg_restore lets you get more creative with your dumps and imports.

Again, we assume all actions here are executed as user postgres.

Restoring a pg_dumpall to a local server from a saved file.

psql postgres -f savedfile.sql

Restoring a pg_dumpall to a remote server from a saved file.

psql -h host -p port postgres -f savedfile.sql

Restoring a pg_dumpall to a remote server from the source server.

pg_dumpall | psql -h host -p port postgres

Restoring from a pg_dumpall from a remote server to a remote server.

pg_dumpall -h src_host -p src_port | psql -h target_host -p target_port postgres

Restoring a pg_dump of a specific database from a saved file.

psql dbname -f savedfile.sql

Restoring a pg_dump of a specific database to a remote server from a saved file.

psql -h host -p port dbname -f savedfile.sql

Restoring a pg_dump with a different owner on the target.

Sometimes you don’t have access to the users and credentials on a source database or want them to be different on your target/restored database. Follow these steps to achieve this.

  1. Perform your pg_dump command as noted previously but add the –no-owner option.
  2. Perform the restore as noted above but run the commands as the new owner. 

pg_dump -d database –no-owner > savedfile.sql

psql -U newowner dbname -f savedfile.sql

Remember for remote servers as noted in the other examples, use the -h host -p port and any other connection string option needed.

If the user’s credentials are different and you are prompted for passwords,  read the prerequisites section of this blog.

Let’s Get Physical with pg_baseback

A common way of performing physical backups in Postgres is with the use of pg_basebackup. This tool allows us to generate a physical backup with the necessary WAL files needed to restore or stand up a stand-alone instance.

There are many flags and options for this tool including compression but for the sake of this blog, we will focus on the basic use of pg_basebackup with minimal options.

For the purpose of this document, we will cover physical backups using the native pg_basebackup tool.

NOTE: Typically, one specifies the destination path for the physical backup. This is noted with the -D option of pg_basebackup.

Saving the backup to destination path

pg_basebackup -D /destination/path -Pv –checkpoint=fast

Sending the backup as tar files to the directory path specified

pg_basebackup -D /destination/path -Pv –checkpoint=fast -F t

The above will generate two tar files. A base.tar and a pg_wal.tar

Create a Physical Backup From a Remote Instance

Make sure you have set up the prerequisites as explained here

The only difference between remote and local execution is that for remote, we specify a source server with the -h remote_host and the port postgres is running on with the -p remote_port  

pg_basebackup -h host -p port -D /destination/path -Pv –checkpoint=fast

If the user executing pg_basebackup is not trusted directly from the server executing the pg_basebackup, add the additional option of -U username. For example …

pg_basebackup -U postgres -h host -p port -D /destination/path -Pv –checkpoint=fast

Stand up a Local Instance of Postgres using pg_basebackup

Tar file method

If you execute the pg_baseback with the tar file option, it will generate two tar files. A base.tar and a pg_wal.tar 

Extract the base.tar. If you do not have different WAL files to restore, extract the pg_wal.tar and place the wal segment file in the pg_wal directory.

Directory method

Make sure the directory where the new cluster will be located exists with the proper permissions and storage capacity. Remember, this will consume the same amount of space as the source database.

Define where the target database will reside.

  • mkdir -p /destination/path
  • chmod 700 /destination/path
  • chown postgres:postgres  /destination/path

As user postgres, run the following command assuming pg_basebackup is in your path.

Source database is local

pg_basebackup -D /destination/path-Pv –checkpoint=fast -X stream

Source database is on a remote server

pg_basebackup -h host -p port -D /destination/path-Pv –checkpoint=fast -X stream

What does the above do?

  1. Assumes postgres is running on the localhost using the default port of 5432 and the user executing it has the necessary privs to do so.
  2. initiate a pg_basebackup of the current and running instance of postgres.
  3. Save the copy to the path specified after the -D 
  4. Optionally, the -Pv will show the progress and verbose output of the process.
  5. Perform a fast checkpoint rather than spreading it out. Makes the backup start sooner.
  6. Stream the WAL changes that are happening on the running cluster and save them in the new cluster. This will allow for starting the new cluster without additional WALs.

The above applies to whether the database is remote or not.

Starting the separate instance of postgres

When the pg_basebackup completes, to start up the new local instance, go into the new data directory /destination/path modify the postgresql.conf file or whatever file you may have defined your previous port in..

  • Set the port to a number not in use such as 5433. I.e  port = 5433
  • Modify any memory parameters necessary
  • Make sure, if archiving is enabled, it archives to a different location than the original cluster.

You can then proceed to start the new instance of postgres as follows:

pg_ctl -D /destination/path -o “-p 5433” start

You should now be able to connect to the new cluster with the exact credentials as the source cluster with 

psql -p 5433

Stand up a remote cluster

This process is pretty much identical to the local cluster process above. The only difference is you will specify a host and credentials.

From the remote target host 

pg_basebackup -h source_server -p port -U username -D /destination/path  -Pv –checkpoint=fast -X stream

As you can see, we are simply adding a connection string to the original command we ran for the local copy. This will generate the backup on the remote host and save it to the local destination path.

Once the copy is placed on the target host, if necessary, change your port and archive location if archiving is enabled as mentioned above.

Last words

The above examples are meant to get you started with basic backups and restores. They do not cover more advanced options such as archiving of wal files, point in time recovery, etc … This will be addressed in a future blog or by simply searching online.  Furthermore, using backups to stand up replicas will also be addressed in future blog postings.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK