3

Diffing PostgreSQL Schema Changes

 1 year ago
source link: https://www.percona.com/blog/diffing-postgresql-schema-changes/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Diffing PostgreSQL Schema Changes

One of the routine operations when administering PostgreSQL is periodic updates to the database system’s architecture. PostgreSQL does a good job of allowing one to update a schema, add types, functions, triggers, or alter a table by adding and removing columns and updating column data types, etc., in a reliable manner. However, there is no built-in mechanism to help identify the differences, let alone generate the necessary SQL, to accomplish updates in an easy manner from the development to the production environment.

So let’s talk about possible approaches to schema changes.

Using logical dump manifests

The easiest way to identify changes between schemas from one database to another is to compare schema dump manifests

The following example demonstrates an approach one can take looking for differences between schema on different databases:

EXAMPLE:

PgSQL
-- create database schemas
create database db01
create database db01<br /><br /><br />
PgSQL
-- db01: version 1
create table t1 (
    c1 int,
    c2 text,
    c4 date
create table t2(
    c1 int,
    c2 varchar(3),
    c3 timestamp,
    c4 date
PgSQL
-- db02: version 2
create table t1 (
    c1 serial primary key,
    c2 varchar(256),
    c3 date default now()
create table t2(
    c1 serial primary key,
    c2 varchar(3),
    c3 varchar(50),
    c4 timestamp with time zone default now(),
    c5 int references t1(c1)
create index on t2 (c5);
PgSQL
# generate schema dumps
pg_dump -s db01 -Fc > db01.db
pg_dump -s db02 -Fc > db02.db
PgSQL
# generate manifests
pg_restore -l db01.db > db01_manifest.ini
pg_restore -l db02.db > db02_manifest.ini

This snippet demonstrates looking for differences by comparing the md5 checksums:

PgSQL
# EX 1: generate checksums
md5sum \
<(tail +16 db01_manifest.ini | cut -d ' ' -f 4-) \
<(tail +16 db02_manifest.ini | cut -d ' ' -f 4-)
# output
$ 9d76c028259f2d8bed966308c256943e  /dev/fd/63
$ ba124f9410ea623085c237dc4398388a  /dev/fd/62

This next snippet diffs the differences between the two manifests identifying only those objects and attributes that have changed. Notice that redundant information, the first 16 lines, are skipped:

PgSQL
# EX 2: perform diff
diff \
> <(tail +16 db01_manifest.ini | cut -d ' ' -f 4-) \
> <(tail +16 db02_manifest.ini | cut -d ' ' -f 4-)

This resultant diff shows the changes made between the two schemas:

PgSQL
1a2,3
> SEQUENCE public t1_c1_seq postgres
> SEQUENCE OWNED BY public t1_c1_seq postgres
2a5,12
> SEQUENCE public t2_c1_seq postgres
> SEQUENCE OWNED BY public t2_c1_seq postgres
> DEFAULT public t1 c1 postgres
> DEFAULT public t2 c1 postgres
<span style="font-weight: 400;">> CONSTRAINT public t1 t1_pkey postgres</span><span style="font-weight: 400;"><br /></span><span style="font-weight: 400;">> CONSTRAINT public t2 t2_pkey postgres</span><span style="font-weight: 400;"><br /></span><span style="font-weight: 400;">> INDEX public t2_c5_idx postgres</span><span style="font-weight: 400;"><br /></span><span style="font-weight: 400;">> FK CONSTRAINT public t2 t2_c5_fkey postgres</span>

The good news is that there are a number of existing tools that can reconcile differences between a proposed schema design and the target schema:

  1. Commercial offerings can differentiate schema between databases in an elegant and efficient manner. Researching, ala Google, yields the most popular technologies one can use.
  2. In regards to open source solutions, there are a number of projects capable of diffing Postgres database schemas.

Working with the apgdiff extension

The following is an example implementation of the open source tool apgdiff

Apgdiff can be found in the Postgres community repository. It compares two schema dump files and creates an SQL output file that is, for the most part, suitable for upgrades of old schemata:

Package: apgdiff
Version: 2.7.0-1.pgdg18.04+1
Architecture: all
Maintainer: Debian PostgreSQL Maintainers <[email protected]>
Installed-Size: 173
Depends: default-jre-headless | java2-runtime-headless
Homepage: https://www.apgdiff.com/
Priority: optional
Section: database
Filename: pool/main/a/apgdiff/apgdiff_2.7.0-1.pgdg18.04+1_all.deb
Size: 154800
SHA256: 9a83fcf54aed00e1a28c3d00eabe1c166977af1e26e91035e15f88b5215b181b
SHA1: ea713acb55898f07374dadd1bebb09ec2fa4b589
MD5sum: e70a97903cb23b8df8a887da4c54e945

The following example demonstrates how one can update differences between the development environment and the production database schema using apgdiff.

EXAMPLE:

Shell
apt install -y apgdiff
Shell
# EX 1: dump as SQL statements
pg_dump -s db01 -Fp > db01.sql
pg_dump -s db02 -Fp > db02.sql
createdb db03 --template=db01
apgdiff --ignore-start-with db01.sql db02.sql > db01-db02.sql
# “psql -1” encapsulates statements within a transaction
psql -1 -f db01-db02.sql db03
Shell
# EX 2: uses logical dumps
# notice the dumps are standard logical dumps and includes data
pg_dump db01 -Fc > db01.db
pg_dump db02 -Fc > db02.db
createdb db03 --template=db01
# this invocation assumes the resultant diff doesn’t require editing
apgdiff --ignore-start-with \
    <(pg_restore -s -f - db01.db) \
    <(pg_restore -s -f - db02.db) \
    | psql -1 db03

There’s more you can accomplish with these simple approaches. By incorporating variations of these, one can create fairly sophisticated shell scripts with little code and, with a little luck, not that much effort.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK