5

基于Patroni的PostgreSQL高可用实践

 1 year ago
source link: https://blog.51cto.com/candon123/5995857
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

因环境有限,本文在一台机器上实现基于Patroni的PostgreSQL高可用服务测试。

1、安装软件包

[root@lee ~]# yum -y install https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
[root@lee ~]# for i in pgdg10 pgdg11 pgdg12 pgdg13 pgdg14;do yum-config-manager --disable $i;done
[root@lee ~]# yum-config-manager --disable postgresql
[root@lee ~]# yum -y install watchdog patroni patroni-etcd etcd haproxy postgresql15-server

2、配置ETCD服务

[root@lee ~]# vi /etc/etcd/etcd.conf
[Member]
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.16.104.112:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.16.104.112:2379,http://127.0.0.1:2379"
ETCD_NAME="lee"
[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.16.104.112:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.16.104.112:2379,http://127.0.0.1:2379"
ETCD_INITIAL_CLUSTER="lee=http://172.16.104.112:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
[root@lee ~]# systemctl enable etcd && systemctl start etcd
[root@lee ~]# systemctl status etcd
[root@lee ~]# etcdctl member list
9227a59dc4fe5c0e: name=lee peerURLs=http://172.16.104.112:2380 clientURLs=http://127.0.0.1:2379,http://172.16.104.112:2379 isLeader=true
[root@lee ~]# etcdctl cluster-health
member 9227a59dc4fe5c0e is healthy: got healthy result from http://127.0.0.1:2379
cluster is healthy

3、初始化PostgreSQL

初始化之前,创建三个目录作为实例的数据存放路径。

[root@lee ~]# mkdir -p /pgsql/{data01,data02,data03}
[root@lee ~]# chown -R postgres: /pgsql
[root@lee ~]# echo redhat|passwd --stdin postgres
[root@lee ~]# su - postgres
[postgres@lee ~]$ initdb -E UTF8 --locale=en_US.UTF-8 -D /pgsql/data01 -U postgres -W
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

Enter new superuser password:
Enter it again:

fixing permissions on existing directory /pgsql/data01 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /pgsql/data01 -l logfile start

初始化完成后,启动PG服务,然后创建三个物理复制槽:

postgres=# select *from pg_create_physical_replication_slot('lee01');
postgres=# select *from pg_create_physical_replication_slot('lee02');
postgres=# select *from pg_create_physical_replication_slot('lee03');
postgres=# select slot_name,slot_type from pg_replication_slots;
slot_name | slot_type
-----------+-----------
lee01 | physical
lee02 | physical
lee03 | physical

由于是新初始化的环境,还必须设置白名单:

[postgres@lee ~]$ vi /pgsql/data/pg_hba.conf
host all all 0.0.0.0/0 md5
host replication postgres 0.0.0.0/0 md5
host all all 172.16.104.112/32 md5

4、配置Patroni服务

4.1 配置sudo权限

在配置patroni之前,需要设置postgres用户的sudo权限,因为在patroni启动的过程中会加载watchdog程序。

[root@lee ~]# visudo
postgres ALL=(ALL) NOPASSWD: /usr/sbin/modprobe,/usr/bin/chown
4.2 创建patroni实例配置文件

接下来,创建三个实例的patroni配置文件:

[root@lee ~]# vi /etc/patroni/patroni01.yml
scope: postgres
namespace: /pg_cluster/
#name这个参数的值随便写,建议为主机名,如果在同一台机器上,设置别名即可。
name: lee01

log:
level: INFO
traceback_level: ERROR
#定义patroni的日志路径,此路径对于postgres用户必须可读可写。
dir: /tmp/patroni01
file_num: 10
file_size: 104857600

#restapi的端口后,如果在同一台节点上,设置不同的端口号,在不同的节点使用默认的8008即可。
restapi:
listen: 0.0.0.0:8008
#connect_address为本机的IP加端口号
connect_address: 172.16.104.112:8008

etcd:
host: 172.16.104.112:2379

bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
#Patroni默认使用的是异步流复制,如果使用同步流复制,必须设置以下三个以synchronous开头的参数。
#在多个节点的同步流复制模式,默认只有一个节点是同步流复制,要设置多个同步流复制,必须设置synchronous_node_count参数。
synchronous_mode: true
synchronous_node_count: 2
synchronous_mode_strict: true
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
wal_level: replica
hot_standby: "on"
logging_collector: 'on'
max_wal_senders: 10
max_replication_slots: 10
wal_log_hints: "on"
max_connections: 1000
synchronous_commit: on
synchronous_standby_names: '*'

initdb:
- encoding: UTF8
- data-checksums

pg_hba:
- host replication postgres 127.0.0.1/32 md5
- host replication postgres 172.16.104.112/32 trust
- host replication postgres 0.0.0.0/0 md5
- host all all 0.0.0.0/0 md5

users:
admin:
password: admin
options:
- createrole
- createdb

postgresql:
listen: 0.0.0.0:5432
connect_address: 172.16.104.112:5432
data_dir: /pgsql/data01
bin_dir: /usr/pgsql-15/bin
authentication:
replication:
username: postgres
password: redhat
superuser:
username: postgres
password: redhat
parameters:
unix_socket_directories: '/var/run/postgresql'
superuser_reserved_connections: 13
tcp_keepalives_idle: 60
tcp_keepalives_interval: 10
tcp_keepalives_count: 10
shared_buffers: 2048MB
vacuum_cost_delay: 10
bgwriter_delay: 10ms
wal_writer_delay: 10ms
wal_receiver_status_interval: 1s
hot_standby_feedback: on
log_destination: 'csvlog'
logging_collector: on
log_directory: 'pg_log'
log_truncate_on_rotation: on
log_rotation_age: 1d
log_rotation_size: 10MB
log_checkpoints: on
log_connections: on
log_disconnections: on
log_error_verbosity: verbose
log_timezone: 'PRC'
full_page_writes: on

watchdog:
mode: automatic # Allowed values: off, automatic, required
device: /dev/watchdog
safety_margin: 5

tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false

实例2的配置文件内容:

[root@lee ~]# vi /etc/patroni/patroni02.yml
scope: postgres
namespace: /pg_cluster/
name: lee02

log:
level: INFO
traceback_level: ERROR
dir: /tmp/patroni02
file_num: 10
file_size: 104857600

restapi:
listen: 0.0.0.0:8009
connect_address: 172.16.104.112:8009

etcd:
host: 172.16.104.112:2379

bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
synchronous_mode: true
synchronous_node_count: 2
synchronous_mode_strict: true
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
wal_level: replica
hot_standby: "on"
logging_collector: 'on'
max_wal_senders: 10
max_replication_slots: 10
wal_log_hints: "on"
max_connections: 1000
synchronous_commit: on
synchronous_standby_names: '*'

initdb:
- encoding: UTF8
- data-checksums

pg_hba:
- host replication postgres 127.0.0.1/32 md5
- host replication postgres 172.16.104.112/32 trust
- host replication postgres 0.0.0.0/0 md5
- host all all 0.0.0.0/0 md5

users:
admin:
password: admin
options:
- createrole
- createdb

postgresql:
listen: 0.0.0.0:5433
connect_address: 172.16.104.112:5433
data_dir: /pgsql/data02
bin_dir: /usr/pgsql-15/bin
authentication:
replication:
username: postgres
password: redhat
superuser:
username: postgres
password: redhat
parameters:
unix_socket_directories: '/var/run/postgresql'
superuser_reserved_connections: 13
tcp_keepalives_idle: 60
tcp_keepalives_interval: 10
tcp_keepalives_count: 10
shared_buffers: 2048MB
vacuum_cost_delay: 10
bgwriter_delay: 10ms
wal_writer_delay: 10ms
wal_receiver_status_interval: 1s
hot_standby_feedback: on
log_destination: 'csvlog'
logging_collector: on
log_directory: 'pg_log'
log_truncate_on_rotation: on
log_rotation_age: 1d
log_rotation_size: 10MB
log_checkpoints: on
log_connections: on
log_disconnections: on
log_error_verbosity: verbose
log_timezone: 'PRC'
full_page_writes: on

watchdog:
mode: automatic # Allowed values: off, automatic, required
device: /dev/watchdog
safety_margin: 5

tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false

实例3的配置文件内容:

[root@lee ~]# vi /etc/patroni/patroni03.yml
scope: postgres
namespace: /pg_cluster/
name: lee03

log:
level: INFO
traceback_level: ERROR
dir: /tmp/patroni03
file_num: 10
file_size: 104857600

restapi:
listen: 0.0.0.0:8010
connect_address: 172.16.104.112:8010

etcd:
host: 172.16.104.112:2379

bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
synchronous_mode: true
synchronous_node_count: 2
synchronous_mode_strict: true
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
wal_level: replica
hot_standby: "on"
logging_collector: 'on'
max_wal_senders: 10
max_replication_slots: 10
wal_log_hints: "on"
max_connections: 1000
synchronous_commit: on
synchronous_standby_names: '*'

initdb:
- encoding: UTF8
- data-checksums

pg_hba:
- host replication postgres 127.0.0.1/32 md5
- host replication postgres 172.16.104.112/32 trust
- host replication postgres 0.0.0.0/0 md5
- host all all 0.0.0.0/0 md5

users:
admin:
password: admin
options:
- createrole
- createdb

postgresql:
listen: 0.0.0.0:5434
connect_address: 172.16.104.112:5434
data_dir: /pgsql/data03
bin_dir: /usr/pgsql-15/bin
authentication:
replication:
username: postgres
password: redhat
superuser:
username: postgres
password: redhat
parameters:
unix_socket_directories: '/var/run/postgresql'
superuser_reserved_connections: 13
tcp_keepalives_idle: 60
tcp_keepalives_interval: 10
tcp_keepalives_count: 10
shared_buffers: 2048MB
vacuum_cost_delay: 10
bgwriter_delay: 10ms
wal_writer_delay: 10ms
wal_receiver_status_interval: 1s
hot_standby_feedback: on
log_destination: 'csvlog'
logging_collector: on
log_directory: 'pg_log'
log_truncate_on_rotation: on
log_rotation_age: 1d
log_rotation_size: 10MB
log_checkpoints: on
log_connections: on
log_disconnections: on
log_error_verbosity: verbose
log_timezone: 'PRC'
full_page_writes: on

watchdog:
mode: automatic # Allowed values: off, automatic, required
device: /dev/watchdog
safety_margin: 5

tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
4.3 创建Patroni服务

三个patroni实例运行在一台机器上,所有创建三个服务:patroni01、patroni02和patroni03。

[root@lee ~]# cd /usr/lib/systemd/system
[root@lee system]# vi patroni01.service
# It's not recommended to modify this file in-place, because it will be
# overwritten during package upgrades. It is recommended to use systemd
# "dropin" feature; i.e. create file with suffix .conf under
# /etc/systemd/system/patroni.service.d directory overriding the
# unit's defaults. You can also use "systemctl edit patroni"
# Look at systemd.unit(5) manual page for more info.

[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target

[Service]
Type=simple

User=postgres
Group=postgres

# Read in configuration file if it exists, otherwise proceed
EnvironmentFile=-/etc/patroni_env.conf

# WorkingDirectory=/var/lib/pgsql

# Where to send early-startup messages from the server
# This is normally controlled by the global default set by systemd
#StandardOutput=syslog

# Pre-commands to start watchdog device
# Uncomment if watchdog is part of your patroni setup
ExecStartPre=-/usr/bin/sudo /sbin/modprobe softdog soft_noboot=1
ExecStartPre=-/usr/bin/sudo /bin/chown postgres:postgres /dev/watchdog

# Start the patroni process
ExecStart=/usr/bin/patroni /etc/patroni/patroni01.yml

# Send HUP to reload from patroni.yml
ExecReload=/usr/bin/kill -s HUP $MAINPID

# only kill the patroni process, not it's children, so it will gracefully stop postgres
KillMode=process

# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=30

# Do not restart the service if it crashes, we want to manually inspect database on failure
Restart=no

[Install]
WantedBy=multi-user.target
#以下直接复制并修改启动程序里对应的配置文件文件路径即可。
[root@lee system]# cp patroni01.service patroni02.service
[root@lee system]# cp patroni01.service patroni03.service
[root@lee ~]# systemctl daemon-reload
4.4 启动Patroni服务
[root@lee ~]# for i in `patroni01 patroni02 patroni03`;do systemctl enable $i;done
[root@lee ~]# for i in `patroni01 patroni02 patroni03`;do systemctl start $i;done
[root@lee ~]# echo "alias patronictl='patronictl -c /etc/patroni/patroni01.yml'" >>/etc/profile
[root@lee ~]# source /etc/profile

启动完成后,可以使用patronictl命令验证,服务是否已正确启动,如下图:

基于Patroni的PostgreSQL高可用实践_postgresql

当前lee02的5433端口对应的postgresql实例为主库,其他两个为从库。

5、配置HAProxy服务

这里使用5000端口访问主库,而5001端口访问其他两个从库。编辑/etc/haproxy/haproxy.cfg文件,加入以下内容:

[root@lee ~]# vi /etc/haproxy/haproxy.cfg
global
maxconn 1000
pidfile /var/run/haproxy.pid
maxconn 5000
user root
group root
daemon
nbproc 2

defaults
mode tcp
log global
option tcplog
option dontlognull
option redispatch
retries 3
maxconn 1000
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 5s

listen stats
mode http
bind *:7000
log global
stats enable
stats refresh 30s
stats uri /
stats realm Private lands
stats auth admin:admin

listen primary
bind *:5000
mode tcp
option tcplog
balance roundrobin
option httpchk /primary
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server lee01 172.16.104.112:5432 maxconn 1000 check port 8008
server lee02 172.16.104.112:5433 maxconn 1000 check port 8009
server lee03 172.16.104.112:5434 maxconn 1000 check port 8010

listen standbys
balance roundrobin
bind *:5001
mode tcp
option tcplog
option httpchk /replica
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server lee01 172.16.104.112:5432 maxconn 1000 check port 8008
server lee02 172.16.104.112:5433 maxconn 1000 check port 8009
server lee03 172.16.104.112:5434 maxconn 1000 check port 8010

编辑完成后,启动服务:

[root@lee ~]# systemctl enable haproxy && systemctl start haproxy
[root@lee ~]# systemctl status haproxy
基于Patroni的PostgreSQL高可用实践_postgresql_02

也可以使用下面的命令进行验证:

[postgres@lee ~]$ psql "host=172.16.104.112 port=5000 password=redhat" -c 'select inet_server_addr(),inet_server_port(),pg_is_in_recovery()'
[postgres@lee ~]$ psql "host=172.16.104.112 port=5001 password=redhat" -c 'select inet_server_addr(),inet_server_port(),pg_is_in_recovery()'
基于Patroni的PostgreSQL高可用实践_postgresql_03

6、主备倒换操作

如果某个节点挂掉,patroni会自动进行主备倒换操作,这里演示下手工倒换操作。

基于Patroni的PostgreSQL高可用实践_postgresql_04

也可以通过查询pg_stat_replication视图获取流复制相关信息:

基于Patroni的PostgreSQL高可用实践_postgresql_05

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK