How to Use Group Replication with Haproxy

January 9, 2024

Ivan Groenewold

When working with group replication, MySQL Router would be the obvious choice for the connection layer: it is tightly integrated with the rest of the technologies, since it is part of the InnoDB Cluster stack.

The problem is that, except for simple workloads, MySQL Router's performance is still not on par with other proxies like Haproxy or even ProxySQL. Fortunately, Haproxy and group replication can work together as well. Let's see how this can be accomplished.

Architecture

For our testing, let’s deploy an InnoDB ClusterSet in single-writer mode. 

There are three nodes on the primary site:

  • mysql1-t1
  • mysql2-t1
  • mysql3-t1

And three nodes on the DR site:

  • mysql1-t2
  • mysql2-t2
  • mysql3-t2

In this scenario, group replication is used locally between the nodes on each site, and the two sites are linked together via asynchronous replication. Local or regional failover can be controlled from MySQL Shell.
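For reference, deploying such a ClusterSet with MySQL Shell's AdminAPI could look roughly like the sketch below. This assumes a root account with the required privileges; the cluster and ClusterSet names ('site1', 'site2', 'clusterset1') are illustrative only.

Shell
# connect MySQL Shell to the first node of the primary site
mysqlsh root@mysql1-t1

// in the JS prompt: build the primary cluster
cluster = dba.createCluster('site1')
cluster.addInstance('root@mysql2-t1:3306')
cluster.addInstance('root@mysql3-t1:3306')

// promote it to a ClusterSet and add the DR cluster
cs = cluster.createClusterSet('clusterset1')
dr = cs.createReplicaCluster('root@mysql1-t2:3306', 'site2')
dr.addInstance('root@mysql2-t2:3306')
dr.addInstance('root@mysql3-t2:3306')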

Similarly to using Haproxy as the connection layer for Percona XtraDB Cluster, health checks can be used to determine which members of the cluster reads and writes should be sent to. This is usually accomplished through an xinetd service that runs the actual health check script. Let's discuss each of the components in more detail.

Health check script

We need to create a database user with enough privileges to run the queries in the health check script. Let's create the user on the current primary:

SQL
CREATE USER clustercheck@localhost IDENTIFIED BY 'Clust3rCh3ck@' WITH MAX_USER_CONNECTIONS 10;
GRANT PROCESS ON *.* to 'clustercheck'@'localhost';
GRANT SELECT ON sys.* to 'clustercheck'@'localhost';

The queries in our sample script require a few special objects to be created in the SYS schema:

SQL
USE sys;
DELIMITER $$
CREATE FUNCTION my_id() RETURNS TEXT(36) DETERMINISTIC NO SQL RETURN (SELECT @@global.server_uuid as my_id);$$
CREATE FUNCTION IFZERO(a INT, b INT)
RETURNS INT
DETERMINISTIC
RETURN IF(a = 0, b, a)$$
CREATE FUNCTION LOCATE2(needle TEXT(10000), haystack TEXT(10000), offset INT)
RETURNS INT
DETERMINISTIC
RETURN IFZERO(LOCATE(needle, haystack, offset), LENGTH(haystack) + 1)$$
CREATE FUNCTION GTID_NORMALIZE(g TEXT(10000))
RETURNS TEXT(10000)
DETERMINISTIC
RETURN GTID_SUBTRACT(g, '')$$
CREATE FUNCTION GTID_COUNT(gtid_set TEXT(10000))
RETURNS INT
DETERMINISTIC
BEGIN
  DECLARE result BIGINT DEFAULT 0;
  DECLARE colon_pos INT;
  DECLARE next_dash_pos INT;
  DECLARE next_colon_pos INT;
  DECLARE next_comma_pos INT;
  SET gtid_set = GTID_NORMALIZE(gtid_set);
  SET colon_pos = LOCATE2(':', gtid_set, 1);
  WHILE colon_pos != LENGTH(gtid_set) + 1 DO
     SET next_dash_pos = LOCATE2('-', gtid_set, colon_pos + 1);
     SET next_colon_pos = LOCATE2(':', gtid_set, colon_pos + 1);
     SET next_comma_pos = LOCATE2(',', gtid_set, colon_pos + 1);
     IF next_dash_pos < next_colon_pos AND next_dash_pos < next_comma_pos THEN
       SET result = result +
         SUBSTR(gtid_set, next_dash_pos + 1,
                LEAST(next_colon_pos, next_comma_pos) - (next_dash_pos + 1)) -
         SUBSTR(gtid_set, colon_pos + 1, next_dash_pos - (colon_pos + 1)) + 1;
     ELSE
       SET result = result + 1;
     END IF;
     SET colon_pos = next_colon_pos;
  END WHILE;
  RETURN result;
END$$
CREATE FUNCTION gr_applier_queue_length()
RETURNS INT
DETERMINISTIC
BEGIN
  RETURN (SELECT sys.gtid_count( GTID_SUBTRACT(
    (SELECT Received_transaction_set FROM performance_schema.replication_connection_status
      WHERE Channel_name = 'group_replication_applier' ),
    (SELECT @@global.GTID_EXECUTED) )));
END$$
CREATE FUNCTION gr_member_in_primary_partition()
RETURNS VARCHAR(3)
DETERMINISTIC
BEGIN
  RETURN (SELECT IF( MEMBER_STATE='ONLINE' AND ((SELECT COUNT(*) FROM
    performance_schema.replication_group_members WHERE MEMBER_STATE NOT IN ('ONLINE', 'RECOVERING')) >=
    ((SELECT COUNT(*) FROM performance_schema.replication_group_members)/2) = 0),
    'YES', 'NO' ) FROM performance_schema.replication_group_members JOIN
    performance_schema.replication_group_member_stats USING(member_id) where member_id=my_id());
END$$
CREATE VIEW gr_member_routing_candidate_status AS SELECT
sys.gr_member_in_primary_partition() as viable_candidate,
IF( (SELECT (SELECT GROUP_CONCAT(variable_value) FROM
performance_schema.global_variables WHERE variable_name IN ('read_only',
'super_read_only')) != 'OFF,OFF'), 'YES', 'NO') as read_only,
Count_Transactions_Remote_In_Applier_Queue as transactions_behind, Count_Transactions_in_queue as 'transactions_to_cert'
from performance_schema.replication_group_member_stats where member_id=my_id();$$
DELIMITER ;
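With the objects in place, you can query the view directly on any member (for example, with the clustercheck user created earlier) to see exactly what the health check script will read:

Shell
# same query the health check script runs; the values differ per node
mysql -u clustercheck -p'Clust3rCh3ck@' -e "SELECT * FROM sys.gr_member_routing_candidate_status\G"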

Here is an example of a simple checker script that is good enough for testing purposes. For production use, you might want to develop something more robust.

Shell
tee /usr/local/bin/mysql_gr_routing_check.sh <<'EOF'
#!/bin/bash
#
# Health check script for Haproxy: reports whether this Group Replication
# member is a viable routing candidate for reads or writes.
#
# Author: Frederic -lefred- Descamps
# Based on the original script from Unai Rodriguez and later work by Olaf van Zandwijk and Raghavendra Prabhu
#
# version 0.1 - first release
# version 0.2 - add read & write check + queue length check
#
# usage: mysql_gr_routing_check.sh <maxqueue> <READ|WRITE>
#
# This password method is insecure and should not be used in a production environment!
MYSQL_USERNAME="clustercheck"
MYSQL_PASSWORD="Clust3rCh3ck@"
MYSQL_HOST=localhost
MYSQL_PORT=3306
MAXQUEUE=${1-100}
ROLE=${2-WRITE}

if ! [ "$MAXQUEUE" -eq "$MAXQUEUE" ] 2>/dev/null; then
    # Member is not a viable routing candidate => return HTTP 503
    # Shell return-code is 1
    echo -en "HTTP/1.1 503 Service Unavailable\r\n"
    echo -en "Content-Type: text/plain\r\n"
    echo -en "Connection: close\r\n"
    echo -en "Content-Length: 44\r\n"
    echo -en "\r\n"
    echo -en "Group Replication member is not a viable routing candidate:\r\n"
    echo -en "maxqueue argument is not a valid value: ($MAXQUEUE).\r\n"
    exit 1
fi

echo $(mysql --no-defaults -BN --connect-timeout=10 --host=$MYSQL_HOST --port=$MYSQL_PORT --user="$MYSQL_USERNAME" --password="$MYSQL_PASSWORD" -e 'SELECT * FROM sys.gr_member_routing_candidate_status' 2>/dev/null) | while read candidate readonly queue tx_to_cert
do
   if [ "$candidate" == "YES" ]
   then
      if [ "${ROLE^^}" == "READ" ]
      then
         if [ $queue -lt $MAXQUEUE ]
         then
            # Member is a viable routing candidate => return HTTP 200
            # Shell return-code is 0
            echo -en "HTTP/1.1 200 OK\r\n"
            echo -en "Content-Type: text/plain\r\n"
            echo -en "Connection: close\r\n"
            echo -en "Content-Length: 40\r\n"
            echo -en "\r\n"
            echo -en "Group Replication member is a viable routing candidate for $ROLE.\r\n"
            exit 0
         else
            # Member is not a viable routing candidate => return HTTP 503
            # Shell return-code is 1
            echo -en "HTTP/1.1 503 Service Unavailable\r\n"
            echo -en "Content-Type: text/plain\r\n"
            echo -en "Connection: close\r\n"
            echo -en "Content-Length: 44\r\n"
            echo -en "\r\n"
            echo -en "Group Replication member is not a viable routing candidate:\r\n"
            echo -en "queue exceeds ($queue) threshold ($MAXQUEUE).\r\n"
            exit 1
         fi
      elif [ "${ROLE^^}" == "WRITE" ]
      then
         if [ "$readonly" == "YES" ]
         then
            # Member is not a viable routing candidate => return HTTP 503
            # Shell return-code is 1
            echo -en "HTTP/1.1 503 Service Unavailable\r\n"
            echo -en "Content-Type: text/plain\r\n"
            echo -en "Connection: close\r\n"
            echo -en "Content-Length: 44\r\n"
            echo -en "\r\n"
            echo -en "Group Replication member is not a viable routing candidate:\r\n"
            echo -en "$ROLE cannot be routed to a readonly member.\r\n"
            exit 1
         fi
         if [ $queue -lt $MAXQUEUE ]
         then
            # Member is a viable routing candidate => return HTTP 200
            # Shell return-code is 0
            echo -en "HTTP/1.1 200 OK\r\n"
            echo -en "Content-Type: text/plain\r\n"
            echo -en "Connection: close\r\n"
            echo -en "Content-Length: 40\r\n"
            echo -en "\r\n"
            echo -en "Group Replication member is a viable routing candidate for $ROLE.\r\n"
            exit 0
         else
            # Member is not a viable routing candidate => return HTTP 503
            # Shell return-code is 1
            echo -en "HTTP/1.1 503 Service Unavailable\r\n"
            echo -en "Content-Type: text/plain\r\n"
            echo -en "Connection: close\r\n"
            echo -en "Content-Length: 44\r\n"
            echo -en "\r\n"
            echo -en "Group Replication member is not a viable routing candidate:\r\n"
            echo -en "queue exceeds ($queue) threshold ($MAXQUEUE).\r\n"
            exit 1
         fi
      else
         # Member is not a viable routing candidate => return HTTP 503
         # Shell return-code is 1
         echo -en "HTTP/1.1 503 Service Unavailable\r\n"
         echo -en "Content-Type: text/plain\r\n"
         echo -en "Connection: close\r\n"
         echo -en "Content-Length: 44\r\n"
         echo -en "\r\n"
         echo -en "Group Replication member is not a viable routing candidate:\r\n"
         echo -en "$ROLE is not a valid argument.\r\n"
         exit 1
      fi
   else
      # Member is not ONLINE or not in the primary partition => return HTTP 503
      # Shell return-code is 1
      echo -en "HTTP/1.1 503 Service Unavailable\r\n"
      echo -en "Content-Type: text/plain\r\n"
      echo -en "Connection: close\r\n"
      echo -en "Content-Length: 44\r\n"
      echo -en "\r\n"
      echo -en "Group Replication member is not a viable routing candidate.\r\n"
      exit 1
   fi
done
EOF

Don’t forget to give the script execute permissions:

Shell
chmod +x /usr/local/bin/mysql_gr_routing_check.sh
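Before wiring the script into xinetd, you can run it by hand on a node, passing the maximum applier queue length and the role to check:

Shell
# check write availability with a maximum queue length of 100
/usr/local/bin/mysql_gr_routing_check.sh 100 WRITE

# check read availability
/usr/local/bin/mysql_gr_routing_check.sh 100 READ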

Xinetd service

We need to deploy a custom xinetd service to expose the state of MySQL on each node. We deploy one service using port 6446 to check for the write availability of a node and a service on port 6447 to check if the node is available for reads.

First, install the xinetd package, e.g.,

Shell
yum -y install xinetd

Now we prepare the definitions of the xinetd services:

Shell
tee /etc/xinetd.d/mysql_gr_routing_check_write <<EOF
# default: on
# description: check to see if the node is a viable routing candidate
service mysql_gr_routing_check_write
{
    disable = no
    flags = REUSE
    socket_type = stream
    port = 6446
    wait = no
    user = mysql
    server = /usr/local/bin/mysql_gr_routing_check.sh
    server_args = 100 write
    log_on_failure += USERID
    per_source = UNLIMITED
}
EOF
Shell
tee /etc/xinetd.d/mysql_gr_routing_check_read <<EOF
# default: on
# description: check to see if the node is a viable routing candidate
service mysql_gr_routing_check_read
{
    disable = no
    flags = REUSE
    socket_type = stream
    port = 6447
    wait = no
    user = mysql
    server = /usr/local/bin/mysql_gr_routing_check.sh
    server_args = 100 read
    log_on_failure += USERID
    per_source = UNLIMITED
}
EOF

Start the service:

Shell
service xinetd start
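On systemd-based distributions you may need systemctl enable --now xinetd instead. Either way, it is worth confirming that xinetd is now listening on both check ports:

Shell
# xinetd should be listening on the write (6446) and read (6447) check ports
ss -ltn | grep -E ':(6446|6447)'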

Testing the service

We can verify our service is working by using telnet (or curl) from a remote host. For example, to check if a node is available as a writer, we query the service on port 6446:

Shell
# telnet mysql1-t2 6446
Trying 10.11.100.133...
Connected to mysql1-t2.
Escape character is '^]'.
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 44
Group Replication member is not a viable routing candidate:
write cannot be routed to a readonly member.
Connection closed by foreign host.

We can also verify if a node can be a reader using port 6447:

Shell
# telnet mysql1-t2 6447
Trying 10.11.100.133...
Connected to mysql1-t2.
Escape character is '^]'.
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40
Group Replication member is a viable routing candidate for read.
Connection closed by foreign host.
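Since the xinetd services speak plain HTTP, curl can also be used for the same checks, which is convenient for quick loops across all the nodes; only the status line (200 or 503) matters for a quick verification:

Shell
# write check: expect 503 on read-only members, 200 only on the current primary
curl -si http://mysql1-t2:6446/

# read check: expect 200 on any healthy member
curl -si http://mysql1-t2:6447/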

Haproxy configuration

Haproxy needs to be configured with two dedicated endpoints for reads and writes, respectively. The HTTP checks defined will query our custom xinetd services to check for a node’s read/write availability. 

Here’s an example of the haproxy configuration:

Shell
frontend mysql-gr-front_write
    bind *:3307
    mode tcp
    default_backend mysql-gr-back_write

backend mysql-gr-back_write
    mode tcp
    balance leastconn
    option httpchk
    server mysql1-t1 mysql1-t1:3306 check port 6446 inter 1000 rise 1 fall 2 on-marked-up shutdown-backup-sessions
    server mysql2-t1 mysql2-t1:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql3-t1 mysql3-t1:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql1-t2 mysql1-t2:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql2-t2 mysql2-t2:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql3-t2 mysql3-t2:3306 check port 6446 inter 1000 rise 1 fall 2 backup

frontend mysql-gr-front_read
    bind *:3308
    mode tcp
    default_backend mysql-gr-back_read

backend mysql-gr-back_read
    mode tcp
    balance leastconn
    option httpchk
    server mysql1-t1 mysql1-t1:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql2-t1 mysql2-t1:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql3-t1 mysql3-t1:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql1-t2 mysql1-t2:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql2-t2 mysql2-t2:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql3-t2 mysql3-t2:3306 check port 6447 inter 1000 rise 1 fall 2

frontend stats
    mode http
    bind *:443
    stats enable
    stats uri /stats
    stats refresh 10s
    stats admin if LOCALHOST
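After editing the configuration, it's good practice to validate it and reload Haproxy. The commands below assume the default /etc/haproxy/haproxy.cfg path and a systemd-managed service; adjust for your distribution:

Shell
# validate the configuration file before reloading
haproxy -c -f /etc/haproxy/haproxy.cfg

# apply the changes
systemctl reload haproxy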

Connecting our application

Here’s an example of how the application would connect to the database through Haproxy to write:

mysql -h haproxy1 -P3307 -u testuser -p'testpwd'

For read-only connections, it would use the alternative port as configured above:

mysql -h haproxy1 -P3308 -u testuser -p'testpwd'

Closing thoughts

Using this architecture, both local and regional failover can be triggered through MySQL Shell. Haproxy will automatically adjust the writer/reader nodes without human intervention. 
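As a sketch, reusing the illustrative cluster names from the deployment example above, a regional switchover to the DR site, or a local primary change within a site, would be driven from MySQL Shell roughly like this:

Shell
mysqlsh root@mysql1-t1

// regional failover: make the DR cluster the primary cluster of the ClusterSet
cs = dba.getClusterSet()
cs.setPrimaryCluster('site2')

// local failover: promote another member of the current cluster
cluster = dba.getCluster()
cluster.setPrimaryInstance('mysql2-t1:3306')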

We can also take individual nodes out of the pool by simply stopping the MySQL service or pausing group replication.
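For example, stopping group replication on a member is enough for the health check to report it as not viable (HTTP 503), so Haproxy stops routing traffic to it; restarting group replication brings it back into the pool:

Shell
# on the node to be drained
mysql -e "STOP GROUP_REPLICATION;"

# to put the node back into rotation
mysql -e "START GROUP_REPLICATION;"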

Hopefully, in the future, MySQL Router will offer performance similar to other routing solutions, so that we can take advantage of its integration with the rest of the InnoDB Cluster components.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Try Percona Distribution for MySQL today!
