9

Hidden Cost of Foreign Key Constraints in MySQL

 2 years ago
source link: https://www.percona.com/blog/hidden-cost-of-foreign-key-constraints-in-mysql/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Hidden Cost of Foreign Key Constraints in MySQLDo you wonder if MySQL tells you the truth about writes to tables that have foreign key constraints? The situation is complex, and getting visibility on what is really happening can be a problem.

I found this issue intriguing and decided to share and highlight some examples.

Query Plan

Let us take this example table:

MySQL
CREATE TABLE `product` (
  `category` int NOT NULL,
  `id` int NOT NULL,
  `price` decimal(10,0) DEFAULT NULL,
  PRIMARY KEY (`category`,`id`)
) ENGINE=InnoDB;

We want to know how costly an example UPDATE against this table will be:

MySQL
mysql > EXPLAIN update product set id=id+1 where id=65032158 and category=3741760\G
*************************** 1. row ***************************
  select_type: UPDATE
        table: product
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

This shows that just one table and one row will be considered. This looks right as we’re using a primary key lookup, and one row matches:

MySQL
mysql > select * from product where id=65032158;
+----------+----------+-------+
| category | id       | price |
+----------+----------+-------+
|  3741760 | 65032158 |     2 |
+----------+----------+-------+
1 row in set (0.02 sec)

In our case, this was not the entire truth as the table has FK relationships:

MySQL
mysql > SELECT TABLE_NAME,COLUMN_NAME,CONSTRAINT_NAME, REFERENCED_TABLE_NAME,REFERENCED_COLUMN_NAME from INFORMATION_SCHEMA.KEY_COLUMN_USAGE where REFERENCED_TABLE_NAME='product' AND REFERENCED_TABLE_SCHEMA='db1'\G
*************************** 1. row ***************************
            TABLE_NAME: product_order
           COLUMN_NAME: product_category
       CONSTRAINT_NAME: product_order_ibfk_1
 REFERENCED_TABLE_NAME: product
REFERENCED_COLUMN_NAME: category
*************************** 2. row ***************************
            TABLE_NAME: product_order
           COLUMN_NAME: product_id
       CONSTRAINT_NAME: product_order_ibfk_1
 REFERENCED_TABLE_NAME: product
REFERENCED_COLUMN_NAME: id
2 rows in set (0.01 sec)

The related table has a defined ON UPDATE CASCADE action linked to our table:

MySQL
CREATE TABLE `product_order` (
  `no` int NOT NULL AUTO_INCREMENT,
  `product_category` int NOT NULL,
  `product_id` int NOT NULL,
  `customer_id` int NOT NULL,
  PRIMARY KEY (`no`),
  KEY `product_category` (`product_category`,`product_id`),
  KEY `customer_id` (`customer_id`),
  CONSTRAINT `product_order_ibfk_1` FOREIGN KEY (`product_category`, `product_id`) REFERENCES `product` (`category`, `id`) ON DELETE RESTRICT ON UPDATE CASCADE,
  CONSTRAINT `product_order_ibfk_2` FOREIGN KEY (`customer_id`) REFERENCES `customer` (`id`)
) ENGINE=InnoDB;

Therefore, the EXPLAIN plan completely fails to recognize this fact, and the plan tries to convince us that such an update will only change one row in our database.

Another typical method to analyze slow queries is checking the per-session status handlers. In this case, it looks like this:

MySQL
mysql > flush status; update product set id=id+1 where id=65032158 and category=3741760\G
Query OK, 0 rows affected (0.00 sec)
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql > show status like 'handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 2     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 2     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 2     |
| Handler_read_first         | 0     |
| Handler_read_key           | 1     |
| Handler_read_last          | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 1     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.01 sec)

Handler_update, as well as the Rows Changed information in the query outcome output, are not taking changes in referencing product_order table into account either. Without additional checks, we don’t even know that additional work has been done! 

Monitoring

Let’s see how foreign constraints can impact monitoring database activities.

We already know monitoring Handler_update won’t work as expected. Let’s check the InnoDB engine-related counter (on an idle MySQL instance as this is a global only counter):

MySQL
mysql > show status like 'Innodb_rows_updated';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Innodb_rows_updated | 21369 |
+---------------------+-------+
1 row in set (0.00 sec)

Here is our UPDATE and how many data rows it really changes:

MySQL
mysql > select product_id,count(*) from product_order where product_id in (65032159,65032160) group by product_id;
+------------+----------+
| product_id | count(*) |
+------------+----------+
|   65032159 |      897 |
+------------+----------+
1 row in set (0.02 sec)
mysql > update product set id=id+1 where id=65032159 and category=3741760\G
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql > select product_id,count(*) from product_order where product_id in (65032159,65032160) group by product_id;
+------------+----------+
| product_id | count(*) |
+------------+----------+
|   65032160 |      897 |
+------------+----------+
1 row in set (0.01 sec)
mysql > show status like 'Innodb_rows_updated';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Innodb_rows_updated | 22267 |
+---------------------+-------+
1 row in set (0.00 sec)
mysql > select 22267-21369;
+-------------+
| 22267-21369 |
+-------------+
|         898 |
+-------------+
1 row in set (0.00 sec)

This confirms that InnoDB exposes the real number of rows updated correctly here.

Child table locks are also exposed for active transactions with SHOW ENGINE INNODB STATUS (after enabling innodb_status_output_locks).

What about Performance Schema, another popular method to monitor the database? 

MySQL
mysql > truncate `performance_schema`.`table_io_waits_summary_by_table`;
Query OK, 0 rows affected (0.00 sec)
mysql > select OBJECT_NAME,COUNT_UPDATE,COUNT_WRITE,COUNT_READ,COUNT_DELETE,COUNT_INSERT from `performance_schema`.`table_io_waits_summary_by_table` where OBJECT_NAME like "prod%";
+---------------+--------------+-------------+------------+--------------+--------------+
| OBJECT_NAME   | COUNT_UPDATE | COUNT_WRITE | COUNT_READ | COUNT_DELETE | COUNT_INSERT |
+---------------+--------------+-------------+------------+--------------+--------------+
| product       |            0 |           0 |          0 |            0 |            0 |
| product_order |            0 |           0 |          0 |            0 |            0 |
+---------------+--------------+-------------+------------+--------------+--------------+
2 rows in set (0.00 sec)
mysql > update product set id=id+1 where id=65032159 and category=3741760\G
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql > select OBJECT_NAME,COUNT_UPDATE,COUNT_WRITE,COUNT_READ,COUNT_DELETE,COUNT_INSERT from `performance_schema`.`table_io_waits_summary_by_table` where OBJECT_NAME like "prod%";
+---------------+--------------+-------------+------------+--------------+--------------+
| OBJECT_NAME   | COUNT_UPDATE | COUNT_WRITE | COUNT_READ | COUNT_DELETE | COUNT_INSERT |
+---------------+--------------+-------------+------------+--------------+--------------+
| product       |            1 |           1 |          1 |            0 |            0 |
| product_order |            0 |           0 |          0 |            0 |            0 |
+---------------+--------------+-------------+------------+--------------+--------------+
2 rows in set (0.00 sec)
mysql > select product_id,count(*) from product_order where product_id in (65032159,65032160) group by product_id;
+------------+----------+
| product_id | count(*) |
+------------+----------+
|   65032160 |      897 |
+------------+----------+
1 row in set (0.02 sec)
mysql > select OBJECT_NAME,COUNT_UPDATE,COUNT_WRITE,COUNT_READ,COUNT_DELETE,COUNT_INSERT from `performance_schema`.`table_io_waits_summary_by_table` where OBJECT_NAME like "prod%";
+---------------+--------------+-------------+------------+--------------+--------------+
| OBJECT_NAME   | COUNT_UPDATE | COUNT_WRITE | COUNT_READ | COUNT_DELETE | COUNT_INSERT |
+---------------+--------------+-------------+------------+--------------+--------------+
| product       |            1 |           1 |          1 |            0 |            0 |
| product_order |            0 |           0 |      54028 |            0 |            0 |
+---------------+--------------+-------------+------------+--------------+--------------+
2 rows in set (0.00 sec)

Unfortunately, Performance Schema completely missed what happened due to the Foreign Key constraint! I think it is a serious issue and so I reported it here: https://bugs.mysql.com/bug.php?id=106012

Summary

I hope I was able to draw your attention to the need to exercise caution when investigating DML queries and system load when Foreign Key constraints are used! Maybe you were surprised that a simple single row update or delete required so much time? It could be that under the hood MySQL changed thousands of rows and hid this fact from you!


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK