4

MySQL 8.0 Functional Indexes

 2 years ago
source link: https://www.percona.com/blog/mysql-8-0-functional-indexes/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client
Percona Database Performance Blog

MySQL 8.0 Functional IndexesWorking with hundreds of different customers I often face similar problems around running queries. One very common problem when trying to optimize a database environment is index usage. A query that cannot use an index is usually a long-running one, consuming more memory or triggering more disk iops.

A very common case is when a query uses a filter condition against a column that is involved in some kind of functional expression. An index on that column can not be used.

Starting from MySQL 8.0.13 functional indexes are supported. In this article, I’m going to show what they are and how they work.

The Well-Known Problem

As already mentioned, a very common problem about index usage is when you have a filter condition against one or more columns involved in some kind of functional expression.

Let’s see a simple example.

You have a table called products containing the details of your products, including a create_time TIMESTAMP column. If you would like to calculate the average price of your products on a specific month you could do the following:

Shell
mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+

The query returns the right value, but take a look at the EXPLAIN:

Shell
mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 99015
     filtered: 100.00
        Extra: Using where

The query triggers a full scan of the table. Let’s create an index on create_time and check again:

Shell
mysql> ALTER TABLE products ADD INDEX(create_time);
Query OK, 0 rows affected (0.71 sec)
Records: 0  Duplicates: 0  Warnings: 0
Shell
mysql> explain SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 99015
     filtered: 100.00
        Extra: Using where

A full scan again. The index we have created is not effective. Indeed any time an indexed column is involved in a function the index can not be used.

To optimize the query the workaround is rewriting it differently in order to isolate the indexed column from the function.

Let’s test the following equivalent query:

Shell
mysql> SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01';
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+
Shell
mysql> EXPLAIN SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01'\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: range
possible_keys: create_time
          key: create_time
      key_len: 5
          ref: NULL
         rows: 182
     filtered: 100.00
        Extra: Using index condition

Cool, now the index is used. Then rewriting the query was the typical suggestion.

Quite a simple solution, but not all the times it was possible to change the application code for many valid reasons. So, what to do then?

MySQL 8.0 Functional Indexes

Starting from version 8.0.13, MySQL supports functional indexes. Instead of indexing a simple column, you can create the index on the result of any function applied to a column or multiple columns.

Long story short, now you can do the following:

Shell
mysql> ALTER TABLE products ADD INDEX((MONTH(create_time)));
Query OK, 0 rows affected (0.74 sec)
Records: 0  Duplicates: 0  Warnings: 0

Be aware of the double parentheses. The syntax is correct since the expression must be enclosed within parentheses to distinguish it from columns or column prefixes.

Indeed the following returns an error:

Shell
mysql> ALTER TABLE products ADD INDEX(MONTH(create_time));
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'create_time))' at line 1

Let’s check now our original query and see what happens to the EXPLAIN

Shell
mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+
Shell
mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ref
possible_keys: functional_index
          key: functional_index
      key_len: 5
          ref: const
         rows: 182
     filtered: 100.00
        Extra: NULL

The query is no longer a full scan and runs faster. The functional_index has been used, with only 182 rows examined. Awesome.

Thanks to the functional index we are no longer forced to rewrite the query.

Which Functional Indexes are Permitted

We have seen an example involving a simple function applied to a column, but you are granted to create more complex indexes.

A functional index may contain any kind of expressions, not only a single function. The following patterns are valid functional indexes:

INDEX( ( col1 + col2 ) )
INDEX( ( FUNC(col1) + col2 – col3 ) )

You can use ASC or DESC as well:

INDEX( ( MONTH(col1) ) DESC )

You can have multiple functional parts, each one included in parentheses:

INDEX( ( col1 + col2 ), ( FUNC(col2) ) )

You can mix functional with nonfunctional parts:

INDEX( (FUNC(col1)), col2, (col2 + col3), col4 )

There are also limitations you should be aware of:

  • A functional key can not contain a single column. The following is not permitted:
    INDEX( (col1), (col2) )
  • The primary key can not include a functional key part
  • The foreign key can not include a functional key part
  • SPATIAL and FULLTEXT indexes can not include functional key parts
  • A functional key part can not refer to a column prefix

At last, remember that the functional index is useful only to optimize the query that uses the exact same expression. An index created with nonfunctional parts can be used instead to solve multiple different queries.

For example, the following conditions can not rely on the functional index we have created:

WHERE YEAR(create_time) = 2019

WHERE create_time > ‘2019-10-01’

WHERE create_time BETWEEN ‘2019-10-01’ AND ‘2019-11-01’

WHERE MONTH(create_time+INTERVAL 1 YEAR)

All these will trigger a full scan.

Functional Index Internal

The functional indexes are implemented as hidden virtual generated columns. For this reason, you can emulate the same behavior even on MySQL 5.7 by explicitly creating the virtual column. We can test this, starting by dropping the indexes we have created so far.

Shell
mysql> SHOW CREATE TABLE products\G
*************************** 1. row ***************************
       Table: products
Create Table: CREATE TABLE `products` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `description` longtext,
  `price` decimal(8,2) DEFAULT NULL,
  `create_time` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `create_time` (`create_time`),
  KEY `functional_index` ((month(`create_time`)))
) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Shell
mysql> ALTER TABLE products DROP INDEX `create_time`, DROP INDEX `functional_index`;
Query OK, 0 rows affected (0.03 sec)

We can try now to create the virtual generated column:

Shell
mysql> ALTER TABLE products ADD COLUMN create_month TINYINT GENERATED ALWAYS AS (MONTH(create_time)) VIRTUAL;
Query OK, 0 rows affected (0.04 sec)

Create the index on the virtual column:

Shell
mysql> ALTER TABLE products ADD INDEX(create_month);
Query OK, 0 rows affected (0.55 sec)
Shell
mysql> SHOW CREATE TABLE products\G
*************************** 1. row ***************************
       Table: products
Create Table: CREATE TABLE `products` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `description` longtext,
  `price` decimal(8,2) DEFAULT NULL,
  `create_time` timestamp NULL DEFAULT NULL,
  `create_month` tinyint GENERATED ALWAYS AS (month(`create_time`)) VIRTUAL,
  PRIMARY KEY (`id`),
  KEY `create_month` (`create_month`)
) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

We can now try our original query. We expect to see the same behavior as the functional index.

Shell
mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+
Shell
mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ref
possible_keys: create_month
          key: create_month
      key_len: 2
          ref: const
         rows: 182
     filtered: 100.00
        Extra: NULL

Indeed, the behavior is the same. The index on the virtual column can be used and the query is optimized.

The good news is that you can use this workaround to emulate a functional index even on 5.7, getting the same benefits. The advantage of MySQL 8.0 is that it is completely transparent, no need to create the virtual column.

Since the functional index is implemented as a hidden virtual column, there is no additional space needed for the data, only the index space will be added to the table.

By the way, this is the same technique used for creating indexes on JSON documents’ fields.

Conclusion

The functional index support is an interesting improvement you can find in MySQL 8.0. Some of the queries that required rewriting to get optimized don’t require that anymore. Just remember that only the queries having the same filter pattern can rely on the functional index. Then you need to create additional indexes or other functional indexes to improve other search patterns.

The same feature can be implemented on MySQL 5.7 with the explicit creation of a virtual generated column and the index.

For more detailed information, read the following page:

https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-functional-key-parts


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK