Automatic update of InnoDB Persistent Statistics Never Triggers
source link: https://mysql.wisborg.dk/2021/08/22/automatic-update-of-innodb-persistent-statistics-never-triggers/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Automatic update of InnoDB Persistent Statistics Never Triggers
Since MySQL 5.6 InnoDB has supported persistent index statistics. This means that when you restart MySQL, InnoDB does not have to recalculate the statistics for the tables, rather it can read the statistics from its persistent storage. This has several advantages over the transient statistics, but as it turns out, there is also a catch: MySQL may under some circumstances never get around to update the index statistics. This particularly affects instances that are restarted frequently and tables with a large number of rows.
Information
Persistent statistics are the default, so unless you have explicitly chosen not to use them, then you are.
Background
Before I dive into why there are scenarios where persistent statistics are never updated, it is necessary to recap how persistent statistics work. Every time the statistics are updated, the result is stored in the mysql.innodb_index_stats
table with the index statistic of the primary key doubling as table statistics in mysql.innodb_table_stats
. These tables are normal InnoDB tables, so you can query them to learn what information InnoDB stores. For example using the world.city
table (the world
database can be downloaded from https://dev.mysql.com/doc/index-other.html):
mysql> SHOW
CREATE
TABLE
world.city\G
*************************** 1. row ***************************
Table
: city
Create
Table
:
CREATE
TABLE
`city` (
`ID`
int
NOT
NULL
AUTO_INCREMENT,
`
Name
`
char
(35)
NOT
NULL
DEFAULT
''
,
`CountryCode`
char
(3)
NOT
NULL
DEFAULT
''
,
`District`
char
(20)
NOT
NULL
DEFAULT
''
,
`Population`
int
NOT
NULL
DEFAULT
'0'
,
PRIMARY
KEY
(`ID`),
KEY
`CountryCode` (`CountryCode`),
CONSTRAINT
`city_ibfk_1`
FOREIGN
KEY
(`CountryCode`)
REFERENCES
`country` (`Code`)
) ENGINE=InnoDB AUTO_INCREMENT=4080
DEFAULT
CHARSET=utf8mb4
COLLATE
=utf8mb4_0900_ai_ci
1 row
in
set
(0.0019 sec)
mysql>
SELECT
*
FROM
mysql.innodb_table_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
;
+
---------------+------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update | n_rows | clustered_index_size | sum_of_other_index_sizes |
+
---------------+------------+---------------------+--------+----------------------+--------------------------+
| world | city | 2021-08-22 12:03:00 | 4046 | 25 | 7 |
+
---------------+------------+---------------------+--------+----------------------+--------------------------+
1 row
in
set
(0.0007 sec)
mysql>
SELECT
*
FROM
mysql.innodb_index_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
;
+
---------------+------------+-------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
+
---------------+------------+-------------+---------------------+--------------+------------+-------------+-----------------------------------+
| world | city | CountryCode | 2021-08-22 12:03:00 | n_diff_pfx01 | 232 | 6 | CountryCode |
| world | city | CountryCode | 2021-08-22 12:03:00 | n_diff_pfx02 | 4079 | 6 | CountryCode,ID |
| world | city | CountryCode | 2021-08-22 12:03:00 | n_leaf_pages | 6 |
NULL
| Number
of
leaf pages
in
the
index
|
| world | city | CountryCode | 2021-08-22 12:03:00 |
size
| 7 |
NULL
| Number
of
pages
in
the
index
|
| world | city |
PRIMARY
| 2021-08-22 12:03:00 | n_diff_pfx01 | 4046 | 20 | ID |
| world | city |
PRIMARY
| 2021-08-22 12:03:00 | n_leaf_pages | 24 |
NULL
| Number
of
leaf pages
in
the
index
|
| world | city |
PRIMARY
| 2021-08-22 12:03:00 |
size
| 25 |
NULL
| Number
of
pages
in
the
index
|
+
---------------+------------+-------------+---------------------+--------------+------------+-------------+-----------------------------------+
7
rows
in
set
(0.0362 sec)
The exact values may differ for you as the calculation of index statistics is not exact.
Notice how the index on the CountryCode
column includes the ID
column. InnoDB organises the rows according to the clustered index (“index-organised table” in Oracle DB terms) using the primary key if present as the clustered index. Since the primary key is what is used to locate the rows, it is required as being part of the index, so for non-unique secondary indexes, InnoDB exposes the primary key at the end of the index, so the optimiser can take advantage of it
Important for this discussion is that InnoDB automatically updates the index statistic in the background every time 10% of the table has been modified, though only changes that actually has potential to change the statistics are counted. For example, if you with the default table definition update the population, you can update the entire table without triggering an update of the index statistics as there is no index on the Population
column. To avoid that the index statistics get updated continuously on small tables, the index statistics calculation can at most be triggered every 10 seconds for a table.
The catch is that the number of rows that have been updates since the last update of the index statistics is stored in memory and not persisted when you restart MySQL. This means that if you restart MySQL often enough compared to the size of your tables, then the index statistics will never get updated. Once scenario where this is particularly likely to occur is for instances that are regularly shut down to create cold backups. In that case you need to be particularly careful as instances restored from the backups will come up with out of date index statistics which can drastically impact performance. However, even for instances that are rarely restarted, you may encounter the issue on tables with a very large number of rows; consider a table with 10 billion rows, then the automatic update of the index statistics only triggers every time 1 billion rows have been updated which may not happen all that frequently.
Danger
If you regularly shut down MySQL to create cold backups, you must update the index statistics manually before bringing an instance into rotation after restoring a backup. Otherwise that instance may have very poor index statistics causing severe performance issues. More on this below.
Let us look at an example to see how this works in practice.
Book
The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.
Example
In order to illustrate how the background updates of index statistics work, let us look at an example. We will again use the world.city
table as it is small enough to easily trigger the automatic updates. We will add an index to the Population
column as the population is easy to update:
mysql>
ALTER
TABLE
world.city
ADD
INDEX
(Population);
Query OK, 0
rows
affected (0.9746 sec)
Records: 0 Duplicates: 0 Warnings: 0
The number of rows in the city
table for China and India combines to cover more than 10% of the table while each country on its own has less than 10% of the cities:
mysql>
SELECT
CountryCode,
COUNT
(*)
FROM
world.city
WHERE
CountryCode
IN
(
'CHN'
,
'IND'
)
GROUP
BY
CountryCode;
+
-------------+----------+
| CountryCode |
COUNT
(*) |
+
-------------+----------+
| CHN | 363 |
| IND | 341 |
+
-------------+----------+
2
rows
in
set
(0.0013 sec)
Thus we can update the population of the cities in China and India to trigger an update of the index statistics (but remember there must be a 10 second delay between triggering recalculations of the index statistics).
No Restart
First, we will try to update the population of the Chinese and Indian cities without a restart in between updating the cities of each country:
mysql>
SELECT
last_update, stat_name
FROM
mysql.innodb_index_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
AND
index_name =
'Population'
;
+
---------------------+--------------+
| last_update | stat_name |
+
---------------------+--------------+
| 2021-08-22 12:49:25 | n_diff_pfx01 |
| 2021-08-22 12:49:25 | n_diff_pfx02 |
| 2021-08-22 12:49:25 | n_leaf_pages |
| 2021-08-22 12:49:25 |
size
|
+
---------------------+--------------+
4
rows
in
set
(0.0010 sec)
mysql>
UPDATE
world.city
SET
Population = Population * 1.10
WHERE
CountryCode =
'CHN'
;
Query OK, 363
rows
affected (0.4259 sec)
Rows
matched: 363 Changed: 363 Warnings: 0
mysql>
SELECT
last_update, stat_name
FROM
mysql.innodb_index_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
AND
index_name =
'Population'
;
+
---------------------+--------------+
| last_update | stat_name |
+
---------------------+--------------+
| 2021-08-22 12:49:25 | n_diff_pfx01 |
| 2021-08-22 12:49:25 | n_diff_pfx02 |
| 2021-08-22 12:49:25 | n_leaf_pages |
| 2021-08-22 12:49:25 |
size
|
+
---------------------+--------------+
4
rows
in
set
(0.0007 sec)
mysql>
UPDATE
world.city
SET
Population = Population * 1.10
WHERE
CountryCode =
'IND'
;
Query OK, 341
rows
affected (0.1024 sec)
Rows
matched: 341 Changed: 341 Warnings: 0
mysql>
SELECT
last_update, stat_name
FROM
mysql.innodb_index_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
AND
index_name =
'Population'
;
+
---------------------+--------------+
| last_update | stat_name |
+
---------------------+--------------+
| 2021-08-22 12:50:29 | n_diff_pfx01 |
| 2021-08-22 12:50:29 | n_diff_pfx02 |
| 2021-08-22 12:50:29 | n_leaf_pages |
| 2021-08-22 12:50:29 |
size
|
+
---------------------+--------------+
4
rows
in
set
(0.0012 sec)
The result is the expected that InnoDB updates the index statistics after the cities of both countries have been updated.
With Restart
Now let us try the same example, but add a restart between updating the cities of China and the cities of India:
mysql> ANALYZE
TABLE
world.city;
+
------------+---------+----------+----------+
|
Table
| Op | Msg_type | Msg_text |
+
------------+---------+----------+----------+
| world.city | analyze | status | OK |
+
------------+---------+----------+----------+
1 row
in
set
(0.1527 sec)
mysql>
SELECT
last_update, stat_name
FROM
mysql.innodb_index_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
AND
index_name =
'Population'
;
+
---------------------+--------------+
| last_update | stat_name |
+
---------------------+--------------+
| 2021-08-22 12:51:48 | n_diff_pfx01 |
| 2021-08-22 12:51:48 | n_diff_pfx02 |
| 2021-08-22 12:51:48 | n_leaf_pages |
| 2021-08-22 12:51:48 |
size
|
+
---------------------+--------------+
4
rows
in
set
(0.0008 sec)
mysql>
UPDATE
world.city
SET
Population = Population * 1.10
WHERE
CountryCode =
'CHN'
;
Query OK, 363
rows
affected (0.1065 sec)
Rows
matched: 363 Changed: 363 Warnings: 0
mysql> RESTART;
Query OK, 0
rows
affected (0.0011 sec)
The
global
session got disconnected..
Attempting
to
reconnect
to
'mysqlx://myuser@localhost:33060/world'
..
The
global
session was successfully reconnected.
mysql>
SELECT
last_update, stat_name
FROM
mysql.innodb_index_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
AND
index_name =
'Population'
;
+
---------------------+--------------+
| last_update | stat_name |
+
---------------------+--------------+
| 2021-08-22 12:51:48 | n_diff_pfx01 |
| 2021-08-22 12:51:48 | n_diff_pfx02 |
| 2021-08-22 12:51:48 | n_leaf_pages |
| 2021-08-22 12:51:48 |
size
|
+
---------------------+--------------+
4
rows
in
set
(0.0438 sec)
mysql>
UPDATE
world.city
SET
Population = Population * 1.10
WHERE
CountryCode =
'IND'
;
Query OK, 341
rows
affected (0.1299 sec)
Rows
matched: 341 Changed: 341 Warnings: 0
mysql>
SELECT
last_update, stat_name
FROM
mysql.innodb_index_stats
WHERE
database_name =
'world'
AND
table_name =
'city'
AND
index_name =
'Population'
;
+
---------------------+--------------+
| last_update | stat_name |
+
---------------------+--------------+
| 2021-08-22 12:51:48 | n_diff_pfx01 |
| 2021-08-22 12:51:48 | n_diff_pfx02 |
| 2021-08-22 12:51:48 | n_leaf_pages |
| 2021-08-22 12:51:48 |
size
|
+
---------------------+--------------+
4
rows
in
set
(0.0007 sec)
This examples starts by analysing the world.city
table as otherwise some of the rows updated in the previous example would have counted towards the 10% for the next calculation of the index statistics. (The trigger of the automatic updates does not happen at statement boundaries but rather as soon as the 10% has been reached.) The RESTART
command (new in MySQL 8) is used to restart MySQL.
Advice
The RESTART
command only works if MySQL is run under an “angel” process such as systemd
, mysqld_safe
, or as a Microsoft Windows service.
The important point is that this time, the update of the population of the Indian cities does not trigger an update of the index statistics.
Workaround
There are several configuration options for InnoDB persistent statistics both globally and per table, but unfortunately none that can solve this issues. Optimally there would be a way to have InnoDB automatically – as a background task – update the statistics at a regular interval on a per table basis.
The workaround is to create your own scheduled job to regularly look for tables that need their index statistics updated. You can use the last_update
column in mysql.innodb_table_stats
table to look for tables that are candidate for a manual update:
mysql>
SELECT
database_name, table_name, n_rows, last_update
FROM
mysql.innodb_table_stats
WHERE
database_name =
'world'
;
+
---------------+-----------------+--------+---------------------+
| database_name | table_name | n_rows | last_update |
+
---------------+-----------------+--------+---------------------+
| world | city | 4046 | 2021-08-22 12:51:48 |
| world | country | 239 | 2021-08-22 12:49:15 |
| world | countrylanguage | 984 | 2021-08-22 12:49:35 |
+
---------------+-----------------+--------+---------------------+
3
rows
in
set
(0.0007 sec)
Optionally, you can include the estimated number of rows to skip small tables where the automatic update will get to them as required.
You can then update the index statistics using the ANALYZE TABLE
statement like:
mysql> ANALYZE
TABLE
world.city, world.country, world.countrylanguage;
+
-----------------------+---------+----------+----------+
|
Table
| Op | Msg_type | Msg_text |
+
-----------------------+---------+----------+----------+
| world.city | analyze | status | OK |
| world.country | analyze | status | OK |
| world.countrylanguage | analyze | status | OK |
+
-----------------------+---------+----------+----------+
3
rows
in
set
(0.7794 sec)
Alternatively the mysqlcheck
program can be used:
shell$ mysqlcheck --user=root --password --host=localhost --port=3306 --analyze world city country countrylanguage
Enter password: ********
world.city OK
world.country OK
world.countrylanguage OK
If you want to update all tables in a schema, you do not even need to specify the table names.
Warning
Before MySQL 8.0.24 (or 5.6.38/5.7.20 if you use Percona's binaries), it was dangerous to execute ANALYZE TABLE
on a production system with “long running” queries. The reason is that ANALYZE TABLE
would trigger an implicit table flush (but only after ANALYZE TABLE
had returned making it hard to diagnose) meaning that until all queries using the table had completed, the implicit flush would block all new queries using the table from running.
Monitoring
If you want to determine the number of rows that has been updated and counted towards triggering the next automatic calculation of index statistics, you can query the information_schema.INNODB_TABLESTATS
view:
mysql> ANALYZE
TABLE
world.city;
+
------------+---------+----------+----------+
|
Table
| Op | Msg_type | Msg_text |
+
------------+---------+----------+----------+
| world.city | analyze | status | OK |
+
------------+---------+----------+----------+
1 row
in
set
(0.1706 sec)
mysql>
SELECT
*
FROM
information_schema.INNODB_TABLESTATS
WHERE
NAME
=
'world/city'
\G
*************************** 1. row ***************************
TABLE_ID: 4159
NAME
: world/city
STATS_INITIALIZED: Uninitialized
NUM_ROWS: 0
CLUST_INDEX_SIZE: 0
OTHER_INDEX_SIZE: 0
MODIFIED_COUNTER: 0
AUTOINC: 4080
REF_COUNT: 1
1 row
in
set
(0.5694 sec)
mysql>
UPDATE
world.city
SET
Population = Population * 1.10
WHERE
CountryCode =
'CHN'
;
Query OK, 363
rows
affected (0.7183 sec)
Rows
matched: 363 Changed: 363 Warnings: 0
mysql>
SELECT
*
FROM
information_schema.INNODB_TABLESTATS
WHERE
NAME
=
'world/city'
\G
*************************** 1. row ***************************
TABLE_ID: 4159
NAME
: world/city
STATS_INITIALIZED: Initialized
NUM_ROWS: 4046
CLUST_INDEX_SIZE: 25
OTHER_INDEX_SIZE: 12
MODIFIED_COUNTER: 363
AUTOINC: 4080
REF_COUNT: 2
1 row
in
set
(0.0017 sec)
Notice the MODIFIED_COUNTER
column. The STATS_INITIALIZED
means whether the information exposed in this view have been loaded into memory and not whether the index statistics exist. Using this view, you can easily verify that restarting MySQL resets MODIFIED_COUNTER
to 0.
I have worked with MySQL databases since 2006 both as an SQL developer, a database administrator, and for more than eight years as part of the Oracle MySQL Support team. I have spoken at MySQL Connect and Oracle OpenWorld on several occasions. I have contributed to the sys schema and four Oracle Certified Professional (OCP) exams for MySQL 5.6 to 8.0.
I have written four books, all published at Apress.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK