Inspecting MySQL Servers Part 3: What MySQL?


In the previous post of this series, we looked at the hardware specifications and operating system settings of the host server through the lens of a pt-summary report. Now that we know the foundation on which the database is running, we can turn our focus to MySQL itself. The second of our triad of tools from the Percona Toolkit will help us with that:

pt-mysql-summary conveniently summarizes the status and configuration of a MySQL database server so that you can learn about it at a glance. 

The goal for this part is to find out which MySQL distribution and version are being used, whether the server is a replica or perhaps a PXC/Galera node, as well as its main configuration (such as the size of the Buffer Pool, redo log space, table cache, and temporary tables), and to check whether it makes sense for the platform it is running on. To gather the necessary information, the tool connects to the MySQL server and runs several queries, so you must have superuser access to the database.

If you have the Percona Toolkit installed on the server, you can run pt-mysql-summary as simply as:

Shell
sudo pt-mysql-summary -- --user=root --password=mysql-root-password

But it is usually better to redirect the report to a file you can save for later, which also allows for better searching:

Shell
sudo pt-mysql-summary -- --user=root --password=mysql-root-password > /tmp/pt-mysql-summary.out

You should keep one thing in mind, though: any configuration adjustments made to MySQL based on the information provided by a pt-mysql-summary report must be seen as an initial setup, not tuning properly speaking; we tune a database server for the workload it executes, and that cannot be done based on static data alone. The next post in this series will dig deeper into that part. However, we need a starting point more reasonable than the stock configuration provided by the installation packages “out of the box”; otherwise, the diagnostic data collected in the next step won’t be as useful, and more iterations will be required.

Header Information

The first section of the report lists all mysqld running instances, identifies the binary being used, and indicates whether it contains symbols (useful for analyzing stack traces from crashes, for instance) or has been stripped of them:

Shell
# Percona Toolkit MySQL Summary Report #######################
              System time | 2020-10-08 15:30:53 UTC (local TZ: CDT -0500)
# Instances ##################################################
  Port  Data Directory             Nice OOM Socket
  ===== ========================== ==== === ======
   3306 /home/mysqladm/mysql/data  0    0   /mysqladm/tmp/mysql.sock
# MySQL Executable ###########################################
       Path to executable | /mysqladm/Percona-Server-5.7.25-28-Linux.x86_64.ssl101/bin/mysqld
              Has symbols | Yes

I don’t see it very often nowadays, but in the past it was more common to find multiple instances of MySQL running on the same server. The report covers the first instance in the list:

Shell
# Report On Port 3306 ########################################
                     User | mysqladm@%
                     Time | 2021-04-08 10:30:53 (CDT)
                 Hostname | myservername
                  Version | 5.7.25-28-log Percona Server (GPL), Release 28, Revision c335905
                 Built On | Linux x86_64
                  Started | 2021-04-07 08:53 (up 1+01:37:04)
                Databases | 6
                  Datadir | /home/mysqladm/mysql/data/
                Processes | 2000 connected, 1500 running
              Replication | Is not a slave, has 2 slaves connected
                  Pidfile | /mysqladm/tmp/mysqld.pid (exists)

From the section above we get a set of important information:

  • The exact MySQL version running: we can check the Release Notes page of the respective distribution to find out when it was released, which bugs have been fixed in it and, even more importantly, which ones have not (based on bug reports and the release notes of more recent releases). The one in the example above was released over two years ago; it is simply not a best practice to let a server become this outdated, missing out on both security patches and overall improvements.
  • The server was (re)started only a day ago, which suggests the caches (mostly the InnoDB Buffer Pool) could still be warming up.
  • Here we get the instance’s data directory (datadir) path; when it matches the root directory of one of the mount points listed in the pt-summary report, we can get the exact database size (on disk) from the partition’s usage stats.
  • The number of running Processes is just an “eye-opener”, something to pay closer attention to later.
  • The information indicating this server is a master (it has replicas connected but is not one itself) is crucial: we probably shouldn’t be making any concessions in terms of relaxing data durability without a “safety net”.

Processes

pt-mysql-summary connects to the database and runs a SHOW FULL PROCESSLIST query in the background; the information in the Processlist section is based on that single sample. The processes in the sample are aggregated by Command, User, Host, database (db), and State. While this doesn’t provide a good representation of the workload, it may hint at common issues, such as contention at the table cache level, or the open files limit being hit when the majority of connections are stuck in the Opening tables state.
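
If you want to look at a live sample yourself, a similar aggregation can be obtained directly from information_schema (this is not the exact query the tool runs, which summarizes SHOW FULL PROCESSLIST output instead; the examples below assume your credentials are available, for instance in ~/.my.cnf):

Shell
mysql -e "SELECT command, state, COUNT(*) AS sessions
            FROM information_schema.processlist
        GROUP BY command, state
        ORDER BY sessions DESC;"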

Status Variables

The next section processes data collected from two samples of SHOW GLOBAL STATUS taken 10 seconds apart, and “rounded”; the idea here is not to provide accurate metrics but to give an insight into how the counters increased in this short time span. Thus, the Per day value doesn’t really represent a full 24-hour period; it is simply the counter’s initial value (from the first sample) divided by the number of seconds in a day (86,400), with some rounding involved. I tend to pay attention to the 10 secs column only, as I find it more straightforward and useful: it shows how much a given counter increased from the first sample to the second one, taken 10 seconds later.
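
If you want to watch the counters move yourself, mysqladmin can take repeated samples and report the differences between them; for example, two samples 10 seconds apart:

Shell
mysqladmin --user=root --password extended-status --relative --sleep=10 --count=2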

You should not spend too much time analyzing the Processes and Status Counters sections of a pt-mysql-summary report; the time for that will come later, with richer and more diverse data collected by pt-stalk.

MySQL Settings

Below the Status Counters section, the report is mostly about MySQL settings, with some statistics mixed in. It obviously doesn’t cover all of the 500+ variables you can tune on the latest MySQL server, and the entries often don’t carry the formal variable name (e.g., “Buffer Pool Size” instead of innodb_buffer_pool_size). The aim is to highlight some of the most important ones. I will comment on a subset of them, starting with the Table cache:

Shell
# Table cache ################################################
                     Size | 512
                    Usage | 225%

Usage is obtained by dividing the status counter Open_tables by the variable table_open_cache (with some rounding involved). I reckon the excerpt above is not a common one, but it helps illustrate an interesting scenario: in some cases, a variable like table_open_cache was defined in the my.cnf configuration file years ago and left there, forgotten. The server has since been upgraded, sometimes multiple times, and the default value for this variable has increased: it was 400 in 5.5, then 2000 in 5.6, and is 4000 now in 8.0. In fact, the table cache has since been partitioned to scale better in the highly concurrent environments modern multi-core servers can sustain: table_open_cache_instances was introduced in 5.6 with a default value of 1, and in 5.7 that default changed to 16. For the example above, with table_open_cache still defined in the configuration file as 512 but table_open_cache_instances left undefined, the server is, in practice, operating with 16 table cache instances of only (512/16=) 32 entries each!
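
You can check the raw numbers behind this calculation, and whether the table cache is split into multiple instances, with a couple of quick queries:

Shell
mysql -e "SHOW GLOBAL VARIABLES LIKE 'table_open_cache%'; SHOW GLOBAL STATUS LIKE 'Open%tables';"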

We will discuss how to evaluate the efficiency of the table cache size further in the next post. For now, as an exercise, review your my.cnf configuration file and check how the variables defined there relate to the default values for your server’s version.

The Query Cache

Speaking of the server’s version, a feature that existed for a long time but was deprecated in MySQL 5.7 and effectively removed in 8.0 is the Query cache:

Shell
# Query cache ################################################
         query_cache_type | ON
                     Size | 100.0M
                    Usage | 6%
         HitToInsertRatio | 200%

It is reminiscent of a time when CPU power was scarce and concurrency was low. That has not been the case for a long time, and the query cache became a real bottleneck: it is protected by a single mutex, so every SELECT must wait until it can acquire that mutex before executing. Imagine the potential impact of this setting on highly concurrent workloads.

The query cache has been disabled by default since MySQL 5.6 (query_cache_type=0), but we still find the variable set in the configuration files of many MySQL 5.7 servers. (Hint: you should also remove the definition of query_cache_size if you have it in place, as that much memory will be allocated for the query cache even when it is disabled.)
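
On a 5.6/5.7 server, it is worth verifying that no memory is being set aside for it; a quick check, plus the settings to keep (or simply remove) in my.cnf:

Shell
mysql -e "SHOW GLOBAL VARIABLES LIKE 'query_cache%';"
# In my.cnf, under [mysqld], make sure the query cache is fully disabled:
#   query_cache_type = 0
#   query_cache_size = 0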

The Main Storage Engine

MySQL became a transactional database with the addition of the InnoDB storage engine many years ago, first as a plugin, then as the default engine. With MySQL 8.0, MyISAM remains available, but even the system tables are now InnoDB. There are alternatives, such as MyRocks, which can easily be installed in Percona Server for MySQL, but for most environments, much of the tuning done to MySQL is InnoDB-specific. There are about 170 innodb_-prefixed variables in MySQL 8.0, and pt-mysql-summary only covers some of them in the InnoDB section:

Shell
# InnoDB #####################################################
                  Version | 5.7.29
         Buffer Pool Size | 50.0G
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 1%
           File Per Table | ON
                Page Size | 16k
            Log File Size | 2 * 2.0G = 4.0G
          Log Buffer Size | 16M
             Flush Method | O_DIRECT
      Flush Log At Commit | 1
               XA Support | ON
                Checksums | ON
              Doublewrite | ON
          R/W I/O Threads | 4 4
             I/O Capacity | 200
       Thread Concurrency | 0
      Concurrency Tickets | 5000
       Commit Concurrency | 0
      Txn Isolation Level | READ-COMMITTED
        Adaptive Flushing | ON
      Adaptive Checkpoint |
           Checkpoint Age | 1G
             InnoDB Queue | 0 queries inside InnoDB, 0 queries in queue
       Oldest Transaction | 0 Seconds
         History List Len | 95
               Read Views | 1
         Undo Log Entries | 2 transactions, 17441 total undo, 15747 max undo
        Pending I/O Reads | 0 buf pool reads, 0 normal AIO, 0 ibuf AIO, 0 preads
       Pending I/O Writes | 0 buf pool (0 LRU, 0 flush list, 0 page); 0 AIO, 0 sync, 0 log IO (0 log, 0 chkp); 0 pwrites
      Pending I/O Flushes | 0 buf pool, 0 log
       Transaction States | 2xACTIVE, 65xnot started

The most important of them is undoubtedly the Buffer Pool Size (innodb_buffer_pool_size). The buffer pool is a data cache: the ideal situation for a database server would be to fit the whole dataset in memory, inside the buffer pool; that way it would “never” be necessary to fetch data from disk, limiting I/O to basically writes. This is, of course, not totally accurate in practice, but neither is the requirement to “fit the whole dataset in memory”: in fact, if you can fit all your hot data, the data that is most frequently accessed, in memory, the effect is similar.

If you have been around MySQL for a long time, you have probably heard the rule of thumb that says one should allocate 80% of the server’s memory to the Buffer Pool. This may have been a reasonable starting point when servers didn’t have as much memory, but nowadays it is often too conservative: take a server with 512G of RAM, for example; reserve 80% for the Buffer Pool and there would still be around 100G of RAM left. You surely need to reserve some memory for the OS (Linux) and for MySQL to manage connections, run queries (particularly temporary tables, which we will discuss later), and maintain other structures, such as the Adaptive Hash Index (AHI), but most of the surplus would end up in the OS cache for lack of a better use. And since InnoDB is configured here with innodb_flush_method=O_DIRECT, it bypasses the OS cache when reading and writing data to disk; only other parts of MySQL rely on the OS cache to interact with storage, so be careful to make the most of the memory that is available.
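
One way to get a rough sense of how big the Buffer Pool needs to be is to compare it against the on-disk size of the InnoDB dataset; the query below is a minimal sketch (the figures in information_schema are estimates, and the hot data is usually only a fraction of the total):

Shell
mysql -e "SELECT ROUND(SUM(data_length + index_length)/1024/1024/1024, 1) AS innodb_data_gb
            FROM information_schema.TABLES
           WHERE engine = 'InnoDB';"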

Buffer Pool Fill is an estimation of how “full” the Buffer Pool is at the moment, obtained from the status counters Innodb_buffer_pool_pages_free and Innodb_buffer_pool_pages_total (the percentage of pages that are not free). Similarly, Buffer Pool Dirty, the percentage of pages in the Buffer Pool that have been modified (in memory) but have not yet been flushed (written) to the destination tablespace, is derived from Innodb_buffer_pool_pages_dirty and Innodb_buffer_pool_pages_total. These metrics shouldn’t be considered for a server that has just been restarted, though, when the Buffer Pool is still “warming up”, getting filled with the “hot data” mentioned above. Besides, there is a better way to evaluate whether the Buffer Pool size is good enough for the production workload, which we will cover in the next post.
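
You can reproduce these figures from the raw counters yourself; a quick check:

Shell
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_%';"
# Fill  ~ (pages_total - pages_free) / pages_total
# Dirty ~  pages_dirty / pages_total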

Now, you may have read the words “dirty pages” and “not yet been written” and freaked out a bit, which is understandable. It is time to talk about data durability in MySQL (or, more precisely, in InnoDB). Flush Log At Commit represents the variable innodb_flush_log_at_trx_commit: it determines how often transaction changes are flushed to disk. For full ACID compliance, this should happen at each transaction commit (a value of 1). Why would anyone run away from this, you may ask? Because of the impact on I/O: depending on the workload, it may become a severe bottleneck. Thus, it is common to have this setting “relaxed” on replicas, operating with a value of 0, meaning modified data is flushed to disk once per second instead. In the worst-case scenario, a server crash means the possible loss of (approximately) the last second’s worth of data: if there were 100 transactions committed in the second preceding the crash, they might not be there when the server is restarted.

This compromise is acceptable on replicas because they can be rebuilt from scratch. On a master, operating with innodb_flush_log_at_trx_commit set to anything other than 1 means taking the risk described in the previous paragraph. Note, however, that if a BBU is available and the write-back cache is enabled, the controller will merge fsync calls and orchestrate the actual “sync to disk” process, optimizing I/O and thus reducing the cost of running InnoDB in fully durable mode with innodb_flush_log_at_trx_commit=1.
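
innodb_flush_log_at_trx_commit is a dynamic variable, so you can check it and adjust it at runtime (remember to persist the change in my.cnf as well); a minimal sketch:

Shell
mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';"
# On a master, keep full durability at commit time:
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 1;"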

I should note that dirty pages aren’t written from the Buffer Pool straight to their respective tablespace files: that would be a very costly operation, as the target blocks are spread across the disk. To optimize on this front, the changes are first written to a log file, sequentially, and only later, over time, are the modified pages flushed to the tablespace files. That log is in fact a circular data structure composed of two or more files and is known as the redo log. The size of the redo log is, then, the product of innodb_log_files_in_group and innodb_log_file_size:

Shell
            Log File Size | 2 * 2.0G = 4.0G

The process of copying data covered by the redo log to the tablespace files is called checkpointing. It runs constantly in the background, moving dirty pages to their destination data files at a steady but “gentle” rate to optimize I/O. The exception is when the amount of modified data not yet checkpointed approaches the capacity of the redo log. When that happens, a sharp checkpoint is required: off goes the gentleness, and all I/O is put at the service of the checkpoint process, which often stalls the server. You can read more about this process in InnoDB Flushing in Action for Percona Server for MySQL.

The redo log also plays an important role in InnoDB’s crash recovery process, when MySQL is restarted following a crash: all non-checkpointed blocks need to be inspected, and any data written by incomplete transactions needs to be rolled back. Thus, the redo log space should be big enough to avoid sharp checkpointing being triggered; however, a redo log that is too big, and full, may delay the MySQL initialization process following a crash (though this has improved significantly in recent versions of MySQL).
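
A common rule of thumb is to size the redo log so it can hold roughly an hour’s worth of redo writes; you can estimate the write rate from the Innodb_os_log_written counter (the sleep below is just a convenient way to take two samples an hour apart):

Shell
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';"
sleep 3600
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';"
# redo written per hour = second value - first value;
# size innodb_log_file_size * innodb_log_files_in_group around that figure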

There are some 170 other InnoDB variables we could tune, but getting innodb_buffer_pool_size, innodb_flush_log_at_trx_commit, and innodb_log_file_size “right” is a great start.

Binary Logging

The main use of binary logs in MySQL is replication, so you should retain, at the very least, enough binlogs to allow any replica that is temporarily stopped (for maintenance, for instance) or otherwise lagging to recover the missing replication events. expire_logs_days is the variable that used to control “the number of days before automatic removal of binary log files”; in MySQL 8.0 it has been deprecated in favor of binlog_expire_logs_seconds.

Shell
# Binary Logging #############################################
                  Binlogs | 108
               Zero-Sized | 0
               Total Size | 104.4G
            binlog_format | ROW
         expire_logs_days | 5
              sync_binlog | 1
                server_id | 1
             binlog_do_db | 
         binlog_ignore_db |

From the above list, sync_binlog is one to pay attention to: similar to innodb_flush_log_at_trx_commit, it controls how often the binary log is synchronized to disk, with a value of 1 also providing the safest measure at the expense of higher I/O demand.

Binary logs are also used for Point-In-Time-Recovery (PITR): if you have retained all binlogs since your last backup was taken, be it a binary backup (taken with Percona XtraBackup, for instance) or a logical backup (mysqldump), you can restore the backup and then “play” (or load) all the binary logs that succeeded it, sequentially, down to the last event recorded. For this reason, binary logs play an important role in any backup strategy.
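
In its simplest form, and with hypothetical binlog file names, replaying the binary logs after restoring a backup looks like this (mysqlbinlog also accepts options such as --stop-datetime to stop the replay at a specific point in time):

Shell
mysqlbinlog binlog.000101 binlog.000102 binlog.000103 | mysql --user=root --password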

Best of the Rest

There is a section called Noteworthy Variables in the pt-mysql-summary report that highlights a selection of other MySQL variables. Here are the ones in this group we pay the most attention to, and why:

Temporary Tables

There are two kinds of temporary tables in MySQL:

  • explicit temporary tables, which you create yourself with CREATE TEMPORARY TABLE;
  • internal (implicit) temporary tables, which the server creates on its own to process certain queries (those involving GROUP BY or UNION, for example).

The following variables control, respectively, how big each kind of temporary table is allowed to grow in memory:

Shell
     max_heap_table_size | 64M
          tmp_table_size | 64M

Once a temporary table reaches its in-memory size limit, it is converted to an on-disk temporary table. These tables are not allowed to grow in memory indefinitely for a very simple reason: a query that produces big temporary tables in a highly concurrent environment can have devastating effects on memory usage. Picture this: how much extra memory would MySQL use, in total, to process a query that creates a 256M temporary table across 500 concurrent connections? Up to 500 × 256M = 125G. Operating with too-high values for these variables is the most common cause of the kernel’s OOM killer terminating mysqld. The long-term solution is to optimize your queries to avoid the need for big temporary tables in the first place.

Some internal temporary tables are created straight on disk, though, independently of their size: that is the case for operations on tables with BLOB or TEXT columns, for example. Once in a while, we see attempts to work around this limitation by having tmpdir point to a tmpfs partition, which keeps files in memory: be careful with that.
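
To get a sense of how often temporary tables end up on disk, compare the related status counters (the disk figure also includes the tables created directly on disk):

Shell
mysql -e "SHOW GLOBAL STATUS LIKE 'Created_tmp%';"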

MySQL Error Log

The MySQL error log is the #1 source of information about your database and the first place you should check when trying to understand what is going on with your server. It doesn’t have all the answers, but it very often provides enough information to steer you in the right direction. Whenever MySQL crashes due to a bug, a stack trace with more information is left in the error log. If you see MySQL restarting without first going through a clean shutdown process, you can be sure that either the mysqld process was killed or the host suffered a major crash/power failure.

Shell
                     log | ON
                log_error | /var/lib/mysql/mysql.err
             log_warnings | 2

log_warnings was the name of the variable that controlled the level of verbosity of the error log. It was deprecated in favor of log_error_verbosity in MySQL 5.7 and removed in MySQL 8.0. The higher the value, the more information the log will contain.

Make a habit of checking the error log from time to time and you may catch issues before they become real problems.
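
A quick way to do that, assuming the 5.7-style log format and the log_error path shown above:

Shell
sudo grep -iE '\[(error|warning)\]' /var/lib/mysql/mysql.err | tail -n 50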

Slow Query Log

Remember the long-term solution to avoid the use of big temporary tables mentioned a few paragraphs up? The quest for query optimization starts with identifying the slow queries. Make sure slow_query_log is enabled and tune long_query_time down accordingly; set it to zero to capture all queries. If you are running Percona Server for MySQL, you can take advantage of a few extra options, such as log_slow_rate_limit, to mitigate the additional overhead of logging all queries while still obtaining a good representation of them. log_slow_verbosity allows you to customize what is included in the slow log, such as query_plan to log information about the query’s execution plan.
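
Both settings are dynamic, so you can start capturing right away; a minimal example (keep the extra logging overhead in mind):

Shell
mysql -e "SET GLOBAL slow_query_log = ON; SET GLOBAL long_query_time = 0;"
mysql -e "SHOW GLOBAL VARIABLES LIKE 'slow_query_log_file';"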

Query optimization is a subject for another full blog series. My very simple but useful advice on this front is to capture your slow queries for at least a couple of days (ideally more), including peak time, and then process the slow query log with pt-query-digest (another tool from the Percona Toolkit): it will produce a ranking of the top slow queries, taking into consideration not only the queries’ duration but also their frequency. You can use that list and the details provided in the generated report to focus on optimizing the queries that will bring you the most gains.
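
In its simplest form (the log path below is just a placeholder; use the slow_query_log_file reported by your server):

Shell
sudo pt-query-digest /var/lib/mysql/slow-query.log > /tmp/slow-digest.out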

Configuration File

This section lists all settings defined in a MySQL configuration file if mysqld was started with the --defaults-file option specifying one. This is less and less the case with modern Linux distributions, which use systemd to manage services. Plus, it has become common practice to group sections of similar settings in distinct configuration files, so this part of pt-mysql-summary requires some reworking.

Memory Allocator

The very last section of the pt-mysql-summary report indicates whether MySQL has been configured to start with the jemalloc memory allocator and whether the respective library is in fact available on the system. jemalloc is an alternative to the standard glibc allocator that focuses on performance, reducing memory fragmentation to help improve scalability and increase throughput. It is not a change I would recommend for each and every setup, but something to really consider for highly loaded servers where memory allocation is a bottleneck.
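
One way to enable it, sketched below under the assumption of a systemd-managed service named mysql and a jemalloc library installed at /usr/lib64/libjemalloc.so.1 (both vary by distribution):

Shell
sudo systemctl edit mysql           # opens an override file for the service
# add the following two lines to the override, then save it:
#   [Service]
#   Environment="LD_PRELOAD=/usr/lib64/libjemalloc.so.1"
sudo systemctl daemon-reload
sudo systemctl restart mysql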

In the Next Post …

Now that we have gotten to know the server and have potentially made a few adjustments to the MySQL configuration, it’s time to capture diagnostic data from the production workload and start tuning the database for it.

