3

Introduction to MySQL Query Tuning for Dev[Op]s

 2 years ago
source link: https://www.percona.com/blog/qa-on-the-webinar-introduction-to-mysql-query-tuning-for-devops/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Q&A on the Webinar “Introduction to MySQL Query Tuning for Dev[Op]s”

MySQL Query Tuning for Dev[Op]s webinarFirst, I want to thank everyone who attended my November 9, 2021 webinar “Introduction to MySQL Query Tuning for Dev[Op]s“. Recording and slides are available on our webinars page.

Here are answers to the questions from participants which I was not able to provide during the webinar due to a technical issue we experienced at the end.

Q: If a large/complex SELECT is run from inside a Stored Procedure, will it use/not use the same optimizations as it would if run as a direct query? Is there anything to bear in mind when writing Stored Procedures that need to run large/complex SELECTs from within them?

Just wanted to clarify, using a stored procedure does not utilize any of the indexes on the corresponding tables?

A: Any query inside a stored procedure will be optimized the same way as if it was called outside of the routine. We cannot run EXPLAIN  on stored procedures but we can prove it with the help of Performance Schema.

For example, let’s take a standard test database employees and perform a quite ineffective query on it:

MySQL
mysql> SELECT emp_no, first_name, last_name, title, salary FROM employees
    -> JOIN titles USING(emp_no) JOIN salaries USING(emp_no)
    -> WHERE salary = (SELECT MAX(salary) FROM salaries)
    -> OR salary = (SELECT MIN(salary) FROM salaries);
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title            | salary |
+--------+------------+-----------+------------------+--------+
|  43624 | Tokuyasu   | Pesch     | Senior Staff     | 158220 |
|  43624 | Tokuyasu   | Pesch     | Staff            | 158220 |
| 253406 | Olivera    | Baek      | Technique Leader |  38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,38 sec)

Then let’s create a stored routine that uses this query:

MySQL
mysql> \d |
mysql> CREATE PROCEDURE my_test()
    -> BEGIN
    ->     SELECT emp_no, first_name, last_name, title, salary
    ->     FROM employees JOIN titles USING(emp_no)
    ->     JOIN salaries USING(emp_no)
    ->     WHERE salary = (SELECT MAX(salary) FROM salaries)
    ->     OR salary = (SELECT MIN(salary) FROM salaries);
    -> END
    -> |
Query OK, 0 rows affected (0,02 sec)

This routine call takes about the same time on my laptop as if I run the query separately:

MySQL
mysql> CALL my_test();
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title            | salary |
+--------+------------+-----------+------------------+--------+
|  43624 | Tokuyasu   | Pesch     | Senior Staff     | 158220 |
|  43624 | Tokuyasu   | Pesch     | Staff            | 158220 |
| 253406 | Olivera    | Baek      | Technique Leader |  38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,35 sec)
Query OK, 0 rows affected (4,35 sec)

If I then enable statement instrumentation in Performance Schema and run the query again I would have the following output:

MySQL
mysql> USE performance_schema
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> UPDATE setup_instruments SET enabled='yes' WHERE name LIKE 'statement/%';
Query OK, 0 rows affected (0,00 sec)
Rows matched: 213  Changed: 0  Warnings: 0
mysql> UPDATE setup_consumers SET enabled='yes' WHERE name LIKE '%statement%';
Query OK, 1 row affected (0,00 sec)
Rows matched: 4  Changed: 1  Warnings: 0
mysql> USE employees;
Database changed
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,01 sec)
mysql> SELECT emp_no, first_name, last_name, title, salary FROM employees
    -> JOIN titles USING(emp_no) JOIN salaries USING(emp_no)
    -> WHERE salary = (SELECT MAX(salary) FROM salaries)
    -> OR salary = (SELECT MIN(salary) FROM salaries);
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title            | salary |
+--------+------------+-----------+------------------+--------+
|  43624 | Tokuyasu   | Pesch     | Senior Staff     | 158220 |
|  43624 | Tokuyasu   | Pesch     | Staff            | 158220 |
| 253406 | Olivera    | Baek      | Technique Leader |  38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,41 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
    -> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
    -> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
              THREAD_ID: 100414
               EVENT_ID: 24392721
           END_EVENT_ID: 26360924
             EVENT_NAME: statement/sql/select
                 SOURCE: init_net_server_extension.cc:93
            TIMER_START: 271361154464092000
              TIMER_END: 271365562450546000
             TIMER_WAIT: 4407986454000
              LOCK_TIME: 325000000
               SQL_TEXT: SELECT emp_no, first_name, last_name, title, salary FROM employees JOIN titles USING(emp_no)
                         JOIN salaries USING(emp_no) WHERE salary = (SELECT MAX(salary) FROM  salaries) OR salary = (SELECT MIN(salary) FROM salaries)
                 DIGEST: 4fe5ea4a8a96539aa9e837031c7c4ea59c1b772272a83c1db5e2833170c2d563
            DIGEST_TEXT: SELECT `emp_no` , `first_name` , `last_name` , `title` , `salary` FROM `employees` JOIN `titles` USING ( `emp_no` )
                         JOIN `salaries` USING ( `emp_no` ) WHERE `salary` = ( SELECT MAX ( `salary` )
                         FROM `salaries` ) OR `salary` = ( SELECT MIN ( `salary` ) FROM `salaries` )
         CURRENT_SCHEMA: employees
            OBJECT_TYPE: NULL
          OBJECT_SCHEMA: NULL
            OBJECT_NAME: NULL
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 3
          ROWS_EXAMINED: 11069933
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 3
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 1
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: NULL
     NESTING_EVENT_TYPE: NULL
    NESTING_EVENT_LEVEL: 0
           STATEMENT_ID: 428707
1 row in set (0,00 sec)

The interesting part of the output is the performance fields with non-zero values. Namely,

MySQL
    ROWS_SENT: 3
ROWS_EXAMINED: 11069933
  SELECT_SCAN: 3
NO_INDEX_USED: 1

Now let’s call the stored procedure and compare these values.

MySQL
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,00 sec)
mysql> CALL my_test();
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title            | salary |
+--------+------------+-----------+------------------+--------+
|  43624 | Tokuyasu   | Pesch     | Senior Staff     | 158220 |
|  43624 | Tokuyasu   | Pesch     | Staff            | 158220 |
| 253406 | Olivera    | Baek      | Technique Leader |  38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,55 sec)
Query OK, 0 rows affected (4,55 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
    -> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
    -> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
              THREAD_ID: 100414
               EVENT_ID: 28336202
           END_EVENT_ID: 30292242
             EVENT_NAME: statement/sp/stmt
                 SOURCE: sp_head.cc:2210
            TIMER_START: 271549571279270000
              TIMER_END: 271554128099300000
             TIMER_WAIT: 4556820030000
              LOCK_TIME: 139000000
               SQL_TEXT: SELECT emp_no, first_name, last_name, title, salary FROM employees JOIN titles USING(emp_no)
                         JOIN salaries USING(emp_no) WHERE salary = (SELECT MAX(salary) FROM salaries) OR salary = (SELECT MIN(salary) FROM salaries)
                 DIGEST: NULL
            DIGEST_TEXT: NULL
         CURRENT_SCHEMA: employees
            OBJECT_TYPE: PROCEDURE
          OBJECT_SCHEMA: employees
            OBJECT_NAME: my_test
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 3
          ROWS_EXAMINED: 11069933
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 3
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 1
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: 28336201
     NESTING_EVENT_TYPE: STATEMENT
    NESTING_EVENT_LEVEL: 1
           STATEMENT_ID: 429597
*************************** 2. row ***************************
              THREAD_ID: 100414
               EVENT_ID: 28336201
           END_EVENT_ID: 30292243
             EVENT_NAME: statement/sql/call_procedure
                 SOURCE: init_net_server_extension.cc:93
            TIMER_START: 271549571185646000
              TIMER_END: 271554128135521000
             TIMER_WAIT: 4556949875000
              LOCK_TIME: 0
               SQL_TEXT: CALL my_test()
                 DIGEST: 070cb7d4d2eba9690a3a993ec61700fb6d6c3e46166e329408d7284e45fd58c2
            DIGEST_TEXT: CALL `my_test` ( )
         CURRENT_SCHEMA: employees
            OBJECT_TYPE: NULL
          OBJECT_SCHEMA: NULL
            OBJECT_NAME: NULL
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: 00000
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 0
          ROWS_EXAMINED: 0
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 0
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 0
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: NULL
     NESTING_EVENT_TYPE: NULL
    NESTING_EVENT_LEVEL: 0
           STATEMENT_ID: 429596
2 rows in set (0,00 sec)

This time, the select from the performance_schema.events_statements_history returned two rows: one for the SELECT  statement and one for the CALL  command. We are interested in the data for the SELECT  statement. It is:

MySQL
    ROWS_SENT: 3
ROWS_EXAMINED: 11069933
  SELECT_SCAN: 3
NO_INDEX_USED: 1

So performance metrics are absolutely the same as for the original query.

OK, this was for the slow, not effective query. But what about the one that uses indexes?

Let’s craft another example. Consider a query:

MySQL
SELECT COUNT(*), title FROM titles GROUP BY title;

And the stored procedure:

MySQL
mysql> CREATE PROCEDURE my_second_test()
    -> BEGIN
    ->     SELECT COUNT(*), title FROM titles GROUP BY title;
    -> END
    -> |
Query OK, 0 rows affected (0,02 sec)

Now let’s call the query and examine data in the Performance Schema:

MySQL
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,00 sec)
mysql> SELECT COUNT(*), title FROM titles GROUP BY title;
+----------+--------------------+
| COUNT(*) | title              |
+----------+--------------------+
|    97750 | Senior Engineer    |
|   107391 | Staff              |
|   115003 | Engineer           |
|    92853 | Senior Staff       |
|    15128 | Assistant Engineer |
|    15159 | Technique Leader   |
|       24 | Manager            |
+----------+--------------------+
7 rows in set (0,28 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
    -> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
    -> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
              THREAD_ID: 100414
               EVENT_ID: 30340666
           END_EVENT_ID: 30346280
             EVENT_NAME: statement/sql/select
                 SOURCE: init_net_server_extension.cc:93
            TIMER_START: 271987423077801000
              TIMER_END: 271987697673114000
             TIMER_WAIT: 274595313000
              LOCK_TIME: 137000000
               SQL_TEXT: SELECT COUNT(*), title FROM titles GROUP BY title
                 DIGEST: 3b63daf99fa8586784dcbc73f82b79e777e4fc074c37de3ebd25753f0d684a46
            DIGEST_TEXT: SELECT COUNT ( * ) , `title` FROM `titles` GROUP BY `title`
         CURRENT_SCHEMA: employees
            OBJECT_TYPE: NULL
          OBJECT_SCHEMA: NULL
            OBJECT_NAME: NULL
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 7
          ROWS_EXAMINED: 443308
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 1
       SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 1
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 0
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: NULL
     NESTING_EVENT_TYPE: NULL
    NESTING_EVENT_LEVEL: 0
           STATEMENT_ID: 431634
1 row in set (0,00 sec)

In this case, the picture is different:

MySQL
         ROWS_SENT: 7
     ROWS_EXAMINED: 443308
CREATED_TMP_TABLES: 1
       SELECT_SCAN: 1
     NO_INDEX_USED: 0
NO_GOOD_INDEX_USED: 0

This is still not a very effective query: it uses an index scan but it certainly uses the index.

Let’s check data for the stored procedure:

MySQL
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,00 sec)
mysql> CALL my_second_test();
+----------+--------------------+
| COUNT(*) | title              |
+----------+--------------------+
|    97750 | Senior Engineer    |
|   107391 | Staff              |
|   115003 | Engineer           |
|    92853 | Senior Staff       |
|    15128 | Assistant Engineer |
|    15159 | Technique Leader   |
|       24 | Manager            |
+----------+--------------------+
7 rows in set (0,28 sec)
Query OK, 0 rows affected (0,28 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
    -> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
    -> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
              THREAD_ID: 100414
               EVENT_ID: 30354940
           END_EVENT_ID: 30360553
             EVENT_NAME: statement/sp/stmt
                 SOURCE: sp_head.cc:2210
            TIMER_START: 272124909174908000
              TIMER_END: 272125181823179000
             TIMER_WAIT: 272648271000
              LOCK_TIME: 222000000
               SQL_TEXT: SELECT COUNT(*), title FROM titles GROUP BY title
                 DIGEST: NULL
            DIGEST_TEXT: NULL
         CURRENT_SCHEMA: employees
            OBJECT_TYPE: PROCEDURE
          OBJECT_SCHEMA: employees
            OBJECT_NAME: my_second_test
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 7
          ROWS_EXAMINED: 443308
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 1
       SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 1
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 0
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: 30354939
     NESTING_EVENT_TYPE: STATEMENT
    NESTING_EVENT_LEVEL: 1
           STATEMENT_ID: 432318
*************************** 2. row ***************************
              THREAD_ID: 100414
               EVENT_ID: 30354939
           END_EVENT_ID: 30360554
             EVENT_NAME: statement/sql/call_procedure
<I omitted data for the CALL statement>

Counters again have exactly the same values as for the standalone statement:

MySQL
         ROWS_SENT: 7
     ROWS_EXAMINED: 443308
CREATED_TMP_TABLES: 1
       SELECT_SCAN: 1
     NO_INDEX_USED: 0
NO_GOOD_INDEX_USED: 0

So we have proof that the optimizer creates the same query plan no matter if the query was called inside the stored procedure or not.

Q: A few times when I was building a query for a table with multiple indexes, the EXPLAIN command shows me an index that is not clearly the best choice and I had to use FORCE INDEX in the query. I never understand why sometimes this happens, is it possible that sometimes the engine makes that mistake?

A: The engine can make a mistake for sure. If you want to understand more about such mistakes I recommend you to try EXPLAIN FORMAT=JSON  as described in these series as well as studying the Optimizer Trace. While the different formats of the EXPLAIN do not change the query plan, EXPLAIN FORMAT=JSON  provides more information on the optimizations that are used to resolve the query.

Q: I have a question about statuses. It’s about the Handler_read you mentioned. It keeps on increasing when queries are being executed. If the server has an uptime of more than 14 days, having 1000qps, the handler will be in the range of millions. Once we do a query tuning workshop, I would like to reset the Handlers counter. How to perform that? (just to see: for the same time range, whether the Handler_read decreases)

A: To reset counters for the session use the FLUSH STATUS  statement. Then you can run the original statement and examine which job needs to be done inside the engine to resolve it.

Here is the pseudo-code, showing how to do it:

MySQL
FLUSH STATUS;
<RUN YOUR QUERY>
SHOW STATUS LIKE ‘Handler%’;

Q: Can you talk a bit on explain extended? How can we use that to further help us in tuning queries? Does it show the full query that the server/optimizer will execute in the correct format?

A: EXPLAIN EXTENDED  is included in the standard output of the EXPLAIN  command since version 5.7. If you started using MySQL earlier and preferred to run EXPLAIN  command you would now notice two differences.

First, this is the “Filtered”  column. This column shows how many rows the optimizer expects to be discarded from the output. For example, for the query:

MySQL
mysql> EXPLAIN SELECT salary FROM salaries WHERE salary BETWEEN 158000 AND 159000\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838426
     filtered: 11.11
        Extra: Using where
1 row in set, 1 warning (0,00 sec)

It expects that the engine will read 2838426 rows but only 11.11% of them would be used to get the final result. This usually indicates that the query is not effective.

On the other way around, the query:

MySQL
mysql> EXPLAIN SELECT DISTINCT(title) FROM titles\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: titles
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 209
          ref: NULL
         rows: 442189
     filtered: 100.00
        Extra: Using index; Using temporary
1 row in set, 1 warning (0,00 sec)

Would use all 442189 retrieved rows to create the final result set (filtered: 100.00).

Another feature of the EXPLAIN EXTENDED  before version 5.7 and regular EXPLAIN  since version 5.7 is that it returns a query as it was rewritten by the optimizer in its diagnostic area, accessible by the SHOW WARNINGS  command.

For example, let’s take a query:

MySQL
SELECT emp_no, first_name, last_name FROM employees
WHERE emp_no IN (SELECT emp_no FROM salaries WHERE salary = (SELECT MAX(salary) FROM salaries));

And then run EXPLAIN  on it, followed by the SHOW WARNINGS :

MySQL
mysql> EXPLAIN SELECT emp_no, first_name, last_name FROM employees
    -> WHERE emp_no IN (SELECT emp_no FROM salaries WHERE salary = (SELECT MAX(salary) FROM salaries))\G
*************************** 1. row ***************************
  select_type: PRIMARY
        table: employees
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299113
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
  select_type: PRIMARY
        table: <subquery2>
   partitions: NULL
         type: eq_ref
possible_keys: <auto_distinct_key>
          key: <auto_distinct_key>
      key_len: 4
          ref: employees.employees.emp_no
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 3. row ***************************
  select_type: MATERIALIZED
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838426
     filtered: 10.00
        Extra: Using where
*************************** 4. row ***************************
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838426
     filtered: 100.00
        Extra: NULL
4 rows in set, 1 warning (0,00 sec)
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`first_name` AS `first_name`,
`employees`.`employees`.`last_name` AS `last_name` from `employees`.`employees` semi join (`employees`.`salaries`)
where ((`<subquery2>`.`emp_no` = `employees`.`employees`.`emp_no`)
and (`employees`.`salaries`.`salary` = (/* select#3 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)))
1 row in set (0,00 sec)

In the SHOW WARNINGS  output, you see that the optimizer used semi-join optimization for the query: it practically converted one of the subqueries into a JOIN. The same query could be written as:

MySQL
SELECT emp_no, first_name, last_name FROM employees JOIN salaries USING(emp_no) WHERE salary = (SELECT MAX(salary) FROM salaries);

If we turn semi-join optimization off we will see a different warning:

MySQL
mysql> SET optimizer_switch='semijoin=off';
Query OK, 0 rows affected (0,00 sec)
mysql> EXPLAIN SELECT emp_no, first_name, last_name FROM employees WHERE emp_no IN (SELECT emp_no FROM salaries WHERE salary = (SELECT MAX(salary) FROM salaries))\G
*************************** 1. row ***************************
  select_type: PRIMARY
        table: employees
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299113
     filtered: 100.00
        Extra: Using where
*************************** 2. row ***************************
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838426
     filtered: 10.00
        Extra: Using where
*************************** 3. row ***************************
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838426
     filtered: 100.00
        Extra: NULL
3 rows in set, 1 warning (0,00 sec)
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`first_name` AS `first_name`,
`employees`.`employees`.`last_name` AS `last_name` from `employees`.`employees`
where <in_optimizer>(`employees`.`employees`.`emp_no`,`employees`.`employees`.`emp_no`
in ( <materialize> (/* select#2 */ select `employees`.`salaries`.`emp_no` from `employees`.`salaries`
where (`employees`.`salaries`.`salary` = (/* select#3 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)) ),
<primary_index_lookup>(`employees`.`employees`.`emp_no` in <temporary table> on <auto_distinct_key>
where ((`employees`.`employees`.`emp_no` = `<materialized_subquery>`.`emp_no`)))))
1 row in set (0,00 sec)

This feature could help to understand why one or particular optimization was used.

Q: Have you ever used the mysqltuner perl script and if so, would you suggest it as a short-term option?

A: Do you mean https://github.com/major/MySQLTuner-perl ?

I just run it on my laptop and here are the recommendations I got:

——– Recommendations —————————————————————————

General recommendations:

Reduce or eliminate unclosed connections and network issues

Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1

This is, probably, OK, unless you want to provide access to your MySQL server from the outside.

    We will suggest raising the ‘join_buffer_size’ until JOINs not using indexes are found.

See https://dev.mysql.com/doc/internals/en/join-buffer-size.html

(specially the conclusions at the bottom of the page).

I do not understand why to raise join_buffer_size  if it is not required by queries I use. It also could be dangerous if the number of connections increases.

    Before changing innodb_log_file_size and/or innodb_log_files_in_group read this: https://bit.ly/2TcGgtU

It is always good to read the user manual before changing options. I recommend you to do it every time you follow performance tuning advice. Even mine.

Variables to adjust:

join_buffer_size (> 256.0K, or always use indexes with JOINs)

Again, this could not be suggested without examining the queries.

    innodb_buffer_pool_size (>= 713.2M) if possible.

This conclusion is based on my data size and this is sane for my laptop with 32G RAM. But if the amount of data is larger than the amount of RAM on the machine this advice would not help you to identify the ideal InnoDB buffer pool size. In this case, I recommend you to start from this blog post and follow the links at the end of it.

innodb_log_file_size should be (=16M) if possible, so InnoDB total log files size equals to 25% of buffer pool size.

InnoDB redo log file size should hold as much data, so InnoDB can flush the data from the buffer pool and reuse log space and does not have to perform aggressive flushing.

If you have a small active data set but you write a lot you may have InnoDB’s total log files size greater than 25% of the buffer pool size. Or, if you have a large dataset, but your workload is almost read, you may have very small redo log files. This advice does not make any sense by itself.

In conclusion, I can say that MySQL Tuner is a product that performs static analysis of your MySQL or MariaDB instance and makes suggestions, based on what its authors think the best practices are.

Unfortunately, it is not possible to tune MySQL the same way for all use cases. InnoDB redo log file size above is just one example. There are other options that should be tuned differently depending on which workload you have.

I suggest you study the product you use and understand what and why you are changing. And it is better to start from the question: “Which problem do I need to solve?” instead of modifying random options.

For a general-purpose server, running on the dedicated machine, you can use option –innodb-dedicated-server. Though, in my opinion, it is far from the ideal too.

Q: Mixed based?

A: Is this the question for the slide “Asynchronous Replica”, explaining how binary logging format affects replication performance? Mixed-based binary log format instructs MySQL to log everything in the STATEMENT  format by default and automatically switch to the ROW  format when a user issues commands that are not safe for the STATEMENT  format. This means that safe queries that were slow on the source server and replicated in the STATEMENT  format, will experience the same performance issues on the replica too. Unsafe queries are not affected by this behavior, but tables that they modify have to have PRIMARY KEY  defined. Otherwise, the replica would have to do a full table scan for each row updated.

STAY UP-TO-DATE With Percona!

p

Join 50,000+ of your fellow open-source enthusiasts! Our newsletter provides updates on Percona open source software releases, technical resources, and valuable MySQL, MariaDB, PostgreSQL, and MongoDB-related articles. Get information about Percona Live, our technical webinars, and upcoming events and meetups where you can talk with our experts.

Enter your work email address:*

By submitting my information I agree that Percona may use my personal data in send communication to me about Percona services. I understand that I can unsubscribe from the communication at any time in accordance with the Percona Privacy Policy.

Author

Sveta Smirnova

Sveta joined Percona in 2015. Her main professional interests are problem solving, working with tricky issues, bugs, finding patterns that can solve typical issues quicker and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group in MySQL AB-Sun-Oracle.

She is the author of the book "MySQL Troubleshooting" and JSON UDF functions for MySQL.


Leave a Reply Cancel reply


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK