This is the final part of our webinar trilogy on MySQL Query Tuning, in which we looked at the query tuning process and the tools that help with it. We’ve covered topics such as SQL tuning, indexing, the optimizer and how to leverage EXPLAIN to gain insight into execution plans. Part 3: Working with the optimizer and SQL tuning.
AGENDA
Optimizer
- How execution plans are calculated
- InnoDB statistics
Hinting the optimizer
- Index hints
- JOIN order modifications
- Tweakable optimizations
Optimizing SQL
SPEAKER
Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.
Webinar slides: MySQL Query Tuning Trilogy: Working with optimizer and SQL tuning
1. Copyright 2016 Severalnines AB
Your host & some logistics
I'm Jean-Jérôme from the Severalnines Team
and I'm your host for today's webinar!
Feel free to ask any questions in the Questions
section of this application or via the Chat box.
You can also contact me directly via the chat
box or via email: jj@severalnines.com during
or after the webinar.
MySQL Query Tuning - Hinting the optimizer and improving query
performance
October 25, 2016
Krzysztof Książek
Severalnines
krzysztof@severalnines.com
Agenda
! InnoDB index statistics
! MySQL cost model
! Hints
! Index hints
! Legacy optimizer hints syntax
! New (5.7) optimizer hint syntax
! Optimizing SQL
InnoDB index statistics
! Query execution plans are calculated based on InnoDB index statistics
! Before MySQL 5.6, the default behavior was to recalculate statistics when
! ANALYZE TABLE was explicitly executed
! SHOW TABLE STATUS, SHOW TABLES or SHOW INDEX was executed
! Either 1/16th of the table or 2 billion rows were modified
! To calculate statistics, InnoDB samples 8 index pages
! That is only 128KB of data (8 pages of 16KB each) used to calculate statistics for, e.g., a 100GB index
! Use innodb_stats_transient_sample_pages to change that
! The query execution plan may change after statistics are recalculated
InnoDB index statistics
! In MySQL 5.6 statistics became persistent by default
! They are not recalculated for every SHOW TABLE STATUS and similar commands
! They are updated when an explicit ANALYZE TABLE is run on the table or when more than 10% of the rows in the table were modified
! As a result, query execution plans became more stable
! They are also calculated from a larger sample - 20 index pages
! Manageable through innodb_stats_persistent_sample_pages variable
! You can disable persistent statistics using innodb_stats_persistent
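A minimal sketch of how the persistent statistics settings above could be inspected and tuned. It assumes the Sakila sample database is installed; the value 64 is an arbitrary illustration, not a recommendation.

```sql
-- Check the current sampling settings
SHOW GLOBAL VARIABLES LIKE 'innodb_stats%';

-- Sample more pages for persistent statistics (64 is an illustrative value)
SET GLOBAL innodb_stats_persistent_sample_pages = 64;

-- The same settings can be overridden for a single table
ALTER TABLE sakila.film_actor STATS_PERSISTENT = 1, STATS_SAMPLE_PAGES = 64;

-- Recalculate the statistics explicitly
ANALYZE TABLE sakila.film_actor;

-- Persistent statistics are stored in a regular table and can be queried
SELECT index_name, stat_name, stat_value, sample_size
  FROM mysql.innodb_index_stats
 WHERE database_name = 'sakila'
   AND table_name = 'film_actor';
```

A larger sample makes ANALYZE TABLE slower, so it is worth increasing only for tables where the estimates are visibly off.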
MySQL cost model
! To determine the most efficient query execution plan, MySQL has to assess the costs of different plans
! The least expensive one is picked
! Each operation - reading data from memory or from disk, creating a temporary table in memory or on disk, comparing rows, evaluating row conditions - has its own cost assigned
! Historically, those numbers were hardcoded and couldn’t be changed
! This changed with MySQL 5.7 - new tables were added in the mysql schema
! server_cost
! engine_cost
MySQL cost model
! disk_temptable_create_cost, disk_temptable_row_cost - cost to create and maintain an on-disk temporary table - by default 40 and 1
! memory_temptable_create_cost, memory_temptable_row_cost - cost to create and maintain an in-memory temporary table - by default 2 and 0.2
! key_compare_cost - cost to compare record keys (more expensive - less likely filesort will be used) - by default 0.1
! row_evaluate_cost - cost to evaluate rows (more expensive - more likely an index will be used for the scan) - by default 0.2
MySQL cost model
! engine_name - InnoDB/MyISAM - by default all are affected
! device_type - not used, but in the future you could have different costs for different types of I/O devices
! io_block_read_cost - cost of reading an index or data page from disk - by default 1
! memory_block_read_cost - cost of reading index or data page from memory - by default 1
MySQL cost model
! The optimizer is undergoing refactoring and rewriting - new features will follow
! Even now, in MySQL 5.7, you can modify costs which used to be hardcoded and tweak them according to your hardware
! Disk operations will be less expensive on PCIe SSD than on spindles
! You can tweak engine_cost and server_cost to reflect that
! You can always revert your changes by setting the cost values back to NULL
! Make sure you run FLUSH OPTIMIZER_COSTS; to apply your changes
! Use SHOW STATUS LIKE 'Last_query_cost'; to check the cost of the last executed query
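The workflow described above might look as follows on MySQL 5.7. This is a sketch: the value 0.5 is purely illustrative, and the test query assumes the Sakila sample database. As far as the MySQL documentation describes it, new cost values are picked up by sessions started after the flush, so reconnect before comparing plans.

```sql
-- Inspect the cost constants; NULL in cost_value means the built-in default is in use
SELECT cost_name, cost_value FROM mysql.server_cost;
SELECT engine_name, device_type, cost_name, cost_value FROM mysql.engine_cost;

-- Illustrative change: make reading a page from disk cheaper than the default of 1
UPDATE mysql.engine_cost
   SET cost_value = 0.5
 WHERE cost_name = 'io_block_read_cost';
FLUSH OPTIMIZER_COSTS;

-- In a new session: run a query, then check the cost the optimizer estimated for it
SELECT COUNT(*) FROM sakila.film_actor;
SHOW STATUS LIKE 'Last_query_cost';

-- Revert to the default by setting the value back to NULL
UPDATE mysql.engine_cost
   SET cost_value = NULL
 WHERE cost_name = 'io_block_read_cost';
FLUSH OPTIMIZER_COSTS;
```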
Index hints
! USE INDEX - tells the optimizer that it should use one of the listed indexes
! FORCE INDEX - a full table scan is marked as an extremely expensive operation and therefore won’t be used by the optimizer - as long as any of the listed indexes could be used for our particular query
! IGNORE INDEX - tells the optimizer which indexes we don’t want it to consider
Index hints
! Hints can be located in different places
! JOIN actor AS a IGNORE INDEX FOR JOIN (idx_actor_last_name)
! FORCE INDEX FOR ORDER BY(idx_actor_first_name)
! Following options are available:
! FORCE INDEX FOR JOIN (idx_myindex)
! FORCE INDEX FOR ORDER BY (idx_myindex)
! FORCE INDEX FOR GROUP BY (idx_myindex)
! FORCE INDEX (idx_myindex) aggregates all of those above
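To make the three hint types concrete, here is a short sketch against the actor table of the Sakila sample database, which ships with an index named idx_actor_last_name. The WHERE conditions are arbitrary; compare the EXPLAIN output of each statement to see what the optimizer does with the hint.

```sql
-- Suggest an index; the optimizer may still prefer a table scan if it looks cheaper
EXPLAIN SELECT actor_id, first_name, last_name
  FROM actor USE INDEX (idx_actor_last_name)
 WHERE last_name LIKE 'W%';

-- Make a table scan look extremely expensive so the listed index wins whenever it is usable
EXPLAIN SELECT actor_id, first_name, last_name
  FROM actor FORCE INDEX (idx_actor_last_name)
 WHERE last_name LIKE 'W%';

-- Take an index out of consideration entirely
EXPLAIN SELECT actor_id, first_name, last_name
  FROM actor IGNORE INDEX (idx_actor_last_name)
 WHERE last_name LIKE 'W%';

-- Limit the hint to one phase of query execution, here the sort
EXPLAIN SELECT actor_id, first_name, last_name
  FROM actor FORCE INDEX FOR ORDER BY (idx_actor_last_name)
 ORDER BY last_name;
```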
Index hints
! When you execute a query with JOINs, the MySQL optimizer has to decide the order in which the tables should be joined
! The result is not always optimal
! STRAIGHT_JOIN can be used to force the order in which tables will be joined
! Works for JOIN only - LEFT and RIGHT JOINs already enforce some order
! Let’s assume this query on the Sakila database:
EXPLAIN SELECT actor_id, title FROM film_actor AS fa JOIN film AS f ON fa.film_id = f.film_id ORDER BY fa.actor_id\G
Index hints - join order modifiers
! Let’s say we want to avoid a temporary table
! The following query will do the trick - note that STRAIGHT_JOIN is used:
! EXPLAIN SELECT STRAIGHT_JOIN actor_id, title FROM film_actor AS fa JOIN film AS f ON fa.film_id = f.film_id ORDER BY fa.actor_id\G
! Tables will be joined in film_actor -> film order
Index hints - join order modifiers
! You can also manipulate the join order within the query
! SELECT STRAIGHT_JOIN * FROM tab1 JOIN tab2 ON tab1.a = tab2.a JOIN tab3 ON tab2.b = tab3.b;
! The only option left to the optimizer will be: tab1, tab2, tab3
! SELECT * FROM tab1 JOIN tab2 ON tab1.a = tab2.a STRAIGHT_JOIN tab3 ON tab2.b = tab3.b;
! Two different options are possible now
! tab1, tab2, tab3
! tab2, tab3, tab1
Controlling the optimizer - optimizer switch
! Over time the MySQL optimizer has improved and new algorithms were added
! MariaDB added its own set of optimizations and optimizer features
! Some of those features can be disabled by the user on global and session level
! SET GLOBAL optimizer_switch='index_merge=off';
! SET SESSION optimizer_switch='index_merge=off';
! Sometimes this is the only way to make sure your query will be executed in an optimal way
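A typical pattern, sketched here as one possible approach, is to switch an optimization off for the current session only, run the problematic query, and then restore the default so that other queries are not affected.

```sql
-- See which optimizations are currently enabled
SELECT @@optimizer_switch;

-- Disable index merge for this session only
SET SESSION optimizer_switch = 'index_merge=off';

-- ... run the query that suffers from index merge here ...

-- Restore the default value of this one flag
SET SESSION optimizer_switch = 'index_merge=default';
```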
Controlling the optimizer - optimizer hints (5.7)
! As of MySQL 5.7.7, a new way of controlling the optimizer has been added
! Hints use /*+ … */ syntax within the query
! They take precedence over the optimizer_switch variable
! Work on multiple levels:
! Global
! Query block
! Table
! Index
Controlling the optimizer - optimizer hints (5.7)
Hint Name | Description | Applicable Scopes
BKA, NO_BKA | Affects Batched Key Access join processing | Query block, table
BNL, NO_BNL | Affects Block Nested-Loop join processing | Query block, table
MAX_EXECUTION_TIME | Limits statement execution time | Global
MRR, NO_MRR | Affects Multi-Range Read optimization | Table, index
NO_ICP | Affects Index Condition Pushdown optimization | Table, index
NO_RANGE_OPTIMIZATION | Affects range optimization | Table, index
QB_NAME | Assigns name to query block | Query block
SEMIJOIN, NO_SEMIJOIN | Affects semi-join strategies | Query block
SUBQUERY | Affects materialization, IN-to-EXISTS subquery strategies | Query block
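Query-block hints are easiest to understand with a named block. The sketch below assumes the Sakila sample database; the block name subq is arbitrary, and the chosen strategies are examples rather than recommendations.

```sql
-- Name the subquery block with QB_NAME, then refer to it from the outer hint
SELECT /*+ SEMIJOIN(@subq MATERIALIZATION) */ f.title
  FROM film AS f
 WHERE f.film_id IN (SELECT /*+ QB_NAME(subq) */ fa.film_id
                       FROM film_actor AS fa
                      WHERE fa.actor_id = 1);

-- Alternatively, forbid the semi-join transformation for that block
-- and pick the subquery strategy directly inside it
SELECT /*+ NO_SEMIJOIN(@subq) */ f.title
  FROM film AS f
 WHERE f.film_id IN (SELECT /*+ QB_NAME(subq) SUBQUERY(MATERIALIZATION) */ fa.film_id
                       FROM film_actor AS fa
                      WHERE fa.actor_id = 1);
```

Running both through EXPLAIN shows whether the hint changed the plan; a hint that cannot be applied is ignored and reported as a warning rather than an error.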
Controlling the optimizer - optimizer hints (5.7)
! Can be used at the beginning of a statement:
! SELECT /*+ ... */ ...
! INSERT /*+ ... */ ...
! REPLACE /*+ ... */ ...
! UPDATE /*+ ... */ ...
! DELETE /*+ ... */ ...
! Can be used in subqueries:
! (SELECT /*+ ... */ ... )
! (SELECT ... ) UNION (SELECT /*+ ... */ ... )
! (SELECT /*+ ... */ ... ) UNION (SELECT /*+ ... */ ... )
! UPDATE ... WHERE x IN (SELECT /*+ ... */ ...)
! INSERT ... SELECT /*+ ... */ ...
Controlling the optimizer - optimizer hints (5.7)
! Can be used on a table level:
! SELECT /*+ NO_BKA(t1, t2) */ t1.* FROM t1 INNER JOIN t2 INNER JOIN t3;
! SELECT /*+ NO_BNL() BKA(t1) */ t1.* FROM t1 INNER JOIN t2 INNER JOIN t3;
! Can be used on an index level:
! SELECT /*+ MRR(t1) */ * FROM t1 WHERE f2 <= 3 AND 3 <= f3;
! SELECT /*+ NO_RANGE_OPTIMIZATION(t3 PRIMARY, f2_idx) */ f1 FROM t3 WHERE f1 > 30 AND f1 < 33;
! INSERT INTO t3(f1, f2, f3) (SELECT /*+ NO_ICP(t2) */ t2.f1, t2.f2, t2.f3 FROM t1,t2 WHERE t1.f1=t2.f1 AND t2.f2 BETWEEN t1.f1 AND t1.f2 AND t2.f2 + 1 >= t1.f1 + 1);
Controlling the optimizer - optimizer hints (5.7)
! SELECT /*+ MAX_EXECUTION_TIME(1000) */ * …
! The limit is given in milliseconds and applies to the whole SELECT query
! Only applies to read-only SELECTs (does not apply to SELECTs which invoke a stored routine)
! Does not apply to SELECTs in stored routines
! Very convenient way of adding safety - if you are not sure how long a query will take, limit its maximum execution time
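As a sketch of the safety net described above, the hint can be attached to an individual query, or, on MySQL 5.7.8 and later, the same limit can be set for the session through the max_execution_time variable. The query uses the Sakila sample database and the timeouts are arbitrary.

```sql
-- Abort this SELECT if it runs longer than 1000 milliseconds
SELECT /*+ MAX_EXECUTION_TIME(1000) */ fa.actor_id, f.title
  FROM film_actor AS fa
  JOIN film AS f ON fa.film_id = f.film_id
 ORDER BY fa.actor_id;

-- Session-wide limit for read-only SELECTs, in milliseconds
SET SESSION max_execution_time = 2000;

-- 0 disables the limit again
SET SESSION max_execution_time = 0;
```

A query that exceeds the limit is interrupted and returns an error, so the application has to be ready to handle it.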
Pros and cons of using hints
! Enable you to fix optimizer mistakes
! Sometimes it’s the fastest way of solving a performance issue
! Faster than, for example, adding an index
! Allow you to disable some parts of the functionality of the optimizer
! Hardcoded hints can become a problem
! When you remove indexes: (ERROR 1176 (42000): Key 'idx_b' doesn't exist in table 'tab')
! When you upgrade to the next major MySQL version (hint syntax may change)
! When data distribution changes and a new plan becomes optimal
Optimizing SQL
! MySQL is getting better at executing queries with every release
! What didn’t work in the past may work better in the latest version - subqueries, for example
! MySQL 5.5: MySQL 5.6/5.7:
Optimizing SQL
! In MySQL 5.5 a subquery usually has to be rewritten as a JOIN
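The query shown on the original slide is not part of this transcript, so the following is an illustrative rewrite on the Sakila sample database rather than the presenter's example.

```sql
-- Subquery form: MySQL 5.5 tends to execute IN (SELECT ...) as a dependent subquery
SELECT f.title
  FROM film AS f
 WHERE f.film_id IN (SELECT fa.film_id
                       FROM film_actor AS fa
                      WHERE fa.actor_id = 1);

-- Equivalent JOIN form, which older optimizers handle much better
SELECT f.title
  FROM film AS f
  JOIN film_actor AS fa ON fa.film_id = f.film_id
 WHERE fa.actor_id = 1;
```

The two forms return the same rows here because (actor_id, film_id) is the primary key of film_actor, so the join cannot produce duplicates. In the general case a JOIN rewrite may need DISTINCT to match the subquery's result.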
Optimizing SQL
! When looking at JOIN queries, make sure columns used to join tables are properly indexed
! Non-indexed joins are the most common SQL anti-pattern, and the most expensive one too
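A quick way to check a join is to EXPLAIN it and look at the access type of the joined table. The sketch below reuses the generic tab1 and tab2 names from the earlier slides; the index name idx_a is made up for the example.

```sql
-- Without an index on tab2.a, expect a full scan of tab2 (type: ALL) in the plan
EXPLAIN SELECT tab1.a, tab2.b
  FROM tab1
  JOIN tab2 ON tab1.a = tab2.a;

-- Index the join column (idx_a is a hypothetical name)
ALTER TABLE tab2 ADD INDEX idx_a (a);

-- The same EXPLAIN should now be able to use an index lookup on tab2
EXPLAIN SELECT tab1.a, tab2.b
  FROM tab1
  JOIN tab2 ON tab1.a = tab2.a;
```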
Optimizing SQL
! Be aware of LIMIT - it may not actually limit the number of rows scanned - use ranges instead
! LIMIT 9000,10 would access 9010 rows to return 10
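One common way to "use ranges instead" is to remember the last key returned by the previous page and continue from there. The sketch below uses the rental table from the Sakila sample database; the value 9000 stands in for whatever key the application saw last.

```sql
-- Offset pagination: reads 9010 rows and throws the first 9000 away
SELECT rental_id, rental_date
  FROM rental
 ORDER BY rental_id
 LIMIT 9000, 10;

-- Range pagination: starts right after the last key seen and reads only 10 rows
SELECT rental_id, rental_date
  FROM rental
 WHERE rental_id > 9000
 ORDER BY rental_id
 LIMIT 10;
```

The two queries return the same page only when the key has no gaps; the range form trades "jump to page N" for a cost that stays constant no matter how deep the page is.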
Optimizing SQL
! When using UNION in your query, use UNION ALL where possible; otherwise DISTINCT is implied, which requires additional processing of the data - a temporary table is created with an index on it
! UNION ALL also requires a temporary table (a requirement removed in MySQL 5.7), but no index is created
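For illustration, here are the two variants side by side on the Sakila sample database. Keep in mind that they are interchangeable only when duplicates are impossible or acceptable in the result.

```sql
-- Plain UNION implies DISTINCT: rows are deduplicated through an indexed temporary table
SELECT first_name, last_name FROM actor
UNION
SELECT first_name, last_name FROM customer;

-- UNION ALL skips the deduplication step and returns every row from both sides
SELECT first_name, last_name FROM actor
UNION ALL
SELECT first_name, last_name FROM customer;
```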
Optimizing SQL
! For GROUP BY and ORDER BY - try to make sure an index is used, otherwise a temporary table will be created or a filesort has to be performed
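The difference shows up in the Extra column of EXPLAIN. The sketch below uses the payment table from the Sakila sample database, where customer_id is indexed and amount is not; the index name idx_amount is made up for the example.

```sql
-- Grouping on an indexed column: the index delivers rows already in group order
EXPLAIN SELECT customer_id, COUNT(*)
  FROM payment
 GROUP BY customer_id;

-- Grouping on a column without an index: look for "Using temporary" and/or
-- "Using filesort" in the Extra column
EXPLAIN SELECT amount, COUNT(*)
  FROM payment
 GROUP BY amount;

-- Adding an index on the grouping column (hypothetical name) lets MySQL avoid that work
ALTER TABLE payment ADD INDEX idx_amount (amount);
```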
Optimizing SQL
! In a JOIN, try to sort and aggregate using columns from a single table only - such a case can be indexed. If you GROUP BY or ORDER BY using columns from both tables, it can’t be indexed
! The only way MySQL can use multiple indexes (but from the same table) is through index merge
! And it’s not the fastest way of retrieving the data (more details on another slide)
! There’s definitely no way to use indexes across multiple tables
Optimizing SQL
! Avoid ORDER BY RAND() - it will create a temporary table
! Always
! ORDER BY RAND() is evil
! In app - generate random numbers from MIN(pk), MAX(pk) range
! Use them in WHERE pk=… or pk IN ( … )
! PK lookup - fast and efficient
! Verify you got correct number of rows, if not - repeat the process
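The database side of this approach can be sketched as two cheap queries, shown here on the film table of the Sakila sample database. The random numbers themselves are generated by the application; the IDs in the second query are placeholders for such values.

```sql
-- Step 1: fetch the primary key range once (this can be cached by the application)
SELECT MIN(film_id) AS min_id, MAX(film_id) AS max_id
  FROM film;

-- Step 2: the application draws random integers between min_id and max_id
-- and looks the rows up by primary key (placeholder values shown)
SELECT film_id, title
  FROM film
 WHERE film_id IN (17, 243, 512, 780, 931);

-- Step 3: if fewer rows come back than requested (gaps in the key),
-- draw more numbers and repeat step 2
```

Drawing a few more IDs than needed reduces the number of repeat round trips when the key has gaps.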
Optimizing SQL
! Parallelize queries and aggregate them within the application - MySQL cannot use multiple cores per query (although there are worklogs regarding that so it may change in the future)
! Use home-grown scripts
! Use https://shardquery.com
! Parallel processing may not always be feasible, but, if it can be used, it can speed up data processing significantly.
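A home-grown version of this idea can be as simple as splitting one aggregate into primary key ranges and running each range on its own connection. The sketch below uses the payment table from the Sakila sample database; the split point of 8000 is arbitrary.

```sql
-- Single-threaded original
SELECT SUM(amount) AS total, COUNT(*) AS cnt
  FROM payment;

-- Connection 1: first slice of the primary key range
SELECT SUM(amount) AS partial_total, COUNT(*) AS partial_cnt
  FROM payment
 WHERE payment_id <= 8000;

-- Connection 2: the remainder, executed at the same time as the first slice
SELECT SUM(amount) AS partial_total, COUNT(*) AS partial_cnt
  FROM payment
 WHERE payment_id > 8000;

-- The application adds the partial totals and counts together
```

SUM and COUNT combine trivially; an average has to be derived from the combined total and count rather than by averaging the partial averages.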
Thank You!
! Blog posts covering query tuning process:
! http://severalnines.com/blog/become-mysql-dba-blog-series-optimizer-hints-faster-query-execution
! Register for other upcoming webinars:
! http://severalnines.com/upcoming-webinars
! Install ClusterControl:
! http://severalnines.com/getting-started
! Contact: jj@severalnines.com