New features in MariaDB/MySQL query optimizer
Sergei Petrunia, MariaDB
MySQL/MariaDB optimizer development
● Some features have common heritage
● Big releases:
– MariaDB 5.3/5.5
– MySQL 5.6
– (upcoming) MariaDB 10.0
New optimizer features
● Subqueries
– FROM
– IN
– Others
● Batched Key Access (MRR)
● Index Condition Pushdown
● Extended Keys
● EXPLAIN UPDATE/DELETE
● PERFORMANCE_SCHEMA
● Engine-independent statistics
● InnoDB persistent statistics
Subqueries in MySQL
● Subqueries are practically unusable
● e.g. Facebook disabled them in the parser
● Reason - “naive execution”.
Naive subquery execution
● For IN (SELECT... ) subqueries:
select * from hotel
where
  hotel.country='USA' and
  hotel.name IN (select hotel_stays.hotel
                 from hotel_stays
                 where hotel_stays.customer='John Smith')
● Naive execution:
for (each hotel in USA ) {
  if (john smith stayed here) {
    …
  }
}
● Slow!
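A minimal Python sketch of the naive loop above, next to one alternative: running the subquery once and keeping its result in a hash set (materialization). The slide does not say which strategy the newer optimizers choose for this query, so materialization is only an illustration here. The sample rows and the `rows_examined` counter are invented.

```python
# Toy data: table and column names follow the slide, the rows are invented.
hotels = [
    {"name": "Grand", "country": "USA"},
    {"name": "Plaza", "country": "USA"},
    {"name": "Ritz", "country": "France"},
    {"name": "Motel 9", "country": "USA"},
]
hotel_stays = [
    {"hotel": "Plaza", "customer": "John Smith"},
    {"hotel": "Ritz", "customer": "John Smith"},
    {"hotel": "Grand", "customer": "Jane Doe"},
]


def naive_in(hotels, hotel_stays, customer):
    """Re-run the subquery for every candidate outer row."""
    rows_examined = 0
    result = []
    for hotel in hotels:
        if hotel["country"] != "USA":
            continue
        found = False
        for stay in hotel_stays:  # the subquery, executed once per hotel
            rows_examined += 1
            if stay["customer"] == customer and stay["hotel"] == hotel["name"]:
                found = True
                break
        if found:
            result.append(hotel["name"])
    return result, rows_examined


def materialized_in(hotels, hotel_stays, customer):
    """Run the subquery once, store its result in a set, then probe the set."""
    rows_examined = 0
    stayed = set()
    for stay in hotel_stays:
        rows_examined += 1
        if stay["customer"] == customer:
            stayed.add(stay["hotel"])
    result = [h["name"] for h in hotels
              if h["country"] == "USA" and h["name"] in stayed]
    return result, rows_examined


naive_result, naive_cost = naive_in(hotels, hotel_stays, "John Smith")
mat_result, mat_cost = materialized_in(hotels, hotel_stays, "John Smith")
print("naive:       ", naive_result, "subquery rows examined:", naive_cost)
print("materialized:", mat_result, "subquery rows examined:", mat_cost)
```

Both functions return the same hotels. The naive version's subquery work grows with the number of outer rows, while the materialized version reads `hotel_stays` once.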
Naive subquery execution (2)
● For FROM(SELECT …) subqueries:
select *
from
  (select *
   from hotel
   where hotel.rooms > 500
  ) as big_hotel
where
  big_hotel.nearest_aiport='AMS';
● Naive execution:
1. Retrieve all hotels with > 500 rooms, store in a temporary table big_hotel;
2. Search in big_hotel for hotels near AMS.
● Slow!
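The two naive steps can be sketched in Python and compared with a single pass that applies both conditions at once, which is roughly what merging the derived table into the outer query achieves. This is an illustration under assumptions: the rows are invented, the column name keeps the slide's spelling, and the slide does not describe how the newer optimizers actually handle this query.

```python
# Invented rows; column names as spelled on the slide.
hotel = [
    {"name": "A", "rooms": 800, "nearest_aiport": "AMS"},
    {"name": "B", "rooms": 650, "nearest_aiport": "JFK"},
    {"name": "C", "rooms": 120, "nearest_aiport": "AMS"},
    {"name": "D", "rooms": 900, "nearest_aiport": "CDG"},
    {"name": "E", "rooms": 510, "nearest_aiport": "AMS"},
]


def naive_derived(hotel):
    """Materialize the whole subquery result first, then filter it."""
    # Step 1: temporary table with every hotel that has > 500 rooms.
    big_hotel = [row for row in hotel if row["rooms"] > 500]
    # Step 2: scan the temporary table for hotels near AMS.
    result = [row["name"] for row in big_hotel
              if row["nearest_aiport"] == "AMS"]
    return result, len(big_hotel)


def merged(hotel):
    """Apply both conditions in one pass; no temporary table is built."""
    result = [row["name"] for row in hotel
              if row["rooms"] > 500 and row["nearest_aiport"] == "AMS"]
    return result, 0


naive_names, naive_temp_rows = naive_derived(hotel)
merged_names, merged_temp_rows = merged(hotel)
print(naive_names, "temporary rows stored:", naive_temp_rows)
print(merged_names, "temporary rows stored:", merged_temp_rows)
```

With both conditions visible in one place, an optimizer is also free to pick an index on either column, which the two-step plan rules out.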
New subquery optimizations
● Handle IN (SELECT ...)
● Handle FROM (SELECT …)
● Handle a lot of cases
● Comparison with PostgreSQL
– ~1000x slower before
– ~same order of magnitude now
● Releases
– MySQL 6.0
– MariaDB 5.5
  ● Sheeri Kritzer @ Mozilla seems happy with this one
– MySQL 5.6
  ● Subset of MariaDB 5.5's features
Subquery optimizations - summary
● Subqueries were generally unusable before MariaDB
5.3/5.5
● “Core” subquery optimizations are in
– MariaDB 5.3/5.5
– MySQL 5.6
● MariaDB has extra additions
● Further information:
https://kb.askmonty.org/en/subquery-optimizations/
Batched Key Access - background
● Big, IO-bound joins were slow
– DBT-3 benchmark could not finish*
● Reason?
● Nested Loops join hits the second table at random
locations.
Batched Key Access idea
Nested Loops Join Batched Key Access
Speedup reasons
● Fewer disk head movements
● Cache-friendliness
● Prefetch-friendliness
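A rough Python model of the idea: instead of looking up each key in the second table as it arrives, collect a buffer of keys, sort it, and do the lookups in key order. The model assumes the second table is stored in key order, so a key doubles as a disk position. The `head_travel` metric and the buffer sizes are invented for illustration and are not how the server measures anything.

```python
import random


def head_travel(positions):
    """Total distance between consecutive accesses, a stand-in for disk seeks."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


def bka_order(keys, buffer_size):
    """Fill a buffer with lookup keys, sort it, emit it, repeat."""
    ordered = []
    for start in range(0, len(keys), buffer_size):
        ordered.extend(sorted(keys[start:start + buffer_size]))
    return ordered


rng = random.Random(0)  # fixed seed so the run is repeatable
# Keys produced by the first table, arriving in effectively random order.
keys = [rng.randrange(1_000_000) for _ in range(10_000)]

nested_loop_cost = head_travel(keys)                     # lookups in arrival order
bka_small_buffer = head_travel(bka_order(keys, 100))     # many small sorted batches
bka_large_buffer = head_travel(bka_order(keys, len(keys)))  # one fully sorted batch

print("nested loops:     ", nested_loop_cost)
print("BKA, small buffer:", bka_small_buffer)
print("BKA, large buffer:", bka_large_buffer)
```

The same lookups are performed in every case; only their order changes. A larger buffer gives longer sorted runs, which is why the join buffer size matters for this optimization.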
Batched Key Access benchmark
set join_cache_level=6; -- enable BKA
select max(l_extendedprice)
from orders, lineitem
where
l_orderkey=o_orderkey and
o_orderdate between $DATE1 and $DATE2
Run with
● Various join_buffer_size settings
● Various size of $DATE1...$DATE2 range
Batched Key Access benchmark (2)
[Chart: BKA join performance depending on buffer size. X axis: buffer size, bytes. Y axis: query time, sec. Series: query_size=1, 2 and 3, each run as a regular join and with BKA. Callouts: "Performance without BKA"; "Performance with BKA, given sufficient buffer size".]
Batched Key Access summary
● Optimization for big, IO-bound joins
– Orders-of-magnitude speedups
● Available in
– MariaDB 5.3/5.5 (more advanced)
– MySQL 5.6
● Not fully automatic yet
– Needs to be manually enabled
– Need to set buffer sizes.
Index Condition Pushdown
alter table lineitem add index s_r (l_shipdate, l_receiptdate);
select count(*) from lineitem
where
l_shipdate between '1993-01-01' and '1993-02-01' and
datediff(l_receiptdate,l_shipdate) > 25 and
l_quantity > 40
● A new feature in MariaDB 5.3/ MySQL 5.6
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
| 1 | SIMPLE | lineitem | range | s_r | s_r | 4 | NULL | 158854 | Using index condition; Using where |
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
1. Read index records in the range
   l_shipdate between '1993-01-01' and '1993-02-01'
2. Check the index condition (new: filters out records before table rows are read)
   datediff(l_receiptdate,l_shipdate) > 25
3. Read full table rows
4. Check the WHERE condition
   l_quantity > 40
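The four steps can be simulated in Python to see where the saving comes from. This sketch assumes an in-memory list standing in for the table and a sorted list of tuples standing in for index `s_r`. The data is randomly generated and `full_row_reads` is an invented counter, not a server statistic.

```python
import bisect
import random
from datetime import date, timedelta

rng = random.Random(1)  # fixed seed so the run is repeatable

# Invented lineitem rows; only the three columns used by the query.
table = []
for _ in range(5000):
    ship = date(1992, 12, 1) + timedelta(days=rng.randrange(120))
    receipt = ship + timedelta(days=rng.randrange(1, 31))
    table.append({"l_shipdate": ship, "l_receiptdate": receipt,
                  "l_quantity": rng.randrange(1, 51)})

# Index s_r: entries sorted by (l_shipdate, l_receiptdate), each pointing at a row.
s_r = sorted((r["l_shipdate"], r["l_receiptdate"], rowid)
             for rowid, r in enumerate(table))


def count_rows(use_icp):
    lo, hi = date(1993, 1, 1), date(1993, 2, 1)
    start = bisect.bisect_left(s_r, (lo,))
    stop = bisect.bisect_left(s_r, (hi + timedelta(days=1),))
    matches = 0
    full_row_reads = 0
    for ship, receipt, rowid in s_r[start:stop]:       # 1. index range scan
        if use_icp and (receipt - ship).days <= 25:     # 2. index condition
            continue                                    #    row is never read
        row = table[rowid]                              # 3. read the full row
        full_row_reads += 1
        if ((row["l_receiptdate"] - row["l_shipdate"]).days > 25
                and row["l_quantity"] > 40):            # 4. WHERE condition
            matches += 1
    return matches, full_row_reads


without_icp = count_rows(use_icp=False)
with_icp = count_rows(use_icp=True)
print("without ICP (matches, full row reads):", without_icp)
print("with ICP    (matches, full row reads):", with_icp)
```

The result is identical either way. With the pushed-down condition, rows that fail the `datediff` test are rejected from the index entry alone, so fewer full rows are fetched.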
Index Condition Pushdown - conclusions
Summary
● Applicable to any index-based access (ref, range, etc)
● Checks parts of WHERE after reading the index
● Reduces number of table records to be read
● Speedup can be like in “Using index”
– Great for IO-bound load (5x, 10x)
– Some for CPU-bound workload (2x)
Conclusions
● Have a selective condition on column?
– Put the column into index, at the end.
Extended keys
● Secondary indexes in InnoDB have invisible “tail”
CREATE TABLE tbl (
  pk1 sometype,
  pk2 sometype,
  ...
  col1 sometype,
  col2 sometype,
  ...
  KEY indexA (col1, col2)
  ...
  PRIMARY KEY (pk1, pk2)
) ENGINE=InnoDB
indexA: col1 col2 pk1 pk2
● Before: optimizer has limited support for “tail” columns
– 'Using index' supports it
– ORDER BY col1, col2, pk1 supports it
● After MariaDB 5.5/ MySQL 5.6
– all parts of optimizer (ref access, range access, etc) can use the “tail”
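A small Python model of why the tail matters. If each `indexA` entry is really the tuple (col1, col2, pk1, pk2), a lookup on `col1`, `col2` and `pk1` can be answered with a single narrow index range instead of a wider range plus a filter. The sorted list, the `lookup_prefix` helper and the rows are all invented for illustration.

```python
import bisect

# Invented rows in (pk1, pk2, col1, col2) order.
rows = [(pk1, pk2, 1, 2) for pk1 in range(10) for pk2 in range(3)]
rows += [(pk1, pk2, 5, 5) for pk1 in range(10, 15) for pk2 in range(2)]

# indexA as InnoDB effectively stores it: declared columns, then the primary key.
index_a = sorted((col1, col2, pk1, pk2) for pk1, pk2, col1, col2 in rows)


def lookup_prefix(index, prefix):
    """Return all index entries whose leading columns equal `prefix`."""
    pos = bisect.bisect_left(index, prefix)
    found = []
    for entry in index[pos:]:
        if entry[:len(prefix)] != prefix:
            break
        found.append(entry)
    return found


# Using only the declared columns: scan every (col1=1, col2=2) entry,
# then filter on pk1 afterwards.
declared_only = lookup_prefix(index_a, (1, 2))
filtered = [e for e in declared_only if e[2] == 7]

# Using the tail as well: pk1 becomes part of the lookup key.
extended = lookup_prefix(index_a, (1, 2, 7))

print("entries scanned with declared columns only:", len(declared_only))
print("entries scanned with extended key:         ", len(extended))
```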
Better EXPLAIN in MySQL 5.6
● EXPLAIN for UPDATE/DELETE/INSERT … SELECT
– shows the query plan for finding the records to update/delete
mysql> explain update customer set c_acctbal = c_acctbal - 100 where c_custkey=12354;
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | customer | range | PRIMARY | PRIMARY | 4 | NULL | 1 | Using where |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
● EXPLAIN FORMAT=JSON
– Produces [big] JSON output
– Shows more information:
  ● Shows conditions attached to tables (the most useful addition!)
  ● Shows whether “Using temporary; using filesort” is done to handle GROUP BY or ORDER BY.
  ● Shows where subqueries are attached
– No other known additions
– Will be in MariaDB 10.0
EXPLAIN FORMAT=JSON
What are the “conditions attached to tables”?
explain
select
count(*)
from
orders, customer
where
customer.c_custkey=orders.o_custkey and
customer.c_mktsegment='BUILDING' and
orders.o_totalprice > customer.c_acctbal and
orders.o_orderpriority='1-URGENT'
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 1509871 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | dbt3sf10.customer.c_custkey | 7 | Using where |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
EXPLAIN FORMAT=JSON (2)
{
"query_block": {
"select_id": 1,
"nested_loop": [
{
"table": {
"table_name": "customer",
"access_type": "ALL",
"possible_keys": [
"PRIMARY"
],
"rows": 1509871,
"filtered": 100,
"attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"
}
},
{
"table": {
"table_name": "orders",
"access_type": "ref",
"possible_keys": [
"i_o_custkey"
],
"key": "i_o_custkey",
"used_key_parts": [
"o_custkey"
],
"key_length": "5",
"ref": [
"dbt3sf10.customer.c_custkey"
],
"rows": 7,
"filtered": 100,
"attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` >
`dbt3sf10`.`customer`.`c_acctbal`))"
}
}
]
}
}
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 1509871 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | dbt3sf10.customer.c_custkey | 7 | Using where |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
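Because the JSON plan is machine-readable, the attached conditions can be pulled out with a few lines of Python. The sketch below works on an abbreviated copy of the output above and only handles the plan shape shown on this slide (a `query_block` with a flat `nested_loop` list). Plans for other queries may nest differently, and the helper name is invented.

```python
import json

# Abbreviated copy of the EXPLAIN FORMAT=JSON output shown on the slide.
explain_output = """
{
  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {"table": {
        "table_name": "customer",
        "access_type": "ALL",
        "rows": 1509871,
        "attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"
      }},
      {"table": {
        "table_name": "orders",
        "access_type": "ref",
        "key": "i_o_custkey",
        "rows": 7,
        "attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"
      }}
    ]
  }
}
"""


def attached_conditions(plan):
    """List (table, access type, attached condition) in join order."""
    result = []
    for step in plan["query_block"]["nested_loop"]:
        table = step["table"]
        result.append((table["table_name"],
                       table["access_type"],
                       table.get("attached_condition")))
    return result


conditions = attached_conditions(json.loads(explain_output))
for name, access, condition in conditions:
    print(f"{name} ({access}): {condition}")
```

This makes visible what the tabular "Using where" hides: the `c_mktsegment` test is checked while scanning `customer`, and both `orders` conditions are checked only after the `ref` lookup.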
EXPLAIN ANALYZE (kind of)
● Does EXPLAIN match the reality?
● Where is most of the time spent?
● MySQL/MariaDB don't have “EXPLAIN ANALYZE” ...
select
count(*)
from
orders, customer
where
customer.c_custkey=orders.o_custkey and
customer.c_mktsegment='BUILDING' and orders.o_orderpriority='1-URGENT'
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 149415 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | customer.c_custkey | 7 | Using index |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
Traditional solution: Status variables
Problems:
● Only #rows counters
● all tables are counted together
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> {run query}
mysql> show status like 'Handler%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_icp_attempts | 0 |
| Handler_icp_match | 0 |
| Handler_mrr_init | 0 |
| Handler_mrr_key_refills | 0 |
| Handler_mrr_rowid_refills | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 30142 |
| Handler_read_last | 0 |
| Handler_read_next | 303959 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 150001 |
| Handler_rollback | 0 |
...
. . .
Newer solution: userstat
● In Facebook patch, Percona, MariaDB:
mysql> set global userstat=1;
mysql> flush table_statistics;
mysql> flush index_statistics;
mysql> {query}
mysql> show table_statistics;
+--------------+------------+-----------+--------------+-------------------------+
| Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
+--------------+------------+-----------+--------------+-------------------------+
| dbt3sf1 | orders | 303959 | 0 | 0 |
| dbt3sf1 | customer | 150000 | 0 | 0 |
+--------------+------------+-----------+--------------+-------------------------+
mysql> show index_statistics;
+--------------+------------+-------------+-----------+
| Table_schema | Table_name | Index_name | Rows_read |
+--------------+------------+-------------+-----------+
| dbt3sf1 | orders | i_o_custkey | 303959 |
+--------------+------------+-------------+-----------+
● Counters are per-table
– Ok as long as you don't have self-joins
● Overhead is negligible
● Counters are server-wide (other queries affect them, too)
Latest addition: PERFORMANCE_SCHEMA
● Allows measuring *time* spent reading each table
● Has some visible overhead (Facebook's tests: 7%)
● Counters are system-wide
● Still no luck with self-joins
mysql> truncate performance_schema.table_io_waits_summary_by_table;
mysql> {query}
mysql> select
object_schema,
object_name,
count_read,
sum_timer_read, -- this is picoseconds
sum_timer_read / (1000*1000*1000*1000) as read_seconds -- this is seconds
from
performance_schema.table_io_waits_summary_by_table
where
object_schema = 'dbt3sf1' and object_name in ('orders','customer');
+---------------+-------------+------------+----------------+--------------+
| object_schema | object_name | count_read | sum_timer_read | read_seconds |
+---------------+-------------+------------+----------------+--------------+
| dbt3sf1 | orders | 334101 | 5739345397323 | 5.7393 |
| dbt3sf1 | customer | 150001 | 1273653046701 | 1.2737 |
+---------------+-------------+------------+----------------+--------------+
What are table/index statistics?
select
count(*)
from
customer, orders
where
customer.c_custkey=orders.o_custkey and customer.c_mktsegment='BUILDING';
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 148305 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | customer.c_custkey | 7 | Using index |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
MariaDB > show table status like 'orders'\G
*************************** 1. row ***************************
Name: orders
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1495152
.............
MariaDB > show keys from orders where key_name='i_o_custkey'\G
*************************** 1. row ***************************
Table: orders
Non_unique: 1
Key_name: i_o_custkey
Seq_in_index: 1
Column_name: o_custkey
Collation: A
Cardinality: 212941
Sub_part: NULL
.................
Where does rows=7 for orders come from?
1495152 / 212941 ≈ 7
“There are on average 7 orders for a given c_custkey”
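The same arithmetic in Python, using the `Rows` value from `show table status` and the `Cardinality` value from `show keys` above. The variable names are only for illustration.

```python
table_rows = 1495152         # Rows, from "show table status like 'orders'"
index_cardinality = 212941   # Cardinality of i_o_custkey, from "show keys"

# Average number of index entries (orders) per distinct o_custkey value.
rows_per_key = table_rows / index_cardinality

print(f"{rows_per_key:.2f} rows per key, shown by EXPLAIN as {round(rows_per_key)}")
```

Both inputs are estimates for InnoDB, so the quotient is an estimate as well.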
The problem with index statistics and InnoDB
MySQL 5.5, InnoDB
● Statistics are calculated on-the-fly
– When the table is opened (server restart, DDL)
– When sufficient number of records have been updated
– ...
● Calculation uses random sampling
– @@innodb_stats_sample_pages
● Result:
– Statistics change without warning
=> Query plans change without warning
● For example, DBT-3 benchmark
– 22 analytics queries
– Plans-per-query: avg=2.8, max=7.
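A toy Python simulation of why sampling makes the numbers unstable. It estimates index cardinality from a handful of randomly chosen pages, once per seed. This is a deliberately simplified stand-in and not InnoDB's actual estimation algorithm. The page layout, the skewed key distribution and the sample size of 8 pages are assumptions chosen for illustration.

```python
import random


def build_pages(n_pages, rows_per_page, rng):
    """Index leaf pages: skewed key values, stored in sorted order."""
    keys = sorted(int(rng.paretovariate(1.2))
                  for _ in range(n_pages * rows_per_page))
    return [keys[i * rows_per_page:(i + 1) * rows_per_page]
            for i in range(n_pages)]


def estimate_cardinality(pages, sample_pages, rng):
    """Count distinct keys on a few random pages and scale up to the whole index."""
    sample = rng.sample(pages, sample_pages)
    avg_distinct_per_page = sum(len(set(p)) for p in sample) / sample_pages
    return avg_distinct_per_page * len(pages)


pages = build_pages(n_pages=1000, rows_per_page=100, rng=random.Random(42))
true_cardinality = len({key for page in pages for key in page})

# Each seed plays the role of one statistics recalculation.
estimates = [estimate_cardinality(pages, 8, random.Random(seed))
             for seed in range(5)]

print("true distinct keys:", true_cardinality)
for seed, estimate in enumerate(estimates):
    print(f"recalculation {seed}: estimated cardinality {estimate:.0f}")
```

Each recalculation looks at different pages, so the estimate moves even though the data does not. A cost-based optimizer fed with those numbers can then switch plans.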
Persistent table statistics
Persistent statistics v1
● Percona Server 5.5 (ported to MariaDB 5.5)
– Need to enable it: innodb_use_sys_stats_table=1
● Statistics are stored inside InnoDB
– User-visible through information_schema.innodb_sys_stats (read-only)
● Setting innodb_stats_auto_update=OFF prevents unexpected updates
Persistent statistics v2
● MySQL 5.6
– Enabled by default: innodb_stats_persistent=1
● Stored in regular InnoDB tables
– mysql.innodb_table_stats, mysql.innodb_index_stats
● Setting innodb_stats_auto_recalc=OFF prevents unexpected updates
● Can also specify persistence/auto-recalc as a table option
Persistent table statistics - summary
● Percona, then MySQL
– Made statistics persistent
– Disallowed automatic updates
● Remaining issue #1: it's still random sampling
– DBT-3 benchmark, scale=30
– Re-ran EXPLAINs for benchmark queries
– Counted different query plans
● Remaining issue #2: limited amount of statistics
– Only on index columns
– Only AVG(#different_values)
Upcoming: Engine-independent statistics
MariaDB 10.0: Engine-independent statistics
● Collected/used on SQL layer
● No auto updates, only ANALYZE TABLE
– 100% precise statistics
● More statistics
– Index statistics (like before)
– Table statistics (like before)
– Column statistics
● MIN/MAX values
● Number of NULL / not NULL values
● Histograms
● => Optimizer will be smarter and more reliable
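A sketch of what the listed column statistics could look like and why a histogram helps, in Python. It assumes an equi-height histogram (bucket boundaries at evenly spaced positions in the sorted values). The slide does not say which histogram type is used, and the function names and data are invented.

```python
def column_stats(values, buckets=4):
    """MIN/MAX, NULL counts and equi-height histogram bounds for one column."""
    non_null = sorted(v for v in values if v is not None)
    n = len(non_null)
    # Boundary i is the value found i/buckets of the way through the sorted data.
    bounds = [non_null[(i * n) // buckets] for i in range(1, buckets)]
    return {"min": non_null[0], "max": non_null[-1],
            "nulls": len(values) - n, "not_nulls": n, "bounds": bounds}


def histogram_fraction_le(stats, x):
    """Estimated fraction of non-NULL values <= x, at whole-bucket granularity."""
    buckets = len(stats["bounds"]) + 1
    return sum(1 for b in stats["bounds"] if b <= x) / buckets


def minmax_fraction_le(stats, x):
    """Same estimate from MIN/MAX only, assuming uniformly spread values."""
    return (x - stats["min"]) / (stats["max"] - stats["min"])


# Skewed column: half of the non-NULL values are 1, the rest are 2..51.
values = [1] * 50 + list(range(2, 52)) + [None] * 10
stats = column_stats(values)

true_fraction = (sum(1 for v in values if v is not None and v <= 27)
                 / stats["not_nulls"])

print("stats:", stats)
print("true fraction <= 27:     ", true_fraction)
print("histogram estimate:      ", histogram_fraction_le(stats, 27))
print("min/max uniform estimate:", minmax_fraction_le(stats, 27))
```

On skewed data the histogram estimate stays close to the true fraction while the uniform assumption drifts, which is the kind of improvement the slide's last bullet points to.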
Conclusions
● Lots of new query optimizer features recently
– Subqueries now just work
– Big joins are much faster
● Need to turn it on
– More diagnostics
● Even more is coming
● Releases with features
– MariaDB 5.5
– MySQL 5.6
– (upcoming) MariaDB 10.0
Thanks
Q & A

ICT role in 21st century education and its challenges
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

New features-in-mariadb-and-mysql-optimizers

  • 1. Sergei Petrunia, MariaDB New features in MariaDB/MySQL query optimizer
  • 2. MySQL/MariaDB optimizer development
    ● Some features have common heritage
    ● Big releases:
      – MariaDB 5.3/5.5
      – MySQL 5.6
      – (upcoming) MariaDB 10.0
  • 3. New optimizer features
    ● Subqueries
      – FROM
      – IN
      – Others
    ● Batched Key Access (MRR)
    ● Index Condition Pushdown
    ● Extended Keys
    ● EXPLAIN UPDATE/DELETE
    ● PERFORMANCE_SCHEMA
    ● Engine-independent statistics
    ● InnoDB persistent statistics
  • 5. Subqueries in MySQL
    ● Subqueries are practically unusable
    ● e.g. Facebook disabled them in the parser
    ● Reason: “naive execution”.
  • 6. Naive subquery execution
    ● For IN (SELECT ...) subqueries:
        select * from hotel
        where
          hotel.country='USA' and
          hotel.name IN (select hotel_stays.hotel
                         from hotel_stays
                         where hotel_stays.customer='John Smith')
    ● Naive execution:
        for (each hotel in USA) {
          if (john smith stayed here) {
            …
          }
        }
    ● Slow!
  • 7. Naive subquery execution (2)
    ● For FROM (SELECT …) subqueries:
        select *
        from
          (select *
           from hotel
           where hotel.rooms > 500
          ) as big_hotel
        where
          big_hotel.nearest_aiport='AMS';
    ● Naive execution:
      1. Retrieve all hotels with > 500 rooms, store in a temporary table big_hotel;
      2. Search in big_hotel for hotels near AMS.
    ● Slow!
  • 8. New subquery optimizations
    ● Handle IN (SELECT ...)
    ● Handle FROM (SELECT …)
    ● Handle a lot of cases
    ● Comparison with PostgreSQL
      – ~1000x slower before
      – ~same order of magnitude now
    ● Releases
      – MySQL 6.0
      – MariaDB 5.5
        ● Sheeri Kritzer @ Mozilla seems happy with this one
      – MySQL 5.6
        ● Subset of MariaDB 5.5's features
  • 9. Subquery optimizations - summary
    ● Subqueries were generally unusable before MariaDB 5.3/5.5
    ● “Core” subquery optimizations are in
      – MariaDB 5.3/5.5
      – MySQL 5.6
    ● MariaDB has extra additions
    ● Further information: https://kb.askmonty.org/en/subquery-optimizations/
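To make the cost difference behind the subquery slides concrete, here is a minimal Python sketch. It is only an illustration of the idea, not the server's implementation: the toy rows, the function names and the `rows_examined` counter are all assumptions made for this example. The first function re-runs the inner scan for every outer row (the naive plan); the second evaluates the subquery once into a set and probes it, which is roughly what a materialization-style strategy does.

```python
# Toy data standing in for the hotel and hotel_stays tables (hypothetical rows).
hotels = [
    {"name": "Plaza", "country": "USA"},
    {"name": "Ritz", "country": "France"},
    {"name": "Drake", "country": "USA"},
]
hotel_stays = [
    {"hotel": "Drake", "customer": "John Smith"},
    {"hotel": "Ritz", "customer": "John Smith"},
    {"hotel": "Plaza", "customer": "Jane Doe"},
]


def naive_in_subquery(hotels, hotel_stays, customer):
    """Re-scan hotel_stays for every USA hotel, as in the naive plan."""
    result, rows_examined = [], 0
    for hotel in hotels:
        if hotel["country"] != "USA":
            continue
        for stay in hotel_stays:  # inner scan repeated per outer row
            rows_examined += 1
            if stay["customer"] == customer and stay["hotel"] == hotel["name"]:
                result.append(hotel["name"])
                break
    return result, rows_examined


def materialized_in_subquery(hotels, hotel_stays, customer):
    """Evaluate the subquery once into a set, then probe the set."""
    rows_examined = 0
    stayed = set()
    for stay in hotel_stays:  # single pass over the inner table
        rows_examined += 1
        if stay["customer"] == customer:
            stayed.add(stay["hotel"])
    result = [h["name"] for h in hotels
              if h["country"] == "USA" and h["name"] in stayed]
    return result, rows_examined


print(naive_in_subquery(hotels, hotel_stays, "John Smith"))
print(materialized_in_subquery(hotels, hotel_stays, "John Smith"))
```

Both functions return the same hotels; only the number of inner rows examined differs, and the gap grows with the number of outer rows.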
  • 11. Batched Key Access - background
    ● Big, IO-bound joins were slow
      – DBT-3 benchmark could not finish*
    ● Reason?
    ● Nested Loops join hits the second table at random locations.
  • 12. Batched Key Access idea
    [Diagram: Nested Loops Join vs. Batched Key Access]
    Speedup reasons
    ● Fewer disk head movements
    ● Cache-friendliness
    ● Prefetch-friendliness
  • 13. Batched Key Access benchmark
        set join_cache_level=6;  -- enable BKA
        select max(l_extendedprice)
        from orders, lineitem
        where
          l_orderkey=o_orderkey and
          o_orderdate between $DATE1 and $DATE2
    Run with
    ● Various join_buffer_size settings
    ● Various sizes of the $DATE1...$DATE2 range
  • 14. Batched Key Access benchmark (2)
    [Chart: BKA join performance depending on buffer size. X axis: buffer size, bytes. Y axis: query time, sec. Series: query_size=1, 2 and 3, each for regular and BKA execution. Annotations: “Performance without BKA” and “Performance with BKA, given sufficient buffer size”.]
  • 15. Batched Key Access summary
    ● Optimization for big, IO-bound joins
      – Orders-of-magnitude speedups
    ● Available in
      – MariaDB 5.3/5.5 (more advanced)
      – MySQL 5.6
    ● Not fully automatic yet
      – Needs to be manually enabled
      – Need to set buffer sizes.
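The effect of batching and of the buffer size can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how the server implements BKA/MRR: the `index` dict maps a join key to a made-up on-disk position, and `seek_distance` is a hypothetical stand-in for disk head movement.

```python
def nested_loop_lookups(outer_keys, index):
    """Read inner rows one key at a time, in outer-row order."""
    return [index[k] for k in outer_keys if k in index]


def batched_key_access(outer_keys, index, buffer_size):
    """Collect keys into a buffer, then read the matching rows in position order."""
    read_order = []
    for start in range(0, len(outer_keys), buffer_size):
        batch = outer_keys[start:start + buffer_size]
        # Sorting positions within a batch turns random reads into a sweep.
        read_order.extend(sorted(index[k] for k in batch if k in index))
    return read_order


def seek_distance(read_order):
    """Total distance travelled between consecutive reads (toy cost model)."""
    return sum(abs(b - a) for a, b in zip(read_order, read_order[1:]))


# Hypothetical key -> position mapping and outer-row key order.
index = {1: 10, 2: 20, 3: 30, 4: 40, 5: 50}
outer_keys = [5, 1, 4, 2, 3]

print(seek_distance(nested_loop_lookups(outer_keys, index)))    # regular join
print(seek_distance(batched_key_access(outer_keys, index, 3)))  # small buffer
print(seek_distance(batched_key_access(outer_keys, index, 5)))  # buffer fits all keys
```

In this toy model a larger buffer gives longer sorted sweeps, which matches the slide's point that BKA pays off once the join buffer is large enough.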
  • 17. Index Condition Pushdown
    ● A new feature in MariaDB 5.3 / MySQL 5.6
        alter table lineitem add index s_r (l_shipdate, l_receiptdate);
        select count(*) from lineitem
        where
          l_shipdate between '1993-01-01' and '1993-02-01' and
          datediff(l_receiptdate,l_shipdate) > 25 and
          l_quantity > 40
    +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
    | id | select_type | table    | type  | possible_keys | key  | key_len | ref  | rows   | Extra                              |
    +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
    |  1 | SIMPLE      | lineitem | range | s_r           | s_r  | 4       | NULL | 158854 | Using index condition; Using where |
    +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
    1. Read index records in the range l_shipdate between '1993-01-01' and '1993-02-01'
    2. Check the index condition datediff(l_receiptdate,l_shipdate) > 25
       ← New! Filters out records before table rows are read
    3. Read full table rows
    4. Check the WHERE condition l_quantity > 40
  • 18. Index Condition Pushdown - conclusions
    Summary
    ● Applicable to any index-based access (ref, range, etc)
    ● Checks parts of WHERE after reading the index
    ● Reduces number of table records to be read
    ● Speedup can be like in “Using index”
      – Great for IO-bound load (5x, 10x)
      – Some for CPU-bound workload (2x)
    Conclusions
    ● Have a selective condition on column?
      – Put the column into index, at the end.
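The four ICP steps can be mimicked with a small Python sketch that counts how many full table rows get read with and without the pushed-down check. The index entries, row contents and the `count_matches` helper are invented for illustration; only the column names and conditions come from the slide.

```python
from datetime import date

# Hypothetical entries of index s_r: (l_shipdate, l_receiptdate, rowid).
index_s_r = [
    (date(1993, 1, 5), date(1993, 1, 10), 0),
    (date(1993, 1, 7), date(1993, 2, 15), 1),
    (date(1993, 1, 20), date(1993, 3, 1), 2),
    (date(1993, 3, 1), date(1993, 4, 15), 3),
]
# Hypothetical full table rows, keyed by rowid.
table_rows = {
    0: {"l_quantity": 45},
    1: {"l_quantity": 50},
    2: {"l_quantity": 10},
    3: {"l_quantity": 60},
}


def count_matches(use_icp):
    """Return (matching rows, full table rows read) for the slide's query."""
    lo, hi = date(1993, 1, 1), date(1993, 2, 1)
    matches = table_reads = 0
    for ship, receipt, rowid in index_s_r:
        if not (lo <= ship <= hi):          # step 1: range on l_shipdate
            continue
        index_cond = (receipt - ship).days > 25
        if use_icp and not index_cond:      # step 2: check before touching the table
            continue
        row = table_rows[rowid]             # step 3: read the full table row
        table_reads += 1
        if index_cond and row["l_quantity"] > 40:   # step 4: rest of WHERE
            matches += 1
    return matches, table_reads


print(count_matches(use_icp=False))
print(count_matches(use_icp=True))
```

The result count is identical in both modes; with the index condition checked early, fewer full rows are fetched, which is where the IO savings come from.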
  • 19. Extended keys
    ● Secondary indexes in InnoDB have invisible “tail”
        CREATE TABLE tbl (
          pk1 sometype,
          pk2 sometype,
          ...
          col1 sometype,
          col2 sometype,
          ...
          KEY indexA (col1, col2)
          ...
          PRIMARY KEY (pk1, pk2)
        ) ENGINE=InnoDB
      indexA: col1 col2 pk1 pk2
    ● Before: optimizer has limited support for “tail” columns
      – 'Using index' supports it
      – ORDER BY col1, col2, pk1 supports it
    ● After MariaDB 5.5 / MySQL 5.6
      – all parts of optimizer (ref access, range access, etc) can use the “tail”
  • 21. Better EXPLAIN in MySQL 5.6
    ● EXPLAIN for UPDATE/DELETE/INSERT … SELECT
      – shows the query plan for finding the records to update/delete
    mysql> explain update customer set c_acctbal = c_acctbal - 100 where c_custkey=12354;
    +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
    | id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
    +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
    |  1 | SIMPLE      | customer | range | PRIMARY       | PRIMARY | 4       | NULL |    1 | Using where |
    +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
    ● EXPLAIN FORMAT=JSON
      – Produces [big] JSON output
      – Shows more information:
        ● Shows conditions attached to tables
        ● Shows whether “Using temporary; using filesort” is done to handle GROUP BY or ORDER BY.
        ● Shows where subqueries are attached
      – No other known additions
      – Will be in MariaDB 10.0
    The most useful addition!
  • 22. EXPLAIN FORMAT=JSON
    What are the “conditions attached to tables”?
        explain select count(*) from orders, customer
        where
          customer.c_custkey=orders.o_custkey and
          customer.c_mktsegment='BUILDING' and
          orders.o_totalprice > customer.c_acctbal and
          orders.o_orderpriority='1-URGENT'
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    | id | select_type | table    | type | possible_keys | key         | key_len | ref                         | rows    | Extra       |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    |  1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                        | 1509871 | Using where |
    |  1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3sf10.customer.c_custkey |       7 | Using where |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
  • 23. EXPLAIN FORMAT=JSON (2)
    {
      "query_block": {
        "select_id": 1,
        "nested_loop": [
          {
            "table": {
              "table_name": "customer",
              "access_type": "ALL",
              "possible_keys": [ "PRIMARY" ],
              "rows": 1509871,
              "filtered": 100,
              "attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"
            }
          },
          {
            "table": {
              "table_name": "orders",
              "access_type": "ref",
              "possible_keys": [ "i_o_custkey" ],
              "key": "i_o_custkey",
              "used_key_parts": [ "o_custkey" ],
              "key_length": "5",
              "ref": [ "dbt3sf10.customer.c_custkey" ],
              "rows": 7,
              "filtered": 100,
              "attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"
            }
          }
        ]
      }
    }
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    | id | select_type | table    | type | possible_keys | key         | key_len | ref                         | rows    | Extra       |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    |  1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                        | 1509871 | Using where |
    |  1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3sf10.customer.c_custkey |       7 | Using where |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
  • 24. EXPLAIN ANALYZE (kind of)
    ● Does EXPLAIN match the reality?
    ● Where is most of the time spent?
    ● MySQL/MariaDB don't have “EXPLAIN ANALYZE” ...
        select count(*) from orders, customer
        where
          customer.c_custkey=orders.o_custkey and
          customer.c_mktsegment='BUILDING' and
          orders.o_orderpriority='1-URGENT'
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    | id   | select_type | table    | type | possible_keys | key         | key_len | ref                | rows   | Extra       |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    |    1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL               | 149415 | Using where |
    |    1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | customer.c_custkey |      7 | Using index |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
  • 25. Traditional solution: Status variables
    mysql> flush status;
    Query OK, 0 rows affected (0.00 sec)
    mysql> {run query}
    mysql> show status like 'Handler%';
    +----------------------------+--------+
    | Variable_name              | Value  |
    +----------------------------+--------+
    | Handler_commit             | 1      |
    | Handler_delete             | 0      |
    | Handler_discover           | 0      |
    | Handler_icp_attempts       | 0      |
    | Handler_icp_match          | 0      |
    | Handler_mrr_init           | 0      |
    | Handler_mrr_key_refills    | 0      |
    | Handler_mrr_rowid_refills  | 0      |
    | Handler_prepare            | 0      |
    | Handler_read_first         | 0      |
    | Handler_read_key           | 30142  |
    | Handler_read_last          | 0      |
    | Handler_read_next          | 303959 |
    | Handler_read_prev          | 0      |
    | Handler_read_rnd           | 0      |
    | Handler_read_rnd_deleted   | 0      |
    | Handler_read_rnd_next      | 150001 |
    | Handler_rollback           | 0      |
    ...
    Problems:
    ● Only #rows counters
    ● All tables are counted together
  • 26. Newer solution: userstat
    ● In Facebook patch, Percona, MariaDB:
    mysql> set global userstat=1;
    mysql> flush table_statistics;
    mysql> flush index_statistics;
    mysql> {query}
    mysql> show table_statistics;
    +--------------+------------+-----------+--------------+-------------------------+
    | Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
    +--------------+------------+-----------+--------------+-------------------------+
    | dbt3sf1      | orders     |    303959 |            0 |                       0 |
    | dbt3sf1      | customer   |    150000 |            0 |                       0 |
    +--------------+------------+-----------+--------------+-------------------------+
    mysql> show index_statistics;
    +--------------+------------+-------------+-----------+
    | Table_schema | Table_name | Index_name  | Rows_read |
    +--------------+------------+-------------+-----------+
    | dbt3sf1      | orders     | i_o_custkey |    303959 |
    +--------------+------------+-------------+-----------+
    ● Counters are per-table
      – Ok as long as you don't have self-joins
    ● Overhead is negligible
    ● Counters are server-wide (other queries affect them, too)
  • 27. Latest addition: PERFORMANCE_SCHEMA
    ● Allows measuring *time* spent reading each table
    ● Has some visible overhead (Facebook's tests: 7%)
    ● Counters are system-wide
    ● Still no luck with self-joins
    mysql> truncate performance_schema.table_io_waits_summary_by_table;
    mysql> {query}
    mysql> select
             object_schema,
             object_name,
             count_read,
             sum_timer_read,                                          -- this is picoseconds
             sum_timer_read / (1000*1000*1000*1000) as read_seconds   -- this is seconds
           from
             performance_schema.table_io_waits_summary_by_table
           where
             object_schema = 'dbt3sf1' and object_name in ('orders','customer');
    +---------------+-------------+------------+----------------+--------------+
    | object_schema | object_name | count_read | sum_timer_read | read_seconds |
    +---------------+-------------+------------+----------------+--------------+
    | dbt3sf1       | orders      |     334101 |  5739345397323 |       5.7393 |
    | dbt3sf1       | customer    |     150001 |  1273653046701 |       1.2737 |
    +---------------+-------------+------------+----------------+--------------+
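The unit conversion in the PERFORMANCE_SCHEMA query can be checked by hand. A minimal Python sketch, using the two sum_timer_read values shown on the slide (the helper name is an assumption for this example):

```python
PICOSECONDS_PER_SECOND = 1000 * 1000 * 1000 * 1000  # same divisor as in the query


def picoseconds_to_seconds(picoseconds):
    """Convert a sum_timer_read value (picoseconds) to seconds."""
    return picoseconds / PICOSECONDS_PER_SECOND


# sum_timer_read values taken from the slide's result set.
orders_seconds = picoseconds_to_seconds(5739345397323)
customer_seconds = picoseconds_to_seconds(1273653046701)

print(round(orders_seconds, 4), round(customer_seconds, 4))
```

Rounded to four decimals these match the read_seconds column, so most of the measured read time in this query went to the orders table.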
  • 29. What are table/index statistics?
        select count(*) from customer, orders
        where
          customer.c_custkey=orders.o_custkey and
          customer.c_mktsegment='BUILDING';
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    | id   | select_type | table    | type | possible_keys | key         | key_len | ref                | rows   | Extra       |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    |    1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL               | 148305 | Using where |
    |    1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | customer.c_custkey |      7 | Using index |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    Where does rows=7 for orders come from?
    MariaDB > show table status like 'orders'\G
    *************************** 1. row ***************************
               Name: orders
             Engine: InnoDB
            Version: 10
         Row_format: Compact
               Rows: 1495152
    .............
    MariaDB > show keys from orders where key_name='i_o_custkey'\G
    *************************** 1. row ***************************
            Table: orders
       Non_unique: 1
         Key_name: i_o_custkey
     Seq_in_index: 1
      Column_name: o_custkey
        Collation: A
      Cardinality: 212941
         Sub_part: NULL
    .................
    1495152 / 212941 ≈ 7
    “There are on average 7 orders for a given c_custkey”
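The rows=7 estimate is plain arithmetic on the two statistics shown above. A small Python sketch, using the Rows and Cardinality values from the slide (the function name is made up for this example, and the real optimizer's rounding may differ):

```python
def rows_per_key(table_rows, index_cardinality):
    """Average number of rows per distinct key value."""
    return table_rows / index_cardinality


# Rows from SHOW TABLE STATUS and Cardinality from SHOW KEYS, as on the slide.
estimate = rows_per_key(1495152, 212941)

print(estimate)         # a little over 7
print(round(estimate))  # the figure shown in the EXPLAIN rows column
```

Because both inputs are themselves estimates (sampled, in InnoDB's case), the ratio moves whenever either statistic is recalculated.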
  • 30. The problem with index statistics and InnoDB
    MySQL 5.5, InnoDB
    ● Statistics are calculated on-the-fly
      – When the table is opened (server restart, DDL)
      – When a sufficient number of records have been updated
      – ...
    ● Calculation uses random sampling
      – @@innodb_stats_sample_pages
    ● Result:
      – Statistics change without warning => query plans change without warning
    ● For example, DBT-3 benchmark
      – 22 analytics queries
      – Plans-per-query: avg=2.8, max=7.
  • 31. Persistent table statistics
    Persistent statistics v1
    ● Percona Server 5.5 (ported to MariaDB 5.5)
      – Need to enable it: innodb_use_sys_stats_table=1
    ● Statistics are stored inside InnoDB
      – User-visible through information_schema.innodb_sys_stats (read-only)
    ● Setting innodb_stats_auto_update=OFF prevents unexpected updates
    Persistent statistics v2
    ● MySQL 5.6
      – Enabled by default: innodb_stats_persistent=1
    ● Stored in regular InnoDB tables
      – mysql.innodb_table_stats, mysql.innodb_index_stats
    ● Setting innodb_stats_auto_recalc=OFF prevents unexpected updates
    ● Can also specify persistence/auto-recalc as a table option
  • 32. Persistent table statistics - summary
    ● Percona, then MySQL
      – Made statistics persistent
      – Disallowed automatic updates
    ● Remaining issue #1: it's still random sampling
      – DBT-3 benchmark
      – scale=30
      – Re-ran EXPLAINs for benchmark queries
      – Counted different query plans
    ● Remaining issue #2: limited amount of statistics
      – Only on index columns
      – Only AVG(#different_values)
  • 33. Upcoming: Engine-independent statistics
    MariaDB 10.0: Engine-independent statistics
    ● Collected/used on SQL layer
    ● No auto updates, only ANALYZE TABLE
      – 100% precise statistics
    ● More statistics
      – Index statistics (like before)
      – Table statistics (like before)
      – Column statistics
        ● MIN/MAX values
        ● Number of NULL / not NULL values
        ● Histograms
    ● => Optimizer will be smarter and more reliable
  • 34. Conclusions
    ● Lots of new query optimizer features recently
      – Subqueries now just work
      – Big joins are much faster
        ● Need to turn it on
      – More diagnostics
    ● Even more is coming
    ● Releases with features
      – MariaDB 5.5
      – MySQL 5.6
      – (upcoming) MariaDB 10.0