Sergei Petrunia
MariaDB devroom
FOSDEM 2021
Join Optimizer
1. How it works
2. What we’re working on to improve it
Optimizer Call
July 2022
Sergei Petrunia
MariaDB
2
Join order search
●
Total number of possible join orders
for N-table join is:
N * (N-1) * (N-2) *… = N!
●
Join orders are built left-to-right
●
Cannot enumerate all possible join
orders.
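To get a feel for how quickly N! grows, here is a minimal Python sketch that counts join orders for a few values of N and enumerates them for a tiny join. The function names are made up for this illustration; the table names are the ones from the diagram, and none of this is MariaDB code.

```python
import math
from itertools import permutations

def count_join_orders(n_tables: int) -> int:
    """Number of left-to-right join orders for an n-table join (n!)."""
    return math.factorial(n_tables)

def all_join_orders(tables):
    """Enumerate every join order; only feasible for very small joins."""
    return list(permutations(tables))

# The count explodes long before N reaches the size of a large real-world join.
for n in (4, 10, 20):
    print(n, count_join_orders(n))

orders = all_join_orders(["t1", "t2", "t3", "t4"])
print(len(orders), orders[0])
```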
[Figure: join order search tree over tables t1..t4]
3
Pruning
●
Enumerate promising join orders
first.
●
Do not explore join orders that are
apparently worse.
[Figure: join order search tree over tables t1..t4]
4
Pruning # 1: by cost
●
Cost of the current_prefix is
already higher than the total cost of
the best plan found so far.
– Adding tables will make it even
higher
– No point in trying.
●
This pruning is always done (no
switch)
●
Optimizer trace: pruned_by_cost
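The cost-based pruning above can be sketched as a depth-first search that abandons a prefix as soon as its cost reaches the best complete plan found so far. This is only an illustration under a toy cost model: the `Table` fields `read_time` and `fanout`, the `search` function and all numbers are assumptions made for the example, not the optimizer's real structures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Table:
    name: str
    read_time: float  # assumed cost of one lookup into this table
    fanout: float     # assumed rows produced per incoming row combination

def search(tables):
    """Depth-first join order search with cost-based pruning (toy model)."""
    best = {"cost": float("inf"), "order": None}
    stats = {"pruned_by_cost": 0}

    def extend(prefix, remaining, cost, record_count):
        if not remaining:
            if cost < best["cost"]:
                best["cost"], best["order"] = cost, [t.name for t in prefix]
            return
        for t in remaining:
            new_cost = cost + record_count * t.read_time
            if new_cost >= best["cost"]:
                # Adding more tables can only raise the cost: stop here.
                stats["pruned_by_cost"] += 1
                continue
            rest = [r for r in remaining if r is not t]
            extend(prefix + [t], rest, new_cost, record_count * t.fanout)

    extend([], list(tables), 0.0, 1.0)
    return best, stats

tables = [Table("t1", 1.0, 1.0), Table("t2", 2.0, 10.0), Table("t3", 5.0, 100.0)]
best, stats = search(tables)
print(best, stats)
```

Because the cost never decreases when a table is added, this pruning cannot discard the optimal plan; it only saves work.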
[Figure: join order search tree over tables t1..t4]
5
pruned_by_cost weaknesses
●
A really expensive table at the
end of the join order.
●
Any prefix that doesn’t include it is
relatively cheap
– Even if it’s comparatively worse:
–
–
●
=> No pruning.
[Figure: join order search tree over tables t1..t4; labels: "t1 t4", "t2 t4"]
6
Pruning # 2: by heuristic
●
Adding a table tX to a join prefix
– Adds read_time (time to read tX)
– Produces record_count row
combinations to be joined with further
tables (aka “join suffix”).
– Both have an effect on the total cost:
●
read_time is time spent right now.
●
record_count will affect cost of join
suffix.
– We don’t know the “exchange ratio”
because we don’t know the costs of
“join suffix”.
[Figure: join prefix t0 extended by tables t1, t2, t3; labels: incoming_record_count, record_count_t1, record_count_t2, record_count_t3, read_time_t1]
7
The idea behind the heuristic
– … we don’t know the “exchange ratio”
because we don’t know the costs of “join
suffix”
●
Also the suffixes are different!
– Let’s assume the suffixes have similar costs.
– Then, if
●
read_time_t1 < read_time_t2, AND
●
record_count_t1 < record_count_t2
– Then t1 “is better” than t2.
– Can prune away t2.
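The comparison above can be written as a small predicate. This is a sketch only: the `Extension` type, the `is_better` name and the numbers are invented for the example.

```python
from typing import NamedTuple

class Extension(NamedTuple):
    table: str
    read_time: float     # time to read the table when added to the prefix
    record_count: float  # row combinations passed on to the join suffix

def is_better(a: Extension, b: Extension) -> bool:
    """True if a beats b on both read_time and record_count."""
    return a.read_time < b.read_time and a.record_count < b.record_count

t1 = Extension("t1", read_time=10.0, record_count=100.0)
t2 = Extension("t2", read_time=25.0, record_count=400.0)
t3 = Extension("t3", read_time=5.0, record_count=900.0)

print(is_better(t1, t2))  # t1 wins on both measures, so t2 can be pruned
print(is_better(t1, t3))  # no verdict: t3 is cheaper to read but produces more rows
```

When the two measures disagree, as for t1 and t3, the heuristic cannot rank the tables, because the "exchange ratio" between read_time and record_count is unknown.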
[Figure: join prefix t0 extended by tables t1, t2, t3; labels: incoming_record_count, record_count_t1, record_count_t2, record_count_t3, read_time_t1]
8
Applying heuristic pruning
●
Do it locally in each join prefix
●
Consider more promising
tables first.
●
Less-promising tables second
– And try to prune them away.
[Figure: join order search tree over tables t1..t4]
9
Pruning # 2: by heuristic
●
A Model Table (yes, I’ve just invented this term):
– Lowest read_time AND record_count seen so
far
– Either
●
record_count < 2.0, or
●
there are no possible "key dependencies" on
tables not in the prefix
– A “possible key dependency” is an equality of the form:
tbl.keyXpartY=expr(tables_not_in_prefix)
●
^^ this is a “heuristic” to apply the heuristic.
●
Prune away tables that have both worse read_time
and record_count than the Model Table.
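The Model Table rule can be sketched as one pass over the candidate tables for a given prefix. The names (`Candidate`, `has_possible_key_dependency`, `prune_by_heuristic`) are hypothetical, and counting ties as "worse" is an assumption of this sketch rather than something the slide states.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Candidate:
    table: str
    read_time: float
    record_count: float
    # Assumed flag: the table has a "possible key dependency" on a table
    # that is not in the prefix.
    has_possible_key_dependency: bool

def can_be_model(c: Candidate) -> bool:
    """The "heuristic to apply the heuristic" from the slide."""
    return c.record_count < 2.0 or not c.has_possible_key_dependency

def prune_by_heuristic(candidates: List[Candidate]) -> Tuple[List[str], List[str]]:
    model: Optional[Candidate] = None
    explored, pruned = [], []
    for c in candidates:
        # Prune tables that are no better than the model on both measures.
        if (model is not None
                and c.read_time >= model.read_time
                and c.record_count >= model.record_count):
            pruned.append(c.table)
            continue
        explored.append(c.table)
        # The model is the lowest read_time AND record_count seen so far.
        if can_be_model(c) and (
                model is None
                or (c.read_time <= model.read_time
                    and c.record_count <= model.record_count)):
            model = c
    return explored, pruned

candidates = [
    Candidate("t1", 10.0, 100.0, False),
    Candidate("t2", 25.0, 400.0, False),
    Candidate("t3", 5.0, 900.0, False),
    Candidate("t4", 8.0, 50.0, True),
]
explored, pruned = prune_by_heuristic(candidates)
print(explored, pruned)
```

In this example t4 is explored but never becomes the model, because it produces 50 rows and still has a possible key dependency.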
[Figure: join order search tree over tables t1..t4]
10
How one can see heuristic pruning
●
@@optimizer_prune_level
– 0 – not enabled.
– 1 – enabled (the default)
●
Optimizer trace: grep for
“pruned_by_heuristic”
[Figure: join order search tree over tables t1..t4]
11
Greedy Search
12
Greedy search
●
Consider only prefixes of limited size
– Based on that, pick the first table
– Repeat
●
@@optimizer_search_depth
– Default: 62
(both MySQL and MariaDB)
– 0 – “pick depth automatically”
●
Why is this not default yet?
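A minimal sketch of the greedy loop described above: look ahead at most `search_depth` tables, commit only the first table of the best partial plan, then repeat. The toy cost model (`read_time`, `fanout`) and all numbers are assumptions for illustration, `search_depth` must be at least 1 here, and the "0 = pick depth automatically" setting is not modelled.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass(frozen=True)
class Table:
    name: str
    read_time: float  # assumed cost of one lookup
    fanout: float     # assumed rows produced per incoming row combination

def best_extension(remaining: Sequence[Table], cost: float, rows: float,
                   depth: int) -> Tuple[float, List[Table]]:
    """Cheapest way to add up to `depth` more tables (exhaustive within depth)."""
    if depth == 0 or not remaining:
        return cost, []
    best_cost, best_path = float("inf"), []
    for t in remaining:
        rest = [r for r in remaining if r is not t]
        c, path = best_extension(rest, cost + rows * t.read_time,
                                 rows * t.fanout, depth - 1)
        if c < best_cost:
            best_cost, best_path = c, [t] + path
    return best_cost, best_path

def greedy_search(tables: Sequence[Table], search_depth: int) -> Tuple[List[str], float]:
    prefix: List[Table] = []
    remaining = list(tables)
    cost, rows = 0.0, 1.0
    while remaining:
        # Look ahead a limited number of tables, then keep only the first one.
        _, path = best_extension(remaining, cost, rows, search_depth)
        first = path[0]
        prefix.append(first)
        remaining.remove(first)
        cost += rows * first.read_time
        rows *= first.fanout
    return [t.name for t in prefix], cost

tables = [Table("t1", 1.0, 1.0), Table("t2", 2.0, 10.0),
          Table("t3", 5.0, 100.0), Table("t4", 1.0, 0.5)]
print(greedy_search(tables, search_depth=1))
print(greedy_search(tables, search_depth=4))
```

With these made-up numbers a depth of 1 settles on a slightly more expensive plan than a depth of 4, which is the trade-off that `@@optimizer_search_depth` controls: less search time against the risk of missing a better plan.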
[Figure: join order search tree over tables t1..t4]
13
MDEV-28073
(fixed in 10.6)
14
MDEV-28073: patch #1: “edge tables”
●
If the suffix t1-t4-t3 uses only eq_ref or similar:
– It is [nearly] the best
– Don’t enumerate other table combinations.
●
They can’t be much better.
●
Optimizer trace: pruned_by_hanging_leaf
[Figure: join order search tree over tables t1..t4]
commit b729896d00e022f6205399376c0cc107e1ee0704
Author: Monty <monty@mariadb.org>
Date: Tue May 10 11:47:20 2022 +0300
MDEV-28073 Query performance degradation in newer MariaDB versions when
using many tables
The issue was that best_extension_by_limited_search() had to go through
too many plans with the same cost as there where many EQ_REF tables.
Fixed by shortcutting EQ_REF (AND REF) when the result only contains one
row. This got the optimization time down from hours to sub seconds.
15
MDEV-28073: patch #2: key_dependent
select ...
from
person, car_rides, bicycle_rides
where
person.name=car_rides.rider and
person.name=bicycle_rides.rider and
...
[Figure: tables car_rides and bicycle_rides, each joined to person on name]
●
Remember the “heuristics to apply the heuristic” a few slides above:
– there are no possible "key dependencies" on tables not in the prefix
It can be false due to multi-equalities:
●
person.name=bicycle_rides.rider is a “possible key dependency”.
●
But we already have person.name from car_rides.rider (the equality is “bound”)
– Trying join orders with bicycle_rides before person won’t produce a better plan.
●
Solution: adjust the heuristic: there are no possible key_dependencies on tables not in
the prefix that are not already bound.
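One way to picture the adjusted check is to treat each multi-equality as a set of columns and call a key column "bound" once some table already in the prefix supplies a value for it. This is a sketch of that reading only; the function names and the string representation of columns are hypothetical and do not mirror the server's key_dependent bitmaps.

```python
def table_of(column: str) -> str:
    """'person.name' -> 'person'."""
    return column.split(".")[0]

def has_unbound_key_dependency(key_column, multi_equalities, prefix):
    """True if key_column could still get its lookup value from a table outside
    the prefix and no table inside the prefix already supplies one."""
    for equal_columns in multi_equalities:
        if key_column not in equal_columns:
            continue
        others = {table_of(c) for c in equal_columns if c != key_column}
        outside = others - set(prefix)
        inside = others & set(prefix)
        if outside and not inside:
            return True
    return False

multi_equalities = [{"person.name", "car_rides.rider", "bicycle_rides.rider"}]

# With car_rides in the prefix, person.name is already bound.
print(has_unbound_key_dependency("person.name", multi_equalities, ["car_rides"]))
# With an empty prefix, the dependency is still open.
print(has_unbound_key_dependency("person.name", multi_equalities, []))
```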
16
MDEV-28073: patch #3: table order de-scrambling
●
The optimizer should try good tables first
●
Implemented by taking tables off the unused portion of
join->best_ref array.
– Initially it’s ordered (“promising” tables first)
– But due to a bug it eventually gets out of order
●
Plan searches that enumerate many options could
suffer from poor pruning towards the end.
[Figure: join order search tree over tables t1..t4]
Author: Michael Widenius <monty@mariadb.org>
Date: Sun May 15 15:46:29 2022 +0300
greedy_search() and best_extension_by_limited_search() scrambled table order
best_extension_by_limited_search() assumes that tables should be sorted
according to size to be able to quickly disregard bad plans. However the
current usage of swap_variables() will change the table order to a not
sorted one for the next recursive call. This breaks the assumtion and
causes performance issues when using many tables (we have to examine
many more plans).
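The scrambling described in the commit message above is easy to reproduce on a plain list. The sketch below represents tables by their sizes and compares picking a table by swapping with picking it by shifting; both helper functions are illustrative stand-ins, not the code from the patch.

```python
def pick_by_swap(tables, first_unused, picked):
    """Move tables[picked] to position first_unused by swapping the two entries."""
    tables = list(tables)
    tables[first_unused], tables[picked] = tables[picked], tables[first_unused]
    return tables

def pick_by_rotate(tables, first_unused, picked):
    """Move tables[picked] to position first_unused, shifting the others right."""
    tables = list(tables)
    tables.insert(first_unused, tables.pop(picked))
    return tables

def tail_is_sorted(tables, first_unused):
    """Is the still-unused part of the array ordered (promising tables first)?"""
    tail = tables[first_unused:]
    return tail == sorted(tail)

sizes = [10, 20, 30, 40]            # initially sorted, smallest table first
swapped = pick_by_swap(sizes, 0, 2)
rotated = pick_by_rotate(sizes, 0, 2)

print(swapped, tail_is_sorted(swapped, 1))
print(rotated, tail_is_sorted(rotated, 1))
```

After the swap, the unused tail is no longer ordered, so the next recursive call no longer sees promising tables first.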
17
MDEV-28852
(MariaDB 10.10)
18
Local table pre-sorting
19
In which order do we try the tables?
●
Current:
– join_tab_cmp() orders all tables by their
JOIN_TAB::found_records
(records after table’s condition is checked)
– The same ordering is used everywhere
– This *ignores* the join prefix and efficient
table read plans we can use
– e.g. here, ignores the prefix of t1:
[Figure: join order search tree over tables t1..t4, with the prefix t1 and its candidate extensions t2, t3, t4]
20
In which order do we try the tables?
●
First, evaluate possible table accesses for {t2,t3,t4}.
●
Sort them by #found_rows
●
Then try extending join orders
– Do all kinds of pruning while doing this
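The difference between one global ordering and a per-prefix ordering can be sketched as follows. The row estimates, the `ref_rows` map (rows per lookup when a table is reachable from a table in the prefix) and the function names are all assumptions made for the example.

```python
def estimated_rows(table, prefix, full_scan_rows, ref_rows):
    """Rows read from `table` per incoming row combination, given the prefix."""
    usable = [rows for (src, dst), rows in ref_rows.items()
              if dst == table and src in prefix]
    # Fall back to the prefix-independent estimate if no lookup is possible.
    return min(usable, default=full_scan_rows[table])

def global_order(tables, full_scan_rows):
    """One ordering used everywhere, ignoring the join prefix."""
    return sorted(tables, key=lambda t: full_scan_rows[t])

def local_order(tables, prefix, full_scan_rows, ref_rows):
    """Ordering recomputed for the current join prefix."""
    return sorted(tables,
                  key=lambda t: estimated_rows(t, prefix, full_scan_rows, ref_rows))

full_scan_rows = {"t2": 100, "t3": 1000, "t4": 50000}
ref_rows = {("t1", "t4"): 1}   # assumed: t4 can be read by a unique key lookup from t1

print(global_order(["t2", "t3", "t4"], full_scan_rows))
print(local_order(["t2", "t3", "t4"], ["t1"], full_scan_rows, ref_rows))
```

With the prefix t1, the large table t4 becomes the most promising extension, which a prefix-independent ordering would try last.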
[Figure: join order search subtree under the prefix t1, extended by t2, t3, t4]
commit 0762dd9283185c72c6955f44fc4d862a0a928569
Author: Monty <monty@mariadb.org>
Date: Tue May 31 17:36:32 2022 +0300
Improve pruning in greedy_search by sorting tables during search
MDEV-28073 Slow query performance in MariaDB when using many tables
The faster we can find a good query plan, the more options we have for
finding and pruning (ignoring) bad plans.
This patch adds sorting of plans to best_extension_by_limited_search().
21
Improving the pruning
(MDEV-28929)
22
Remember: the idea behind the heuristic
– … we don’t know the “exchange ratio”
because we don’t know the costs of “join
suffix”
●
Also the suffixes are different!
– Let’s assume the suffixes have similar costs.
– Then, if
●
read_time_t1 < read_time_t2, AND
●
record_count_t1 < record_count_t2
– Then t1 “is better” than t2.
– Can prune away t2.
–
[Figure: join prefix t0 extended by tables t1, t2, t3; labels: incoming_record_count, record_count_t1, record_count_t2, record_count_t3, read_time_t1]
23
Let’s plot the tables
[Figure: plot with axes records_read and read_time, showing table t1]
24
Let’s plot the tables
[Figure: plot with axes records_read and read_time, showing table t1 and the regions “Better than t1” and “Worse than t1”]
25
Let’s plot the tables
●
“Typical” situation: plans with
higher cost produce more rows.
– A lot of opportunities to do
pruning
●
The optimizer orders plans by
records_read
– (that is, goes left-to-right)
– Pick the first plan as “Model”,
prune those that are worse.
[Figure: tables t0..t4 plotted by records_read and read_time]
26
When pruning doesn’t work
[Figure: tables t0..t4 plotted by records_read and read_time]
●
“Bad” situation:
– plans with high cost produce
few rows
– And vice versa
●
Can’t do pruning.
27
When pruning could work but doesn’t
[Figure: tables t0..t4 plotted by records_read and read_time]
●
Walking left-to-right, the optimizer
picks t0 as Model table.
●
And then can’t prune away any
other table.
Tables that are worse than t0 are here
28
How to do as much pruning as possible?
[Figure: plot with axes records_read and read_time]
●
Pick a minimal set of Model
tables that allows pruning away
the rest?
●
Complexity seems to be at least
N^2.
●
Some approximate algorithm?
– Use the table with min_cost
– Use the table with
min_records_read
●
Have a patch with some
approximate implementation
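The approximation mentioned above (use the table with the fewest records_read and the table with the lowest cost as Model tables) can be sketched like this. It is not the actual patch: the `Plan` type, the function names and the numbers are invented to mimic the "could work but doesn't" plot.

```python
from typing import List, NamedTuple, Set

class Plan(NamedTuple):
    table: str
    records_read: float
    read_time: float

def dominated(p: Plan, model: Plan) -> bool:
    """p is no better than model on either axis (and is not the model itself)."""
    return (p is not model
            and p.records_read >= model.records_read
            and p.read_time >= model.read_time)

def prune_with_models(plans: List[Plan], models: List[Plan]) -> Set[str]:
    """Names of the tables that at least one Model table allows us to prune."""
    return {p.table for p in plans if any(dominated(p, m) for m in models)}

plans = [
    Plan("t0", records_read=1, read_time=90),   # few rows, but expensive
    Plan("t1", records_read=20, read_time=10),
    Plan("t2", records_read=40, read_time=30),
    Plan("t3", records_read=60, read_time=50),
    Plan("t4", records_read=80, read_time=70),
]
first_by_rows = min(plans, key=lambda p: p.records_read)  # t0
cheapest = min(plans, key=lambda p: p.read_time)          # t1

print(prune_with_models(plans, [first_by_rows]))
print(prune_with_models(plans, [first_by_rows, cheapest]))
```

With t0 as the only Model table nothing is pruned; adding the cheapest table as a second Model prunes t2, t3 and t4.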
29
eq_ref chaining
30
Motivation
●
Tables with “attributes” that are joined using Primary Key
select * from base_table, attr1, attr2, ... attrN
where
attr1.pk = base_table.pk and
attr2.pk = base_table.pk and
...
attrN.pk = base_table.pk
●
Lots of nearly-identical query plans: There are factorial(n_attributes) permutations
– Have the same or very close cost
●
=> Can’t do pruning
●
The fix with “Edge tables” aka pruned_by_hanging_leaf helps but only if the
attributes are at the end of the join order.
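A toy calculation shows why such queries defeat pruning: if every attribute table is read with a one-row primary key lookup, every permutation of the attribute tables costs exactly the same. The cost model and the numbers below are assumptions chosen only to illustrate the point.

```python
from itertools import permutations

def plan_cost(order, read_time, fanout):
    """Toy cost: each table is read once per incoming row combination."""
    cost, rows = 0.0, 1.0
    for t in order:
        cost += rows * read_time[t]
        rows *= fanout[t]
    return cost

attrs = ["attr1", "attr2", "attr3", "attr4"]
# Assumed numbers: base_table yields 1000 rows; each attrN lookup returns one row.
read_time = {"base_table": 100.0, **{a: 1.0 for a in attrs}}
fanout = {"base_table": 1000.0, **{a: 1.0 for a in attrs}}

orders = [("base_table",) + p for p in permutations(attrs)]
costs = {plan_cost(o, read_time, fanout) for o in orders}
print(len(orders), "join orders,", len(costs), "distinct cost(s)")
```

Since no order is cheaper than another, cost-based pruning never triggers, and the search walks all factorial(n_attributes) permutations.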
31
eq_ref chaining
●
The idea: if we see an eq_ref access, try
considering only eq_refs as long as we can.
●
MySQL 5.7 has a similar optimization
– TODO: describe the differences.
[Figure: join order search tree over tables t1..t4]
commit 5abb6bff6cfb5cb5d87520f1e32e9b41db46bd7b
Author: Monty <monty@mariadb.org>
Date: Thu Jun 2 19:47:23 2022 +0300
Added EQ_REF chaining to the greedy_optimizer
MDEV-28073 Slow query performance in MariaDB when using many table
The idea is to prefer and chain EQ_REF tables (tables that uses an
unique key to find a row) when searching for the best table combination.
This significantly reduces row combinations that has to be examined.
This is optimization is enabled when setting optimizer_prune_level=2 (default)
32
Thanks for your attention!

Contenu connexe

Tendances

How to use histograms to get better performance
How to use histograms to get better performanceHow to use histograms to get better performance
How to use histograms to get better performanceMariaDB plc
 
Advanced MySQL Query Tuning
Advanced MySQL Query TuningAdvanced MySQL Query Tuning
Advanced MySQL Query TuningAlexander Rubin
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performanceoysteing
 
Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLSergey Petrunya
 
MySQL partitions tutorial
MySQL partitions tutorialMySQL partitions tutorial
MySQL partitions tutorialGiuseppe Maxia
 
How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15oysteing
 
ANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gemANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gemSergey Petrunya
 
MySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer GuideMySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer GuideMorgan Tocker
 
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013Sergey Petrunya
 
MySQL Indexing - Best practices for MySQL 5.6
MySQL Indexing - Best practices for MySQL 5.6MySQL Indexing - Best practices for MySQL 5.6
MySQL Indexing - Best practices for MySQL 5.6MYXPLAIN
 
MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsOSSCube
 
Real-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBay
Real-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBayReal-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBay
Real-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBayAltinity Ltd
 
Mysql Explain Explained
Mysql Explain ExplainedMysql Explain Explained
Mysql Explain ExplainedJeremy Coates
 

Tendances (20)

How to use histograms to get better performance
How to use histograms to get better performanceHow to use histograms to get better performance
How to use histograms to get better performance
 
Advanced MySQL Query Tuning
Advanced MySQL Query TuningAdvanced MySQL Query Tuning
Advanced MySQL Query Tuning
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performance
 
Sql query patterns, optimized
Sql query patterns, optimizedSql query patterns, optimized
Sql query patterns, optimized
 
Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQL
 
MySQL partitions tutorial
MySQL partitions tutorialMySQL partitions tutorial
MySQL partitions tutorial
 
How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15
 
ANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gemANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gem
 
MySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer GuideMySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer Guide
 
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
 
MySQL Indexing - Best practices for MySQL 5.6
MySQL Indexing - Best practices for MySQL 5.6MySQL Indexing - Best practices for MySQL 5.6
MySQL Indexing - Best practices for MySQL 5.6
 
SQL
SQLSQL
SQL
 
Software Testing
Software TestingSoftware Testing
Software Testing
 
SQL JOINS
SQL JOINSSQL JOINS
SQL JOINS
 
How to Use JSON in MySQL Wrong
How to Use JSON in MySQL WrongHow to Use JSON in MySQL Wrong
How to Use JSON in MySQL Wrong
 
MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 Tips
 
2D Array
2D Array2D Array
2D Array
 
Merge sort
Merge sortMerge sort
Merge sort
 
Real-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBay
Real-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBayReal-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBay
Real-time, Exactly-once Data Ingestion from Kafka to ClickHouse at eBay
 
Mysql Explain Explained
Mysql Explain ExplainedMysql Explain Explained
Mysql Explain Explained
 

Similaire à MariaDB's join optimizer: how it works and current fixes

Algorithim lec1.pptx
Algorithim lec1.pptxAlgorithim lec1.pptx
Algorithim lec1.pptxrediet43
 
Advanced query optimization
Advanced query optimizationAdvanced query optimization
Advanced query optimizationMYXPLAIN
 
Basics in algorithms and data structure
Basics in algorithms and data structure Basics in algorithms and data structure
Basics in algorithms and data structure Eman magdy
 
PostgreSQL query planner's internals
PostgreSQL query planner's internalsPostgreSQL query planner's internals
PostgreSQL query planner's internalsAlexey Ermakov
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in ImpalaCloudera, Inc.
 
Merge sort analysis and its real time applications
Merge sort analysis and its real time applicationsMerge sort analysis and its real time applications
Merge sort analysis and its real time applicationsyazad dumasia
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101Federico Razzoli
 
Press the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docxPress the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docxChantellPantoja184
 
5 Cool Things About PLSQL
5 Cool Things About PLSQL5 Cool Things About PLSQL
5 Cool Things About PLSQLConnor McDonald
 
Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...Alexander Decker
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...Alexander Decker
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...Alexander Decker
 
Electrical Engineering Exam Help
Electrical Engineering Exam HelpElectrical Engineering Exam Help
Electrical Engineering Exam HelpLive Exam Helper
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysisAkshay Dagar
 
Rtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffsRtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffsGrace Abraham
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansFranck Pachot
 
How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012Connor McDonald
 

Similaire à MariaDB's join optimizer: how it works and current fixes (20)

Algorithim lec1.pptx
Algorithim lec1.pptxAlgorithim lec1.pptx
Algorithim lec1.pptx
 
Advanced query optimization
Advanced query optimizationAdvanced query optimization
Advanced query optimization
 
Basics in algorithms and data structure
Basics in algorithms and data structure Basics in algorithms and data structure
Basics in algorithms and data structure
 
PostgreSQL query planner's internals
PostgreSQL query planner's internalsPostgreSQL query planner's internals
PostgreSQL query planner's internals
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in Impala
 
Merge sort analysis and its real time applications
Merge sort analysis and its real time applicationsMerge sort analysis and its real time applications
Merge sort analysis and its real time applications
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101
 
Data Structures 6
Data Structures 6Data Structures 6
Data Structures 6
 
Flowshop scheduling
Flowshop schedulingFlowshop scheduling
Flowshop scheduling
 
Press the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docxPress the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docx
 
5 Cool Things About PLSQL
5 Cool Things About PLSQL5 Cool Things About PLSQL
5 Cool Things About PLSQL
 
Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...
 
Electrical Engineering Exam Help
Electrical Engineering Exam HelpElectrical Engineering Exam Help
Electrical Engineering Exam Help
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
 
Rtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffsRtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffs
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive Plans
 
How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012
 
Unit i
Unit iUnit i
Unit i
 

Plus de Sergey Petrunya

New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12Sergey Petrunya
 
Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8Sergey Petrunya
 
Improving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimatesImproving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimatesSergey Petrunya
 
JSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger pictureJSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger pictureSergey Petrunya
 
Optimizer Trace Walkthrough
Optimizer Trace WalkthroughOptimizer Trace Walkthrough
Optimizer Trace WalkthroughSergey Petrunya
 
Optimizer features in recent releases of other databases
Optimizer features in recent releases of other databasesOptimizer features in recent releases of other databases
Optimizer features in recent releases of other databasesSergey Petrunya
 
MariaDB 10.4 - что нового
MariaDB 10.4 - что новогоMariaDB 10.4 - что нового
MariaDB 10.4 - что новогоSergey Petrunya
 
Using histograms to get better performance
Using histograms to get better performanceUsing histograms to get better performance
Using histograms to get better performanceSergey Petrunya
 
MariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit holeMariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit holeSergey Petrunya
 
Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4Sergey Petrunya
 
MariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it standMariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it standSergey Petrunya
 
MyRocks in MariaDB | M18
MyRocks in MariaDB | M18MyRocks in MariaDB | M18
MyRocks in MariaDB | M18Sergey Petrunya
 
New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3Sergey Petrunya
 
Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2Sergey Petrunya
 
MyRocks in MariaDB: why and how
MyRocks in MariaDB: why and howMyRocks in MariaDB: why and how
MyRocks in MariaDB: why and howSergey Petrunya
 
Эволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDBЭволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDBSergey Petrunya
 
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)Sergey Petrunya
 
MariaDB 10.1 - что нового.
MariaDB 10.1 - что нового.MariaDB 10.1 - что нового.
MariaDB 10.1 - что нового.Sergey Petrunya
 

Plus de Sergey Petrunya (20)

New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12
 
Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8
 
Improving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimatesImproving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimates
 
JSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger pictureJSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger picture
 
Optimizer Trace Walkthrough
Optimizer Trace WalkthroughOptimizer Trace Walkthrough
Optimizer Trace Walkthrough
 
Optimizer features in recent releases of other databases
Optimizer features in recent releases of other databasesOptimizer features in recent releases of other databases
Optimizer features in recent releases of other databases
 
MariaDB 10.4 - что нового
MariaDB 10.4 - что новогоMariaDB 10.4 - что нового
MariaDB 10.4 - что нового
 
Using histograms to get better performance
Using histograms to get better performanceUsing histograms to get better performance
Using histograms to get better performance
 
MariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit holeMariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit hole
 
Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4
 
MariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it standMariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it stand
 
MyRocks in MariaDB | M18
MyRocks in MariaDB | M18MyRocks in MariaDB | M18
MyRocks in MariaDB | M18
 
New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3
 
MyRocks in MariaDB
MyRocks in MariaDBMyRocks in MariaDB
MyRocks in MariaDB
 
Say Hello to MyRocks
Say Hello to MyRocksSay Hello to MyRocks
Say Hello to MyRocks
 
Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2
 
MyRocks in MariaDB: why and how
MyRocks in MariaDB: why and howMyRocks in MariaDB: why and how
MyRocks in MariaDB: why and how
 
Эволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDBЭволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDB
 
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
 
MariaDB 10.1 - что нового.
MariaDB 10.1 - что нового.MariaDB 10.1 - что нового.
MariaDB 10.1 - что нового.
 

Dernier

Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 


MariaDB's join optimizer: how it works and current fixes

  • 1. Join Optimizer
      1. How it works
      2. What we’re working on to improve it
      Sergei Petrunia, MariaDB (MariaDB devroom, FOSDEM 2021; Optimizer Call, July 2022)
  • 2. Join order search
      ● The total number of possible join orders for an N-table join is N * (N-1) * (N-2) * … = N!
      ● Join orders are built left-to-right.
      ● It is not feasible to enumerate all possible join orders.
      [Figure: tree of left-to-right join orders over tables t1, t2, t3, t4]
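To give a feel for the N! growth mentioned on this slide, here is a minimal Python sketch. The function name `join_order_count` is made up for illustration; it only counts permutations and ignores any restrictions a real optimizer would apply.

```python
import math


def join_order_count(n_tables: int) -> int:
    """Number of left-to-right join orders for n_tables tables, i.e. N!."""
    return math.factorial(n_tables)


# Print the count for a few table counts to show how quickly it grows.
for n in (4, 10, 20):
    print(f"{n} tables -> {join_order_count(n)} join orders")
```

Four tables give 24 orders, which is the size of the tree in the figure; ten tables already give more than three million.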
  • 3. Pruning
      ● Enumerate promising join orders first.
      ● Do not explore join orders that are apparently worse.
  • 4. Pruning #1: by cost
      ● Prune when the cost of the current_prefix is already higher than the total cost of the best plan found so far.
          – Adding tables will only make it higher.
          – There is no point in trying.
      ● This pruning is always done (there is no switch).
      ● Optimizer trace: pruned_by_cost
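The cost-based pruning rule can be sketched as a small branch-and-bound search. This is an illustration only, not MariaDB's code: the `TABLES` statistics and the cost model (cost of adding a table = incoming rows × per-row read cost) are assumptions made up for the example.

```python
import math
from itertools import permutations
from typing import Dict, List, Tuple

# Hypothetical statistics: table -> (read cost per incoming row, fanout).
TABLES: Dict[str, Tuple[float, float]] = {
    "t1": (1.0, 10.0),
    "t2": (5.0, 2.0),
    "t3": (2.0, 1.0),
    "t4": (20.0, 100.0),
}


def plan_cost(order) -> float:
    """Cost of a complete join order under the toy cost model."""
    cost, rows = 0.0, 1.0
    for t in order:
        read_cost, fanout = TABLES[t]
        cost += rows * read_cost
        rows *= fanout
    return cost


def search(prefix: List[str], cost: float, rows: float, best: dict, stats: dict) -> None:
    """Extend the prefix left-to-right, skipping extensions that already cost too much."""
    remaining = [t for t in TABLES if t not in prefix]
    if not remaining:
        # We only get here if every step stayed below the best known cost.
        best["cost"], best["order"] = cost, list(prefix)
        return
    for t in remaining:
        read_cost, fanout = TABLES[t]
        new_cost = cost + rows * read_cost
        if new_cost >= best["cost"]:
            stats["pruned_by_cost"] += 1  # adding more tables can only cost more
            continue
        search(prefix + [t], new_cost, rows * fanout, best, stats)


best = {"cost": math.inf, "order": None}
stats = {"pruned_by_cost": 0}
search([], 0.0, 1.0, best, stats)
print(best, stats)
```

The pruned search returns the same cost as a brute-force scan over all permutations, because a prefix is only dropped once it is at least as expensive as a complete plan that has already been found.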
  • 5. pruned_by_cost weaknesses
      ● Consider a really expensive table at the end of the join order.
      ● Any prefix that doesn’t include it is relatively cheap,
          – even if it’s comparably worse.
      ● => No pruning.
  • 6. Pruning #2: by heuristic
      ● Adding a table tX to a join prefix:
          – Adds read_time (the time to read tX).
          – Produces record_count row combinations to be joined with further tables (aka the “join suffix”).
          – Both affect the total cost:
              ● read_time is time spent right now.
              ● record_count will affect the cost of the join suffix.
          – We don’t know the “exchange ratio” because we don’t know the cost of the “join suffix”.
      [Figure: join prefix t0 (incoming_record_count) extended by t1, t2 or t3; labels record_count_t1, record_count_t2, record_count_t3, read_time_t1]
  • 7. The idea behind the heuristic
      – … we don’t know the “exchange ratio” because we don’t know the cost of the “join suffix”.
          ● Also, the suffixes are different!
      – Let’s assume the suffixes have similar costs.
      – Then, if
          ● read_time_t1 < read_time_t2, AND
          ● record_count_t1 < record_count_t2
      – then t1 “is better” than t2,
      – and we can prune away t2.
  • 8. Applying heuristic pruning
      ● Do it locally, in each join prefix.
      ● Consider the more promising tables first.
      ● Consider the less promising tables second,
          – and try to prune them away.
  • 9. Pruning #2: by heuristic
      ● A Model Table (yes, I’ve just invented this term):
          – Has the lowest read_time AND record_count seen so far.
          – Either
              ● record_count < 2.0, or
              ● there are no possible “key dependencies” on tables not in the prefix.
          – A “possible key dependency” is an equality of the form: tbl.keyXpartY=expr(tables_not_in_prefix)
              ● ^^ this is a “heuristic” to apply the heuristic.
      ● Prune away tables that have both a worse read_time and a worse record_count than the Model Table.
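The Model Table rule can be sketched roughly as follows. This is a simplified reading of the slide, not the server's implementation: the `Candidate` class, the `unbound_key_dependency` flag and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    table: str
    read_time: float
    record_count: float
    # Hypothetical flag: True if the table has a possible key dependency
    # on a table that is not in the join prefix yet.
    unbound_key_dependency: bool = False


def prune_by_heuristic(candidates: List[Candidate]) -> Tuple[List[str], List[str]]:
    """Walk candidates in the given order; return (kept, pruned) table names."""
    model: Optional[Candidate] = None
    kept, pruned = [], []
    for c in candidates:
        if (model is not None
                and c.read_time > model.read_time
                and c.record_count > model.record_count):
            pruned.append(c.table)  # worse than the Model Table on both measures
            continue
        kept.append(c.table)
        eligible = c.record_count < 2.0 or not c.unbound_key_dependency
        improves = model is None or (
            c.read_time <= model.read_time and c.record_count <= model.record_count)
        if eligible and improves:
            model = c  # lowest read_time AND record_count seen so far
    return kept, pruned


candidates = [
    Candidate("t1", read_time=10, record_count=100),
    Candidate("t2", read_time=20, record_count=500),   # worse than t1 on both
    Candidate("t3", read_time=5, record_count=1000),   # cheaper to read, more rows
    Candidate("t4", read_time=50, record_count=50),    # costlier to read, fewer rows
]
kept, pruned = prune_by_heuristic(candidates)
print(kept, pruned)
```

Only t2 is pruned here: t3 and t4 each beat t1 on one of the two measures, so the heuristic cannot rank them against it.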
  • 10. How one can see heuristic pruning
      ● @@optimizer_prune_level
          – 0 – not enabled.
          – 1 – enabled (the default).
      ● Optimizer trace: grep for “pruned_by_heuristic”
  • 12. Greedy search
      ● Consider only prefixes of limited size.
          – Based on that, pick the first table.
          – Repeat.
      ● @@optimizer_search_depth
          – Default: 62 (both MySQL and MariaDB).
          – 0 – “pick depth automatically”.
      ● Why is this not default yet?
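A minimal sketch of the greedy idea, under an invented toy cost model: look ahead at most `depth` tables, commit to the first table of the cheapest lookahead, then repeat. The `TABLES` statistics and all function names are assumptions for illustration; the real search also applies the pruning described earlier.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# Hypothetical statistics: table -> (read cost per incoming row, fanout).
TABLES: Dict[str, Tuple[float, float]] = {
    "t1": (1.0, 10.0),
    "t2": (5.0, 2.0),
    "t3": (2.0, 1.0),
    "t4": (20.0, 100.0),
}


def extend(cost: float, rows: float, table: str) -> Tuple[float, float]:
    """Cost and row count after appending `table` to the current prefix."""
    read_cost, fanout = TABLES[table]
    return cost + rows * read_cost, rows * fanout


def plan_cost(order) -> float:
    cost, rows = 0.0, 1.0
    for t in order:
        cost, rows = extend(cost, rows, t)
    return cost


def lookahead(cost: float, rows: float, remaining: List[str], depth: int) -> float:
    """Cheapest cost reachable by adding at most `depth` more tables."""
    if depth == 0 or not remaining:
        return cost
    best = float("inf")
    for t in remaining:
        c, r = extend(cost, rows, t)
        best = min(best, lookahead(c, r, [x for x in remaining if x != t], depth - 1))
    return best


def greedy_search(depth: int) -> Tuple[List[str], float]:
    """Pick one table at a time, judging each by prefixes of at most `depth` tables."""
    order: List[str] = []
    cost, rows = 0.0, 1.0
    remaining = list(TABLES)
    while remaining:
        def estimate(t: str) -> float:
            c, r = extend(cost, rows, t)
            return lookahead(c, r, [x for x in remaining if x != t], depth - 1)
        first = min(remaining, key=estimate)
        cost, rows = extend(cost, rows, first)
        order.append(first)
        remaining.remove(first)
    return order, cost


print(greedy_search(depth=1))
print(greedy_search(depth=len(TABLES)))
```

With a depth that covers all tables the result matches an exhaustive search; with depth 1 the search takes the locally cheapest table each time and, for these made-up numbers, ends up with a more expensive plan.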
  • 14. MDEV-28073: patch #1: “edge tables”
      ● If the suffix t1-t4-t3 uses only eq_ref or similar access:
          – It is [nearly] the best.
          – Don’t enumerate other table combinations.
              ● They can’t be much better.
      ● Optimizer trace: pruned_by_hanging_leaf
          commit b729896d00e022f6205399376c0cc107e1ee0704
          Author: Monty <monty@mariadb.org>
          Date: Tue May 10 11:47:20 2022 +0300
          MDEV-28073 Query performance degradation in newer MariaDB versions when using many tables
          The issue was that best_extension_by_limited_search() had to go through too many plans with the same cost as there where many EQ_REF tables.
          Fixed by shortcutting EQ_REF (AND REF) when the result only contains one row.
          This got the optimization time down from hours to sub seconds.
  • 15. MDEV-28073: patch #2: key_dependent
      select ... from person, car_rides, bicycle_rides where person.name=car_rides.rider and person.name=bicycle_rides.rider and ...
      [Figure: tables car_rides, bicycle_rides and person, connected through name]
      ● Remember the “heuristic to apply the heuristic” a few slides above:
          – there are no possible “key dependencies” on tables not in the prefix.
      ● It can be false due to multi-equalities:
          – person.name=bicycle_rides.rider is a “possible key dependency”.
          – But we already have person.name from car_rides.rider (the equality is “bound”).
          – Trying join orders with bicycle_rides before person won’t produce a better plan.
      ● Solution: adjust the heuristic to “there are no possible key dependencies on tables not in the prefix that are not already bound”.
  • 16. MDEV-28073: patch #3: table order de-scrambling
      ● The optimizer should try good tables first.
      ● This is implemented by taking tables off the unused portion of the join->best_ref array.
          – Initially it is ordered (“promising” tables first),
          – but due to a bug it eventually gets out of order.
      ● Plan searches that enumerate many options could suffer from poor pruning towards the end.
          Author: Michael Widenius <monty@mariadb.org>
          Date: Sun May 15 15:46:29 2022 +0300
          greedy_search() and best_extension_by_limited_search() scrambled table order
          best_extension_by_limited_search() assumes that tables should be sorted according to size to be able to quickly disregard bad plans. However the current usage of swap_variables() will change the table order to a not sorted one for the next recursive call. This breaks the assumtion and causes performance issues when using many tables (we have to examine many more plans).
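The scrambling effect described in the commit message can be shown with a plain list. The sketch below is not the server code: it compares picking a table by swapping it into the current slot with picking it by moving it there and shifting the rest. Both helper names are invented, and the actual patch may restore the order differently.

```python
from typing import List


def remaining_after_swap(tables: List[str], idx: int, pos: int) -> List[str]:
    """Choose tables[pos] for slot idx by swapping; return the unused remainder."""
    t = list(tables)
    t[idx], t[pos] = t[pos], t[idx]
    return t[idx + 1:]


def remaining_after_move(tables: List[str], idx: int, pos: int) -> List[str]:
    """Choose tables[pos] for slot idx by moving it there and shifting the others."""
    t = list(tables)
    t.insert(idx, t.pop(pos))
    return t[idx + 1:]


# Tables sorted "most promising first" (here simply by name).
ordered = ["t1", "t2", "t3", "t4"]

# Choose t3 as the first table of the join order.
print(remaining_after_swap(ordered, 0, 2))  # t1 now sits behind t2
print(remaining_after_move(ordered, 0, 2))  # remainder keeps its order
```

After the swap, the next recursion level sees t2 before t1, so the assumption that promising tables come first no longer holds for it.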
  • 19. In which order do we try the tables?
      ● Current:
          – join_tab_cmp() orders all tables by their JOIN_TAB::found_records (records after the table’s condition is checked).
          – The same ordering is used everywhere.
          – This *ignores* the join prefix and the efficient table read plans we can use,
          – e.g. here it ignores the prefix of t1.
  • 20. In which order do we try the tables?
      ● First, evaluate the possible table accesses for {t2,t3,t4}.
      ● Sort them by #found_rows.
      ● Then try extending join orders.
          – Do all kinds of pruning while doing this.
          commit 0762dd9283185c72c6955f44fc4d862a0a928569
          Author: Monty <monty@mariadb.org>
          Date: Tue May 31 17:36:32 2022 +0300
          Improve pruning in greedy_search by sorting tables during search
          MDEV-28073 Slow query performance in MariaDB when using many tables
          The faster we can find a good query plan, the more options we have for finding and pruning (ignoring) bad plans.
          This patch adds sorting of plans to best_extension_by_limited_search().
  • 22. Remember: the idea behind the heuristic
      – … we don’t know the “exchange ratio” because we don’t know the cost of the “join suffix”.
          ● Also, the suffixes are different!
      – Let’s assume the suffixes have similar costs.
      – Then, if
          ● read_time_t1 < read_time_t2, AND
          ● record_count_t1 < record_count_t2
      – then t1 “is better” than t2,
      – and we can prune away t2.
  • 23. Let’s plot the tables
      [Plot axes: records_read, read_time; point t1]
  • 24. Let’s plot the tables
      [Plot axes: records_read, read_time; point t1, with regions labelled “Better than t1” and “Worse than t1”]
  • 25. Let’s plot the tables
      ● “Typical” situation: plans with a higher cost produce more #rows.
          – A lot of opportunities to do pruning.
      ● The optimizer orders plans by records_read
          – (that is, goes left-to-right).
          – Pick the first plan as the “Model”, prune those that are worse.
      [Plot axes: records_read, read_time; points t0 to t4]
  • 26. When pruning doesn’t work
      ● “Bad” situation:
          – plans with a high cost produce few rows,
          – and vice versa.
      ● Can’t do pruning.
      [Plot axes: records_read, read_time; points t0 to t4]
  • 27. When pruning could work but doesn’t
      ● Walking left-to-right, the optimizer picks t0 as the Model table.
      ● And then it can’t prune away any other table.
      [Plot axes: records_read, read_time; points t0 to t4; annotation: “Tables that are worse than t0 are here”]
  • 28. How to do as much pruning as possible?
      ● Pick a minimal set of Model tables that allows pruning away the rest?
      ● Complexity seems to be at least N^2.
      ● Some approximate algorithm?
          – Use the table with min_cost.
          – Use the table with min_records_read.
      ● We have a patch with an approximate implementation.
      [Plot axes: records_read, read_time]
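The difference between a single left-to-right Model table and the two approximate choices listed above (min_cost and min_records_read) can be illustrated with made-up points. This is not the patch mentioned on the slide; `POINTS`, `dominated` and `prune` are hypothetical names, and the numbers are chosen to reproduce the “could work but doesn’t” situation.

```python
from typing import Dict, List, Tuple

# Hypothetical candidates: table -> (records_read, read_time).
POINTS: Dict[str, Tuple[float, float]] = {
    "t0": (1.0, 100.0),  # fewest rows, but expensive to read
    "t1": (2.0, 10.0),
    "t2": (5.0, 20.0),
    "t3": (8.0, 30.0),
    "t4": (10.0, 40.0),
}


def dominated(p: Tuple[float, float], model: Tuple[float, float]) -> bool:
    """True if p is worse than the model on both records_read and read_time."""
    return p[0] > model[0] and p[1] > model[1]


def prune(models: List[str]) -> List[str]:
    """Tables that are worse on both measures than at least one Model table."""
    return sorted(
        name for name, p in POINTS.items()
        if name not in models and any(dominated(p, POINTS[m]) for m in models)
    )


# Left-to-right: the table with the fewest records_read becomes the only Model.
by_records = min(POINTS, key=lambda t: POINTS[t][0])
print(by_records, prune([by_records]))

# Approximation: use both the min_records_read table and the min_cost table.
by_cost = min(POINTS, key=lambda t: POINTS[t][1])
print(by_records, by_cost, prune([by_records, by_cost]))
```

With t0 as the only Model nothing can be pruned, while adding the cheapest-to-read table t1 as a second Model prunes t2, t3 and t4.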
  • 30. Motivation
      ● Tables with “attributes” that are joined using the Primary Key:
          select * from base_table, attr1, attr2, ... attrN where attr1.pk = base_table.pk and attr2.pk = base_table.pk and ... attrN.pk = base_table.pk
      ● Lots of nearly identical query plans: there are factorial(n_attributes) permutations.
          – They have the same or very close cost.
      ● => Can’t do pruning.
      ● The fix with “edge tables” (aka pruned_by_hanging_leaf) helps, but only if the attributes are at the end of the join order.
  • 31. eq_ref chaining
      ● The idea: if we see an eq_ref access, try considering only eq_refs for as long as we can.
      ● MySQL 5.7 has a similar optimization.
          – TODO: describe the differences.
          commit 5abb6bff6cfb5cb5d87520f1e32e9b41db46bd7b
          Author: Monty <monty@mariadb.org>
          Date: Thu Jun 2 19:47:23 2022 +0300
          Added EQ_REF chaining to the greedy_optimizer
          MDEV-28073 Slow query performance in MariaDB when using many table
          The idea is to prefer and chain EQ_REF tables (tables that uses an unique key to find a row) when searching for the best table combination. This significantly reduces row combinations that has to be examined.
          This is optimization is enabled when setting optimizer_prune_level=2 (default)
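To see why chaining shrinks the search space, the sketch below counts the join orders enumerated for the attribute-table query from the Motivation slide. It rests on two simplifying assumptions that are mine, not the patch's: every table can be read via eq_ref as soon as the prefix is non-empty (all primary keys are equated), and chaining follows only the first eq_ref candidate at each step.

```python
from typing import List


def is_eq_ref(table: str, prefix: List[str]) -> bool:
    """Assumption: any table is reachable by eq_ref once some table is in the prefix."""
    return len(prefix) > 0


def count_join_orders(prefix: List[str], remaining: List[str], chain_eq_ref: bool) -> int:
    """Count the complete join orders the enumeration would visit."""
    if not remaining:
        return 1
    candidates = remaining
    if chain_eq_ref:
        eq_ref = [t for t in remaining if is_eq_ref(t, prefix)]
        if eq_ref:
            candidates = eq_ref[:1]  # follow a single eq_ref chain
    return sum(
        count_join_orders(prefix + [t], [x for x in remaining if x != t], chain_eq_ref)
        for t in candidates
    )


tables = ["base_table"] + [f"attr{i}" for i in range(1, 6)]
print(count_join_orders([], tables, chain_eq_ref=False))
print(count_join_orders([], tables, chain_eq_ref=True))
```

For six tables the unrestricted enumeration visits 720 orders, while the chained one visits 6: one per choice of first table.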
  • 32. Thanks for your attention!