Query Planning Gone Wrong
Robert Haas
Chief Architect, Database Server
Why This Talk?
● 2010: The PostgreSQL Query Planner (Robert Haas)
  ● How does the query planner actually work from a user perspective? What does it really do?
  ● Very common audience question: What do I do when the query planner fails? How do I fix my query?
● 2011: Hacking the Query Planner (Tom Lane)
  ● How does the query planner actually work from a developer perspective? What does it *really* do?
  ● Plea for help to improve the query planner.
● But... what should we be improving?
Methodology: Which Problems Matter?
● Read hundreds of email threads on pgsql-performance over a
period of almost two years.
● Disregarded all those that were not about query performance
problems.
● Decided what I thought the root cause (or, occasionally, causes) of
each complaint was.
● Skipped a very small number where I couldn't form an opinion.
● Counted the number of times each problem was reported.
Methodology: Possible Critiques
● The problems reported on pgsql-performance aren't necessarily
representative of all the problems PostgreSQL users encounter
(reporting bias).
● In particular, confusing problems might be more likely to be
reported.
● I might not have correctly identified the cause of each problem
(researcher bias).
● Others?
Statistically Speaking, Why Is My Query Slow? (168)
● Settings (23). Includes anything you can fix with postgresql.conf
changes, DDL, or operating system settings changes.
● Just Plain Slow (23). Includes anything that amounts to an
unreasonable expectation on the part of the user. These are often
questions of the form “why is query A slower than query B?” when
A is actually doing something much more expensive than B.
● We're Bad At That (22). Includes anything that could be faster in
some other database product, but isn't fast in PostgreSQL for
some reason (not implemented yet, or architectural artifact).
● Planner Error (83). Bad decisions about the cost of one plan vs.
another plan due to limitations of the optimizer.
● Bugs (14). Bugs in the query planner, or in one case, the Linux
kernel.
● User Error (3). User got confused and did something illogical.
Settings (23)
● Planner Cost Constants (8). Adjustments needed to
seq_page_cost, random_page_cost, and perhaps cpu_tuple_cost
to accurately model real costs.
● Missing Index (4)
● Cost for @@ Operator Is Too Low (2)
● work_mem Too Low (2)
● Statistics Target Too Low (2)
● Statistics Target Too High (1)
● n_distinct Estimates Aren't Accurate On Large Tables (1)
● Not Analyzing Tables Often Enough (1)
● TOAST Decompression is Slow (1)
● vm.zone_reclaim_mode = 1 Causes Extra Disk I/O (1)
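To illustrate why the planner cost constants matter, here is a minimal sketch of a toy cost model. It is not PostgreSQL's actual costing code: the two functions and the table sizes are hypothetical, and only the default values (seq_page_cost = 1.0, random_page_cost = 4.0, cpu_tuple_cost = 0.01) are PostgreSQL's documented defaults. It shows how lowering random_page_cost, as one might on a mostly cached database, can flip the preferred plan.

```python
# Toy cost model, NOT PostgreSQL's real formulas: each page read is charged
# a per-page constant and each row processed is charged cpu_tuple_cost.

def seq_scan_cost(pages, rows, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    """Cost of reading every page of the table sequentially."""
    return pages * seq_page_cost + rows * cpu_tuple_cost

def random_fetch_cost(pages_fetched, rows_fetched,
                      random_page_cost=4.0, cpu_tuple_cost=0.01):
    """Cost of fetching a subset of pages in random order (index-style access)."""
    return pages_fetched * random_page_cost + rows_fetched * cpu_tuple_cost

# Hypothetical table and query: the index path touches 6,000 of 10,000 pages.
TABLE_PAGES, TABLE_ROWS = 10_000, 1_000_000
FETCHED_PAGES, FETCHED_ROWS = 6_000, 60_000

full_scan = seq_scan_cost(TABLE_PAGES, TABLE_ROWS)
index_default = random_fetch_cost(FETCHED_PAGES, FETCHED_ROWS)
index_cached = random_fetch_cost(FETCHED_PAGES, FETCHED_ROWS,
                                 random_page_cost=1.1)

# With the default constants the sequential scan looks cheaper; with a lower
# random_page_cost the random-access path wins.
prefers_seq_by_default = full_scan < index_default
prefers_index_when_tuned = index_cached < full_scan
```

If the constants do not reflect the real relative cost of sequential and random reads on a given system, the comparison comes out the wrong way even when the row counts are estimated perfectly.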
Just Plain Slow (23)
● It Takes a While to Process a Lot of Data (6)
● Disks Are Slower Than Memory (6)
● Clauses Involving Multiple Tables Can't Be Pushed Down (2)
● Random I/O is Slower Than Sequential I/O (1)
● Linearly Scanning an Array is O(n) (1)
● One Regular Expression is Faster Than Two (1)
● Can't Figure Out Which Patterns Match a String Without Trying
Them All (1)
● xmlagg Is Much Slower Than string_agg (1)
● Scanning More Tables is Slower Than Scanning Fewer Tables (1)
● Replanning Isn't Free (1)
● Repeated Concatenation Using xmlconcat Is Slow (1)
● UNION is Slower than UNION ALL (1)
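As a small illustration of the last item, the sketch below runs both forms on an in-memory SQLite database, used here only as a convenient stand-in for PostgreSQL; the tables t1 and t2 are hypothetical. UNION has to remove duplicates from the combined result, which is extra work that UNION ALL skips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (x INTEGER)")
conn.execute("CREATE TABLE t2 (x INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (2,)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,)])

# UNION ALL simply concatenates the two inputs.
union_all_rows = conn.execute(
    "SELECT x FROM t1 UNION ALL SELECT x FROM t2"
).fetchall()

# UNION must also eliminate duplicate rows before returning the result.
union_rows = conn.execute(
    "SELECT x FROM t1 UNION SELECT x FROM t2"
).fetchall()

print(len(union_all_rows), len(union_rows))
```

When the inputs are known not to overlap, or duplicates are acceptable, UNION ALL returns the same useful data without the de-duplication step.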
We're Bad At That (22)
● Plan Types We Can't Generate (11)
  ● Parameterized Paths (7). Two of these are post-9.2 complaints, involving cases where 9.2 can't parameterize as needed.
  ● Merge Append (3). Fixed in 9.1.
  ● Batched Sort of Data Already Ordered By Leading Columns (1).
● Executor Limitations (3)
  ● Indexing Unordered Data Causes Random I/O (1)
  ● <> is Not Indexable (1)
  ● DISTINCT + HashAggregate Reads All Input Before Emitting Any Results (1). This matters if there is a LIMIT.
● Architecture (8)
  ● No Parallel Query (2), Table Bloat (1), Backend Startup Cost (1), Redundant Updates Are Expensive (1), AFTER Trigger Queue Size (1), On-Disk Size of numeric (1), Autovacuum Not Smart About Inherited Tables (1)
Planner Errors (83)
● Any guesses?
Planner Errors (83)
● Conceptual Errors (28). The planner isn't able to recognize that two different queries are equivalent, so it doesn't even consider the best plan.
● Estimation Errors (55). The planner considers the optimal plan, but rejects it as too expensive.
  ● Row Count Estimation Errors (48). The planner mis-estimates the number of rows that will be returned by some scan, join, or aggregate.
  ● Cost Estimation Errors (7). The planner estimates the row count correctly but incorrectly estimates the relative cost.
Grand Prize Winners
● Selectivity of filter conditions involving correlated columns is estimated inaccurately (13)
  ● Suppose we want all the rows from a table where a = 1 and b = 1 and c = 1 and d = 1 and e = 1. The planner must estimate the number of rows that will match, but only has statistics on each column individually.
● Planner incorrectly thinks that “SELECT * FROM foo WHERE a = 1 ORDER BY b LIMIT n” will fill the limit after reading a small percentage of the index (11)
  ● It can scan an index on b and filter for rows where a = 1.
  ● Or it can scan an index on a, find all rows where a = 1, and perform a top-N sort.
  ● It often prefers the former when the latter would be faster.
  ● Can often be worked around with a composite or functional index.
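Both problems come down to simple arithmetic, sketched below under stated assumptions. The table size, selectivities, and function names are hypothetical, and the formulas are simplified models rather than PostgreSQL's estimator. The first part multiplies per-column selectivities as if the conditions were independent. The second compares how many index entries an ordered scan reads when matches are assumed to be spread evenly with the case where they all sort last.

```python
from math import prod

# Hypothetical table: 1,000,000 rows in which columns a..e always hold the
# same value, and 10% of the rows hold the value 1 in every column.
TOTAL_ROWS = 1_000_000
PER_COLUMN_SELECTIVITY = [0.1] * 5   # each of a = 1, b = 1, ... matches 10%

def independent_estimate(total_rows, selectivities):
    """Row estimate when every condition is treated as independent."""
    return total_rows * prod(selectivities)

estimated_rows = independent_estimate(TOTAL_ROWS, PER_COLUMN_SELECTIVITY)
actual_rows = TOTAL_ROWS // 10       # perfectly correlated: still 10%

def rows_read_if_uniform(total_rows, matching_rows, limit):
    """Index entries read to find `limit` matches spread evenly through the index."""
    return limit * total_rows / matching_rows

def rows_read_if_matches_last(total_rows, matching_rows, limit):
    """Index entries read when every match sorts after all non-matching rows."""
    return total_rows - matching_rows + limit

MATCHING_ROWS = 1_000   # hypothetical number of rows with a = 1
LIMIT = 10

optimistic = rows_read_if_uniform(TOTAL_ROWS, MATCHING_ROWS, LIMIT)
pessimistic = rows_read_if_matches_last(TOTAL_ROWS, MATCHING_ROWS, LIMIT)

print(estimated_rows, actual_rows)   # about 10 estimated vs 100000 actual
print(optimistic, pessimistic)       # 10000.0 vs 999010 index entries
```

In the first case the estimate is off by four orders of magnitude. In the second, the scan of the index on b looks cheap only under the even-spread assumption; if a and b are correlated so that matching rows cluster late in the index, nearly the whole index is read.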
Planner Error: Row Count Estimation – Others (24)
● Using WITH Results in a Bad Plan (5). Some of these are query
flattening issues, while others result from failure to dig out variable
statistics.
● Generic Plans Can Have Wildly Wrong Estimates (4). Improved.
● Selectivity Estimates on Arbitrary Expressions are Poor (4)
● Join Selectivity Doesn't Know about Cross-Table Correlations (3)
● Uncommitted Tuples Don't Affect Statistics (2)
● No Stats for WITH RECURSIVE (1) or GROUP BY (1) Results
● Redundant Equality Constraints Not Identified As Such (1)
● IN/NOT IN Estimation Doesn't Assume Array Elements Distinct (1).
Fixed.
● Histogram Bounds Can Slide Due to New Data (1). Fixed.
● Inheritance Parents Aren't Assumed to be Completely Empty (1).
Fixed.
Planner Error: Cost Estimation (7)
● Planner doesn't account for de-TOASTing cost (4)
● Plan change causes volume of data to exceed server memory (2)
● Hash join sometimes decides to hash the larger table when it
should probably be hashing the smaller one (1)
Planner Error: Conceptual (28)
● Cross-data type comparisons are not always indexable (3)
● Inlining the same thing multiple times can lose (3)
● NOT IN is hard to optimize – and we don't try very hard (3)
● Target lists are computed too early or unnecessary targets are
computed (3)
● Can't rewrite SELECT max(a) FROM foo WHERE b IN (…) as max of
index scans (2)
● Can't rearrange joins and aggregates relative to one another (2)
● Can't deduce implied inequalities (2)
● Ten other issues that came up once each
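Two of these items can be made concrete with a short sketch. It uses an in-memory SQLite database purely as a stand-in to check SQL semantics, not PostgreSQL planner behavior; the tables foo and bar and their contents are hypothetical. The first pair of queries shows the manual rewrite of max(a) over an IN list as a maximum of per-value maxima. The second pair shows one well-known reason NOT IN resists optimization: when the subquery returns a NULL, NOT IN and NOT EXISTS give different answers, so one cannot simply be substituted for the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [(5, 1), (9, 1), (7, 2), (3, 3), (8, 4)],
)

# Original form from the slide, with a concrete IN list.
original_max = conn.execute(
    "SELECT max(a) FROM foo WHERE b IN (1, 2, 3)"
).fetchone()[0]

# Manual rewrite: one max() per listed value, then the max of those.
# Each branch could be answered from an index on (b, a).
rewritten_max = conn.execute("""
    SELECT max(m) FROM (
        SELECT max(a) AS m FROM foo WHERE b = 1
        UNION ALL
        SELECT max(a) FROM foo WHERE b = 2
        UNION ALL
        SELECT max(a) FROM foo WHERE b = 3
    )
""").fetchone()[0]

# NOT IN versus NOT EXISTS when the subquery yields a NULL.
conn.execute("CREATE TABLE bar (b INTEGER)")
conn.executemany("INSERT INTO bar VALUES (?)", [(1,), (None,)])

not_in_count = conn.execute(
    "SELECT count(*) FROM foo WHERE b NOT IN (SELECT b FROM bar)"
).fetchone()[0]

not_exists_count = conn.execute(
    "SELECT count(*) FROM foo "
    "WHERE NOT EXISTS (SELECT 1 FROM bar WHERE bar.b = foo.b)"
).fetchone()[0]

print(original_max, rewritten_max)     # both forms return the same maximum
print(not_in_count, not_exists_count)  # the NULL makes NOT IN return no rows
```

Where the subquery column cannot be NULL, rewriting NOT IN as NOT EXISTS by hand is a common workaround, since the latter can be planned as an anti-join.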
Thank You
● Any questions?

Contenu connexe

Tendances

Query processing and Query Optimization
Query processing and Query OptimizationQuery processing and Query Optimization
Query processing and Query OptimizationNiraj Gandha
 
Flowchart design for algorithms
Flowchart design for algorithmsFlowchart design for algorithms
Flowchart design for algorithmsKuppusamy P
 
ADS Introduction
ADS IntroductionADS Introduction
ADS IntroductionNagendraK18
 
COMPUTER PROGRAMMING UNIT 1 Lecture 4
COMPUTER PROGRAMMING UNIT 1 Lecture 4COMPUTER PROGRAMMING UNIT 1 Lecture 4
COMPUTER PROGRAMMING UNIT 1 Lecture 4Vishal Patil
 
Randomized Algorithms
Randomized AlgorithmsRandomized Algorithms
Randomized AlgorithmsKetan Kamra
 
Topic 1.4: Randomized Algorithms
Topic 1.4: Randomized AlgorithmsTopic 1.4: Randomized Algorithms
Topic 1.4: Randomized AlgorithmsKM Bappi
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingAdam Doyle
 
Discrete event simulation
Discrete event simulationDiscrete event simulation
Discrete event simulationssusera970cc
 
facility layout paper
 facility layout paper facility layout paper
facility layout paperSaurabh Tiwary
 
CIS110 Computer Programming Design Chapter (4)
CIS110 Computer Programming Design Chapter  (4)CIS110 Computer Programming Design Chapter  (4)
CIS110 Computer Programming Design Chapter (4)Dr. Ahmed Al Zaidy
 
connecting discrete mathematics and software engineering
connecting discrete mathematics and software engineeringconnecting discrete mathematics and software engineering
connecting discrete mathematics and software engineeringRam Kumar K R
 
Flow Chart @ppsc(2)
Flow Chart @ppsc(2)Flow Chart @ppsc(2)
Flow Chart @ppsc(2)Amiya Bhusan
 
Algorithm
AlgorithmAlgorithm
Algorithmogline
 
Flow control in computer
Flow control in computerFlow control in computer
Flow control in computerrud_d_rcks
 
SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...
SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...
SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...SERENEWorkshop
 

Tendances (20)

Query processing and Query Optimization
Query processing and Query OptimizationQuery processing and Query Optimization
Query processing and Query Optimization
 
Abraham march07
Abraham march07Abraham march07
Abraham march07
 
Flowchart design for algorithms
Flowchart design for algorithmsFlowchart design for algorithms
Flowchart design for algorithms
 
ADS Introduction
ADS IntroductionADS Introduction
ADS Introduction
 
COMPUTER PROGRAMMING UNIT 1 Lecture 4
COMPUTER PROGRAMMING UNIT 1 Lecture 4COMPUTER PROGRAMMING UNIT 1 Lecture 4
COMPUTER PROGRAMMING UNIT 1 Lecture 4
 
Randomized Algorithms
Randomized AlgorithmsRandomized Algorithms
Randomized Algorithms
 
Plant Layout Algorithm
Plant Layout AlgorithmPlant Layout Algorithm
Plant Layout Algorithm
 
Topic 1.4: Randomized Algorithms
Topic 1.4: Randomized AlgorithmsTopic 1.4: Randomized Algorithms
Topic 1.4: Randomized Algorithms
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-making
 
Query optimization
Query optimizationQuery optimization
Query optimization
 
Discrete event simulation
Discrete event simulationDiscrete event simulation
Discrete event simulation
 
facility layout paper
 facility layout paper facility layout paper
facility layout paper
 
CIS110 Computer Programming Design Chapter (4)
CIS110 Computer Programming Design Chapter  (4)CIS110 Computer Programming Design Chapter  (4)
CIS110 Computer Programming Design Chapter (4)
 
Algorithm
AlgorithmAlgorithm
Algorithm
 
connecting discrete mathematics and software engineering
connecting discrete mathematics and software engineeringconnecting discrete mathematics and software engineering
connecting discrete mathematics and software engineering
 
Flow Chart @ppsc(2)
Flow Chart @ppsc(2)Flow Chart @ppsc(2)
Flow Chart @ppsc(2)
 
Algorithm
AlgorithmAlgorithm
Algorithm
 
Flow control in computer
Flow control in computerFlow control in computer
Flow control in computer
 
State chart diagram
State chart diagramState chart diagram
State chart diagram
 
SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...
SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...
SERENE 2014 Workshop: Paper "Modelling Resilience of Data Processing Capabili...
 

En vedette

World Robot Olympiad india 2016 Rap the Scrap! - How to Particapte
World Robot Olympiad india 2016   Rap the Scrap! - How to ParticapteWorld Robot Olympiad india 2016   Rap the Scrap! - How to Particapte
World Robot Olympiad india 2016 Rap the Scrap! - How to ParticapteSudhanshu Sharma
 
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenDavid Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenPostgresOpen
 
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013PostgresOpen
 
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenPostgresOpen
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenPostgresOpen
 
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...PostgresOpen
 
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenPostgresOpen
 
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...PostgresOpen
 
Keith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenKeith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenPostgresOpen
 
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...PostgresOpen
 
Islamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuningIslamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuningUmair Shahid
 
Islamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuningIslamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuningUmair Shahid
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Denish Patel
 
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenSteve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenPostgresOpen
 
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenMichael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenPostgresOpen
 
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To FinishPoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To Finishelliando dias
 
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres OpenKoichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres OpenPostgresOpen
 
Gbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGiuseppe Broccolo
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HAharoonm
 

En vedette (20)

World Robot Olympiad 2017
World Robot Olympiad 2017World Robot Olympiad 2017
World Robot Olympiad 2017
 
World Robot Olympiad india 2016 Rap the Scrap! - How to Particapte
World Robot Olympiad india 2016   Rap the Scrap! - How to ParticapteWorld Robot Olympiad india 2016   Rap the Scrap! - How to Particapte
World Robot Olympiad india 2016 Rap the Scrap! - How to Particapte
 
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenDavid Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
 
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
 
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
 
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
 
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
 
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
 
Keith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenKeith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres Open
 
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
 
Islamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuningIslamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuning
 
Islamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuningIslamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuning
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)
 
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenSteve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
 
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenMichael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
 
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To FinishPoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
 
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres OpenKoichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
 
Gbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfs
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HA
 

Similaire à Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open

Talk PGConf Eu 2013
Talk PGConf Eu 2013Talk PGConf Eu 2013
Talk PGConf Eu 2013Atri Sharma
 
Talk pg conf eu 2013
Talk pg conf eu 2013Talk pg conf eu 2013
Talk pg conf eu 2013Atri Sharma
 
BlaBlaCar Elastic Search Feedback
BlaBlaCar Elastic Search FeedbackBlaBlaCar Elastic Search Feedback
BlaBlaCar Elastic Search Feedbacksinfomicien
 
Big Data processing with Apache Spark
Big Data processing with Apache SparkBig Data processing with Apache Spark
Big Data processing with Apache SparkLucian Neghina
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
FlumeJava: Easy, Efficient Data-Parallel Pipelines
FlumeJava: Easy, Efficient Data-Parallel PipelinesFlumeJava: Easy, Efficient Data-Parallel Pipelines
FlumeJava: Easy, Efficient Data-Parallel PipelinesMiro Cupak
 
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaMachine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaYohanes Nuwara
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...PATHALAMRAJESH
 
SFO15-301: Benchmarking Best Practices 101
SFO15-301: Benchmarking Best Practices 101SFO15-301: Benchmarking Best Practices 101
SFO15-301: Benchmarking Best Practices 101Linaro
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101Federico Razzoli
 
Pregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph ProcessingPregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph ProcessingRiyad Parvez
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentationAhmad El Tawil
 
Hadoop & Spark Performance tuning using Dr. Elephant
Hadoop & Spark Performance tuning using Dr. ElephantHadoop & Spark Performance tuning using Dr. Elephant
Hadoop & Spark Performance tuning using Dr. ElephantAkshay Rai
 
Join Algorithms in MapReduce
Join Algorithms in MapReduceJoin Algorithms in MapReduce
Join Algorithms in MapReduceShrihari Rathod
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartMukesh Singh
 
Advanced memory allocation
Advanced memory allocationAdvanced memory allocation
Advanced memory allocationJoris Bonnefoy
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB
 

Similaire à Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open (20)

Talk PGConf Eu 2013
Talk PGConf Eu 2013Talk PGConf Eu 2013
Talk PGConf Eu 2013
 
Talk pg conf eu 2013
Talk pg conf eu 2013Talk pg conf eu 2013
Talk pg conf eu 2013
 
BlaBlaCar Elastic Search Feedback
BlaBlaCar Elastic Search FeedbackBlaBlaCar Elastic Search Feedback
BlaBlaCar Elastic Search Feedback
 
Big Data processing with Apache Spark
Big Data processing with Apache SparkBig Data processing with Apache Spark
Big Data processing with Apache Spark
 
Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Druid
DruidDruid
Druid
 
FlumeJava: Easy, Efficient Data-Parallel Pipelines
FlumeJava: Easy, Efficient Data-Parallel PipelinesFlumeJava: Easy, Efficient Data-Parallel Pipelines
FlumeJava: Easy, Efficient Data-Parallel Pipelines
 
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaMachine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
 
SFO15-301: Benchmarking Best Practices 101
SFO15-301: Benchmarking Best Practices 101SFO15-301: Benchmarking Best Practices 101
SFO15-301: Benchmarking Best Practices 101
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101
 
Pregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph ProcessingPregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph Processing
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
 
Hadoop & Spark Performance tuning using Dr. Elephant
Hadoop & Spark Performance tuning using Dr. ElephantHadoop & Spark Performance tuning using Dr. Elephant
Hadoop & Spark Performance tuning using Dr. Elephant
 
Join Algorithms in MapReduce
Join Algorithms in MapReduceJoin Algorithms in MapReduce
Join Algorithms in MapReduce
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @Lendingkart
 
Advanced memory allocation
Advanced memory allocationAdvanced memory allocation
Advanced memory allocation
 
Map reduce
Map reduceMap reduce
Map reduce
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 

Plus de PostgresOpen

Craig Kerstiens - Scalable Uniques in Postgres @ Postgres Open
Craig Kerstiens - Scalable Uniques in Postgres @ Postgres OpenCraig Kerstiens - Scalable Uniques in Postgres @ Postgres Open
Craig Kerstiens - Scalable Uniques in Postgres @ Postgres OpenPostgresOpen
 
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenPostgresOpen
 
Robert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres Open
Robert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres OpenRobert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres Open
Robert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres OpenPostgresOpen
 
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenMichael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenPostgresOpen
 
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres OpenPostgresOpen
 
Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013
Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013
Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013PostgresOpen
 

Plus de PostgresOpen (6)

Craig Kerstiens - Scalable Uniques in Postgres @ Postgres Open
Craig Kerstiens - Scalable Uniques in Postgres @ Postgres OpenCraig Kerstiens - Scalable Uniques in Postgres @ Postgres Open
Craig Kerstiens - Scalable Uniques in Postgres @ Postgres Open
 
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
 
Robert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres Open
Robert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres OpenRobert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres Open
Robert Bernier - Recovering From A Damaged PostgreSQL Cluster @ Postgres Open
 
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenMichael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
 
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter PostgreSQL Backup and Recovery Methods @ Postgres Open
 
Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013
Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013
Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013
 

Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open

  • 1. Query Planning Gone Wrong
      Robert Haas, Chief Architect, Database Server
  • 2. Why This Talk?
      ● 2010: The PostgreSQL Query Planner (Robert Haas)
          ● How does the query planner actually work from a user perspective? What does it really do?
          ● Very common audience question: What do I do when the query planner fails? How do I fix my query?
      ● 2011: Hacking the Query Planner (Tom Lane)
          ● How does the query planner actually work from a developer perspective? What does it *really* do?
          ● Plea for help to improve the query planner.
      ● But... what should we be improving?
  • 3. Methodology: Which Problems Matter?
      ● Read hundreds of email threads on pgsql-performance over a period of almost two years.
      ● Disregarded all those that were not about query performance problems.
      ● Decided what I thought the root cause (or, occasionally, causes) of each complaint was.
      ● Skipped a very small number where I couldn't form an opinion.
      ● Counted the number of times each problem was reported.
  • 4. Methodology: Possible Critiques
      ● The problems reported on pgsql-performance aren't necessarily representative of all the problems PostgreSQL users encounter (reporting bias).
          ● In particular, confusing problems might be more likely to be reported.
      ● I might not have correctly identified the cause of each problem (researcher bias).
      ● Others?
  • 5. Statistically Speaking, Why Is My Query Slow? (168)
      ● Settings (23). Includes anything you can fix with postgresql.conf changes, DDL, or operating system settings changes.
      ● Just Plain Slow (23). Includes anything that amounts to an unreasonable expectation on the part of the user. These are often questions of the form “why is query A slower than query B?” when A is actually doing something much more expensive than B.
      ● We're Bad At That (22). Includes anything that could be faster in some other database product, but isn't fast in PostgreSQL for some reason (not implemented yet, or architectural artifact).
      ● Planner Error (83). Bad decisions about the cost of one plan vs. another plan due to limitations of the optimizer.
      ● Bugs (14). Bugs in the query planner, or in one case, the Linux kernel.
      ● User Error (3). User got confused and did something illogical.
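To put the six counts in proportion, the short tally below recomputes the total and each category's share. The counts are copied from the slide; the percentages are derived here and do not appear in the talk.

```python
# Complaint counts per root-cause category, copied from the slide.
categories = {
    "Settings": 23,
    "Just Plain Slow": 23,
    "We're Bad At That": 22,
    "Planner Error": 83,
    "Bugs": 14,
    "User Error": 3,
}

total = sum(categories.values())


def share(name):
    """Return a category's share of all complaints, as a percentage."""
    return 100.0 * categories[name] / total


# Print the categories from most to least frequent.
for name, count in sorted(categories.items(), key=lambda kv: -kv[1]):
    print(f"{name:<18} {count:>3}  {share(name):5.1f}%")
print(f"{'Total':<18} {total:>3}")
```

The total matches the 168 in the slide title, and planner errors come to just under half of all complaints.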
  • 6. Settings (23)
      ● Planner Cost Constants (8). Adjustments needed to seq_page_cost, random_page_cost, and perhaps cpu_tuple_cost to accurately model real costs.
      ● Missing Index (4)
      ● Cost for @@ Operator Is Too Low (2)
      ● work_mem Too Low (2)
      ● Statistics Target Too Low (2)
      ● Statistics Target Too High (1)
      ● n_distinct Estimates Aren't Accurate On Large Tables (1)
      ● Not Analyzing Tables Often Enough (1)
      ● TOAST Decompression is Slow (1)
      ● vm.zone_reclaim_mode = 1 Causes Extra Disk I/O (1)
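The effect of the planner cost constants can be sketched with a toy cost comparison. The sequential scan formula (pages times seq_page_cost plus rows times cpu_tuple_cost, for a scan with no filter) follows the PostgreSQL documentation, and the three defaults are PostgreSQL's. The index scan formula is a deliberately crude assumption made for this illustration (one random page fetch per matching row, index pages ignored); the real planner's model is considerably more detailed. All function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CostConstants:
    seq_page_cost: float = 1.0     # PostgreSQL default
    random_page_cost: float = 4.0  # PostgreSQL default
    cpu_tuple_cost: float = 0.01   # PostgreSQL default


def seq_scan_cost(pages, rows, c):
    """Cost of reading every page in order and processing every row."""
    return pages * c.seq_page_cost + rows * c.cpu_tuple_cost


def crude_index_scan_cost(matching_rows, c):
    """ASSUMPTION: one random heap page per matching row, index pages ignored."""
    return matching_rows * c.random_page_cost + matching_rows * c.cpu_tuple_cost


def cheaper_plan(pages, rows, matching_rows, c):
    """Name the plan with the lower estimated cost under this toy model."""
    seq = seq_scan_cost(pages, rows, c)
    idx = crude_index_scan_cost(matching_rows, c)
    return "index scan" if idx < seq else "seq scan"


# A 10,000-page table with 1,000,000 rows, of which 6,000 match the filter.
defaults = CostConstants()
fast_random_io = CostConstants(random_page_cost=1.1)  # e.g. data mostly cached

print(cheaper_plan(10_000, 1_000_000, 6_000, defaults))
print(cheaper_plan(10_000, 1_000_000, 6_000, fast_random_io))
```

Under the defaults the toy model picks the sequential scan; with random_page_cost lowered to 1.1 it switches to the index scan. That is the kind of plan flip the complaints in this category were about: the constants did not reflect what random I/O really cost on the reporter's hardware.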
  • 7. Just Plain Slow (23)
      ● It Takes a While to Process a Lot of Data (6)
      ● Disks Are Slower Than Memory (6)
      ● Clauses Involving Multiple Tables Can't Be Pushed Down (2)
      ● Random I/O is Slower Than Sequential I/O (1)
      ● Linearly Scanning an Array is O(n) (1)
      ● One Regular Expression is Faster Than Two (1)
      ● Can't Figure Out Which Patterns Match a String Without Trying Them All (1)
      ● xmlagg Is Much Slower Than string_agg (1)
      ● Scanning More Tables is Slower Than Scanning Fewer Tables (1)
      ● Replanning Isn't Free (1)
      ● Repeated Concatenation Using xmlconcat Is Slow (1)
      ● UNION is Slower than UNION ALL (1)
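The last item is a typical case of query A doing more work than query B. A minimal sketch follows, using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so the example runs anywhere; the tables t1 and t2 are hypothetical). UNION must remove duplicate rows, which means sorting or hashing the combined input, while UNION ALL simply concatenates.

```python
import sqlite3

# In-memory database with two small, overlapping tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (x INTEGER);
    CREATE TABLE t2 (x INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (2), (3), (4);
""")

# UNION ALL returns every row from both inputs, duplicates included.
union_all = conn.execute(
    "SELECT x FROM t1 UNION ALL SELECT x FROM t2"
).fetchall()

# UNION returns each distinct row once, so the engine has to deduplicate.
union = conn.execute(
    "SELECT x FROM t1 UNION SELECT x FROM t2"
).fetchall()

print(len(union_all), "rows from UNION ALL")
print(len(union), "rows from UNION")
```

When the inputs are known not to overlap, or duplicates are acceptable, UNION ALL avoids the deduplication step entirely.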
  • 8. We're Bad At That (22)
      ● Plan Types We Can't Generate (11)
          ● Parameterized Paths (7). Two of these are post-9.2 complaints, involving cases where 9.2 can't parameterize as needed.
          ● Merge Append (3). Fixed in 9.1.
          ● Batched Sort of Data Already Ordered By Leading Columns (1).
      ● Executor Limitations (3)
          ● Indexing Unordered Data Causes Random I/O (1)
          ● <> is Not Indexable (1)
          ● DISTINCT + HashAggregate Reads All Input Before Emitting Any Results (1). This matters if there is a LIMIT.
      ● Architecture (8)
          ● No Parallel Query (2), Table Bloat (1), Backend Startup Cost (1), Redundant Updates Are Expensive (1), AFTER Trigger Queue Size (1), On-Disk Size of numeric (1), Autovacuum Not Smart About Inherited Tables (1)
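Why the DISTINCT + HashAggregate item matters under a LIMIT can be shown with two Python generators. This is an analogy, not PostgreSQL's executor: `blocking_distinct` mimics an operator that builds its whole hash table before returning anything, and `streaming_distinct` is a hypothetical alternative that emits each new value the first time it is seen. All names are invented for this sketch.

```python
from itertools import islice


def counted(rows, stats):
    """Yield rows unchanged while counting how many were pulled from the input."""
    for row in rows:
        stats["read"] += 1
        yield row


def blocking_distinct(rows):
    """Consume the entire input, then emit the distinct values."""
    groups = dict.fromkeys(rows)  # reads every input row before emitting any
    yield from groups


def streaming_distinct(rows):
    """Emit each value the first time it appears, without reading ahead."""
    seen = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


def rows_read(distinct_fn, data, limit):
    """Run DISTINCT ... LIMIT and report the result plus input rows consumed."""
    stats = {"read": 0}
    result = list(islice(distinct_fn(counted(data, stats)), limit))
    return result, stats["read"]


data = [i % 5 for i in range(1000)]  # 1000 rows, 5 distinct values
blocking_result, blocking_read = rows_read(blocking_distinct, data, 3)
streaming_result, streaming_read = rows_read(streaming_distinct, data, 3)

print("blocking :", blocking_result, "after reading", blocking_read, "rows")
print("streaming:", streaming_result, "after reading", streaming_read, "rows")
```

Both variants return the same three values, but the blocking one reads all 1000 input rows first while the streaming one stops after 3.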
  • 9. Planner Errors (83)
      ● Any guesses?
  • 10. Planner Errors (83)
      ● Conceptual Errors (28). The planner isn't able to recognize that two different queries are equivalent, so it doesn't even consider the best plan.
      ● Estimation Errors (55). The planner considers the optimal plan, but rejects it as too expensive.
          ● Row Count Estimation Errors (48). The planner mis-estimates the number of rows that will be returned by some scan, join, or aggregate.
          ● Cost Estimation Errors (7). The planner estimates the row count correctly but incorrectly estimates the relative cost.
  • 11. Grand Prize Winners
      ● Selectivity of filter conditions involving correlated columns is estimated inaccurately (13)
          ● Suppose we want all the rows from a table where a = 1 and b = 1 and c = 1 and d = 1 and e = 1. The planner must estimate the number of rows that will match, but only has statistics on each column individually.
      ● Planner incorrectly thinks that “SELECT * FROM foo WHERE a = 1 ORDER BY b LIMIT n” will fill the limit after reading a small percentage of the index (11)
          ● It can scan an index on b and filter for rows where a = 1.
          ● Or it can scan an index on a, find all rows where a = 1, and perform a top-N sort.
          ● It often prefers the former when the latter would be faster.
          ● Can often be worked around with a composite or functional index.
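Both grand prize winners come down to an assumption of independence or uniformity that the data violates. The sketch below reproduces each effect on synthetic data. Everything in it is an assumption made for illustration: the table contents, the helper names, and the simplification that the planner multiplies per-column selectivities and expects matching rows to be spread evenly through the index on b.

```python
N = 100_000
COLUMNS = ("a", "b", "c", "d", "e")

# Part 1: five perfectly correlated columns (each row holds one value five times).
correlated = [dict.fromkeys(COLUMNS, i % 10) for i in range(N)]


def independence_estimate(table, value):
    """Multiply per-column selectivities, as if the columns were independent."""
    estimate = len(table)
    for col in COLUMNS:
        matching = sum(1 for row in table if row[col] == value)
        estimate *= matching / len(table)
    return estimate


def actual_rows(table, value):
    """Count rows where every column equals the value."""
    return sum(1 for row in table if all(row[col] == value for col in COLUMNS))


# Part 2: foo listed in index order on b; only the highest 1% of b has a = 1.
foo_by_b = [{"a": 1 if b >= 99_000 else 0, "b": b} for b in range(N)]


def uniform_expectation(table, limit):
    """Rows to scan if the a = 1 rows were spread evenly through the index."""
    matching = sum(1 for row in table if row["a"] == 1)
    return limit * len(table) / matching


def rows_scanned(table_in_b_order, limit):
    """Walk the index on b, filter a = 1, stop once the LIMIT is filled."""
    found = 0
    for scanned, row in enumerate(table_in_b_order, start=1):
        if row["a"] == 1:
            found += 1
            if found == limit:
                return scanned
    return len(table_in_b_order)


print("estimated rows:", round(independence_estimate(correlated, 1)))
print("actual rows   :", actual_rows(correlated, 1))
print("expected scan :", uniform_expectation(foo_by_b, 10))
print("actual scan   :", rows_scanned(foo_by_b, 10))
```

In the first case the independence estimate is about 1 row against 10,000 actual rows. In the second, the uniform expectation is 1,000 index entries for LIMIT 10, while the scan actually visits 99,010 before it finds ten matches.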
  • 12. Planner Error: Row Count Estimation – Others (24)
      ● Using WITH Results in a Bad Plan (5). Some of these are query flattening issues, while others result from failure to dig out variable statistics.
      ● Generic Plans Can Have Wildly Wrong Estimates (4). Improved.
      ● Selectivity Estimates on Arbitrary Expressions are Poor (4)
      ● Join Selectivity Doesn't Know about Cross-Table Correlations (3)
      ● Uncommitted Tuples Don't Affect Statistics (2)
      ● No Stats for WITH RECURSIVE (1) or GROUP BY (1) Results
      ● Redundant Equality Constraints Not Identified As Such (1)
      ● IN/NOT IN Estimation Doesn't Assume Array Elements Distinct (1). Fixed.
      ● Histogram Bounds Can Slide Due to New Data (1). Fixed.
      ● Inheritance Parents Aren't Assumed to be Completely Empty (1). Fixed.
  • 13. Planner Error: Cost Estimation (7)
      ● Planner doesn't account for de-TOASTing cost (4)
      ● Plan change causes volume of data to exceed server memory (2)
      ● Hash join sometimes decides to hash the larger table when it should probably be hashing the smaller one (1)
  • 14. Planner Error: Conceptual (28)
      ● Cross-data type comparisons are not always indexable (3)
      ● Inlining the same thing multiple times can lose (3)
      ● NOT IN is hard to optimize – and we don't try very hard (3)
      ● Target lists are computed too early or unnecessary targets are computed (3)
      ● Can't rewrite SELECT max(a) FROM foo WHERE b IN (…) as max of index scans (2)
      ● Can't rearrange joins and aggregates relative to one another (2)
      ● Can't deduce implied inequalities (2)
      ● Ten other issues that came up once each
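The max(a) item is a good example of two equivalent queries that the planner does not connect. A minimal sketch of the manual rewrite follows, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table contents, the index name foo_b_a, and the list of wanted values are assumptions made for this example. The idea is that with an index on (b, a), each per-value maximum can be read from the end of one index range, and the overall answer is the largest of those.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)", [(i, i % 7) for i in range(1000)]
)
# Composite index: within each b value, entries are ordered by a.
conn.execute("CREATE INDEX foo_b_a ON foo (b, a)")

wanted = [1, 3, 5]

# Original form: one query over the whole IN list.
placeholders = ", ".join("?" for _ in wanted)
direct = conn.execute(
    f"SELECT max(a) FROM foo WHERE b IN ({placeholders})", wanted
).fetchone()[0]

# Manual rewrite: one max() per value, then the maximum of those results.
per_value = [
    conn.execute("SELECT max(a) FROM foo WHERE b = ?", (b,)).fetchone()[0]
    for b in wanted
]
# max() over no rows yields NULL (None), so skip values with no matches.
rewritten = max(v for v in per_value if v is not None)

print("direct   :", direct)
print("rewritten:", rewritten)
```

Both forms return the same value. The rewrite only helps when each single-value lookup is cheap, so it is worth checking the actual plans with EXPLAIN before adopting it.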
  • 15. Thank You
      ● Any questions?