Shaping Optimizer's Search Space

In this presentation, Markus Winand explains how developers can help the database optimizer come up with better execution plans.

Published in: Software

  1. Shaping the Optimizer’s Search-Space (@MarkusWinand)
  2. The Optimizer’s Search-Space is Limited
     “the query optimizer determines the most efficient execution plan*”
     ...most efficient? Out of what?
     * http://docs.oracle.com/cd/E16655_01/server.121/e15857/pfgrf_perf_overview.htm#TGDBA94082
  3. The Optimizer’s Search-Space is Limited
     The Optimizer...
     ‣ Considers existing indexes only
       ➡ Other indexes might give even better performance
     ‣ Doesn’t de-obfuscate queries very well
       ➡ Writing it in simpler terms might improve performance
     ‣ Has built-in limitations
       ➡ Some theoretically possible plans are never considered
  4. Bring the Best Plan in the Search-Space
     ... it determines the most efficient execution plan out of the remaining ones.
     Before the optimizer can find the absolutely best plan, we must first make sure it is within these boundaries.
  5. It’s All About Matching Queries to Indexes
     Two steps to get the absolutely best access path:
     1. Maximize data-locality
        ‣ Plain old B-tree index is the #1 tool for that
        ‣ Partitions are greatly overrated
        ‣ Table clusters are slightly underrated
     2. Write the query to exploit it
        ‣ Use explicit range conditions
        ‣ Use top-n based termination
        ‣ Exploit index order
     Thinking in Ordered Sets
  6. Visualizing Indexes as Pyramids (figure labels: Visualize, Simplify)
  7. The Order of Multi-Column Indexes
  8. The Order of Multi-Column Indexes
  9. The Order of Multi-Column Indexes
  10. Using Indexes: Column Order Defines Row-Locality
      Example: WHERE A > :a AND B = :b
  11. Using Indexes: Column Order Defines Row-Locality
      Example: WHERE A > :a AND B = :b
  12. Using Indexes: Column Order Defines Row-Locality
      Simple-man’s guidelines (best in ~97% of the cases):
      ‣ Conjunctive equality conditions are king: column order doesn’t affect data-locality
        ➡ Put them first into the index and choose the column order so that other queries can use the index too.
      ‣ Conjunctive range conditions are tricky: column order affects data-locality
        ➡ Put them after the equality columns. If there are multiple range conditions, put the most selective first.
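The guideline above can be illustrated with a small simulation. This is only a sketch, not how Oracle stores or scans an index: a B-tree index is modelled as a sorted Python list of tuples, and `entries_scanned` (a hypothetical helper) counts how many index entries a range scan would have to read for `WHERE A > :a AND B = :b`. The data set (100 values of A combined with 10 values of B) is made up for illustration.

```python
from bisect import bisect_right

# Hypothetical data: every combination of A in 0..99 and B in 0..9.
rows = [(a, b) for a in range(100) for b in range(10)]

# A multi-column index modelled as a list sorted by its column order.
index_ab = sorted((a, b) for a, b in rows)  # index on (A, B)
index_ba = sorted((b, a) for a, b in rows)  # index on (B, A)

INF = float("inf")


def entries_scanned(a_min, b_eq):
    """Return {index: (scanned, matching)} for A > a_min AND B = b_eq."""
    # Index (A, B): only A > a_min bounds the scanned range; B = b_eq has
    # to be checked entry by entry inside that range (a filter predicate).
    start = bisect_right(index_ab, (a_min, INF))
    scanned_ab = index_ab[start:]
    matching_ab = [entry for entry in scanned_ab if entry[1] == b_eq]

    # Index (B, A): both conditions bound one contiguous range
    # (pure access predicates), so every scanned entry matches.
    start = bisect_right(index_ba, (b_eq, a_min))
    stop = bisect_right(index_ba, (b_eq, INF))
    scanned_ba = index_ba[start:stop]

    return {
        "(A, B)": (len(scanned_ab), len(matching_ab)),
        "(B, A)": (len(scanned_ba), len(scanned_ba)),
    }


result = entries_scanned(a_min=25, b_eq=2)
print(result)
```

With this data both column orders find the same 74 matching entries, but the (A, B) order reads 740 entries to find them, while the (B, A) order reads exactly the 74 it returns.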
  13. Using Indexes: Column Order Defines Row-Locality
      Common mistakes:
      ‣ Arbitrary column order ☠ (bad): “Just put all columns from the where-clause in the index”
        ➡ Works only for all-conjunctive all-equality searches
        ➡ Doesn’t make the index useful for other queries
      ‣ Most-selective first ☠ (bad): “Order the columns according to the selectivity”
        ➡ Only valid to prioritize among multiple range conditions
  14. Using Indexes: Finding Bad Index Row-Locality
      Index on (A, B): most efficient workaround
      -------------------------------------
      | Id | Operation                    |
      -------------------------------------
      |  0 | SELECT STATEMENT             |
      |  1 |  TABLE ACCESS BY INDEX ROWID |
      |* 2 |   INDEX SKIP SCAN            |
      -------------------------------------
      Predicate Information:
      -------------------------------------
         2 - access("B"=20 AND "A">25)
             filter("B"=20)
      Index on (B, A): most efficient solution
      -------------------------------------
      | Id | Operation                    |
      -------------------------------------
      |  0 | SELECT STATEMENT             |
      |  1 |  TABLE ACCESS BY INDEX ROWID |
      |* 2 |   INDEX RANGE SCAN           |
      -------------------------------------
      Predicate Information:
      -------------------------------------
         2 - access("B"=20 AND "A">25)
      ‣ Index filter predicates are a “bad smell”
      ‣ Index Skip Scan is a “bad smell”
      ‣ Index Fast Full Scan is a “bad smell”
  15. Using Indexes: Trailing-Columns to Avoid Table-Access
      Example: SELECT C FROM X WHERE A > :a AND B = :b
  16. Using Indexes: Trailing-Columns to Avoid Table-Access
      Example: SELECT C FROM X WHERE A > :a AND B = :b
  17. Using Indexes: Trailing-Columns to Avoid Table-Access
      Add all needed columns to the index to avoid table access: the so-called index-only scan.
      ‣ Useful to nullify a bad clustering factor. Consequently, not very useful if
        ➡ Clustering factor close to the number of table blocks, or
        ➡ Selecting only a few rows
      ‣ A single non-indexed column breaks it, no matter where it is mentioned (SELECT, ORDER BY, ...)
        ➡ All or nothing: no benefit from adding some SELECT columns to the index.
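The "all or nothing" behaviour can be tried out with SQLite as a stand-in for Oracle. This is a sketch under assumptions: the table `x`, its columns and data are hypothetical, `plan` is a small helper defined here, and SQLite's "COVERING INDEX" plan note is only the closest equivalent of Oracle's index-only scan, not the same thing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?, ?)",
    [(i, i % 10, i * 2, i * 3) for i in range(1000)],
)
# Equality column first, range column second, selected column last.
conn.execute("CREATE INDEX x_b_a_c ON x (b, a, c)")


def plan(sql, params):
    """Join the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Every referenced column (a, b, c) is in the index: no table access needed.
covered = plan("SELECT c FROM x WHERE a > ? AND b = ?", (25, 2))

# Column d is not in the index, so the table has to be visited again.
not_covered = plan("SELECT c, d FROM x WHERE a > ? AND b = ?", (25, 2))

print(covered)
print(not_covered)
```

Adding a single column that is not part of the index (`d` here) is enough to lose the covering-index note in the second plan.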
  18. Using Indexes: Trailing-Columns to Avoid Table-Access
      Common mistakes:
      ‣ Selecting unneeded columns* ☠ (bad): SELECT * anybody? ORM-tools in use? Hooray!
        ➡ Adding many columns to many indexes is a no-no.
      ‣ Pushing too hard ☠ (bad)
        ➡ Index gets bigger, clustering factor (CF) gets worse
        ➡ Small benefit for low CF or if selecting a few rows only
        ➡ You’ll hit the hard limits (32 columns, 6398 bytes@8k)
      * http://use-the-index-luke.com/blog/2013-08/its-not-about-the-star-stupid
  19. It’s All About Matching Queries to Indexes
      Two steps to get the absolutely best access path:
      1. Maximize data-locality
         ‣ Plain old B-tree index is the #1 tool for that
         ‣ Partitions are greatly overrated
         ‣ Table clusters are slightly underrated
      2. Write the query to exploit it
         ‣ Use explicit range conditions
         ‣ Use top-n based termination
         ‣ Exploit index order
      Thinking in Ordered Sets
      Checkmarks shown: ✓ ✓
  20. Example: List yesterday’s orders
      CREATE TABLE orders (
        ...,
        order_dt DATE NOT NULL,
        ...
      );
      INSERT INTO orders (..., order_dt, ...)
      VALUES (..., sysdate, ...);
      100k rows, evenly distributed over 4 weeks.
  21. Example: List yesterday’s orders
      1. Maximize data-locality
  22. Example: List yesterday’s orders
      2. Write query using explicit range conditions
         1. Lower bound: ORDER_DT >= TRUNC(sysdate-1)
         2. Upper bound: ORDER_DT < TRUNC(sysdate)
      ----------------------------------------------
      | Id | Operation                             |
      ----------------------------------------------
      |  0 | SELECT STATEMENT                      |
      |* 1 |  FILTER                               |
      |  2 |   TABLE ACCESS BY INDEX ROWID BATCHED |
      |* 3 |    INDEX RANGE SCAN                   |
      ----------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter(TRUNC(SYSDATE@!)>TRUNC(SYSDATE@!-1))
         3 - access("ORDER_DT">=TRUNC(SYSDATE@!-1) AND "ORDER_DT"<TRUNC(SYSDATE@!))
  23. Example: List yesterday’s orders
      Common anti-pattern:
      ‣ TRUNC(order_dt) = :yesterday ☠ (bad)
        This is an “obfuscation” of the actual intention
        ➡ Requires function-based index:
           CREATE INDEX … (TRUNC(order_dt));
        ➡ Doesn’t support ordering by order_dt (the index is not ordered by that):
           WHERE TRUNC(order_dt) = :yesterday
           ORDER BY order_dt DESC;
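The difference between the explicit range and the function-wrapped condition can be reproduced with SQLite as a stand-in for Oracle. This is a sketch under assumptions: dates are stored as ISO text, SQLite's `date()` plays the role of `TRUNC()`, the "current date" is fixed to an arbitrary value so the result is reproducible, and the sample data is invented.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_dt TEXT NOT NULL)")
conn.execute("CREATE INDEX orders_dt ON orders (order_dt)")

# Hypothetical data: one order every 6 hours for 7 days.
first = datetime(2024, 1, 10)
conn.executemany(
    "INSERT INTO orders (order_dt) VALUES (?)",
    [((first + timedelta(hours=6 * i)).strftime(FMT),) for i in range(28)],
)

today = datetime(2024, 1, 15)  # fixed stand-in for TRUNC(sysdate)
yesterday = today - timedelta(days=1)

# Explicit lower and upper bound on the indexed column itself.
explicit_sql = (
    "SELECT id FROM orders WHERE order_dt >= ? AND order_dt < ? "
    "ORDER BY order_dt DESC"
)
explicit_args = (yesterday.strftime(FMT), today.strftime(FMT))

# The "obfuscated" variant: a function wrapped around the column.
obfuscated_sql = "SELECT id FROM orders WHERE date(order_dt) = ? ORDER BY order_dt DESC"
obfuscated_args = (yesterday.strftime("%Y-%m-%d"),)


def plan(sql, args):
    """Join the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[3] for row in rows)


explicit_ids = [r[0] for r in conn.execute(explicit_sql, explicit_args)]
obfuscated_ids = [r[0] for r in conn.execute(obfuscated_sql, obfuscated_args)]
explicit_plan = plan(explicit_sql, explicit_args)
obfuscated_plan = plan(obfuscated_sql, obfuscated_args)

print(explicit_ids, explicit_plan)
print(obfuscated_ids, obfuscated_plan)
```

Both queries return the same rows, but only the explicit range shows up as an index search in the plan; the wrapped column forces a scan because no index exists on the expression.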
  24. Example: List yesterday’s orders reverse chronologically
      2. Write query - exploit index order
         1. Lower & upper bounds:
            ORDER_DT >= TRUNC(sysdate-1)
            ORDER_DT < TRUNC(sysdate)
         2. Order: ORDER BY ORDER_DT DESC
      ---------------------------------------
      | Id | Operation                      |
      ---------------------------------------
      |  0 | SELECT STATEMENT               |
      |* 1 |  FILTER                        |
      |  2 |   TABLE ACCESS BY INDEX ROWID  |
      |* 3 |    INDEX RANGE SCAN DESCENDING |
      ---------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter(TRUNC(SYSDATE@!)>TRUNC(SYSDATE@!-1))
         3 - access("ORDER_DT"<TRUNC(SYSDATE@!) AND "ORDER_DT">=TRUNC(SYSDATE@!-1))
  25. Example: List yesterday’s orders reverse chronologically
      2. Write query - exploit index order
         1. Lower & upper bounds: TRUNC(ORDER_DT) = TRUNC(sysdate)-1
         2. Order: ORDER BY ORDER_DT DESC
      ---------------------------------------
      | Id | Operation                      |
      ---------------------------------------
      |  0 | SELECT STATEMENT               |
      |* 1 |  FILTER                        |
      |  2 |   TABLE ACCESS BY INDEX ROWID  |
      |* 3 |    INDEX RANGE SCAN DESCENDING |
      ---------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter(TRUNC(SYSDATE@!)>TRUNC(SYSDATE@!-1))
         3 - access("ORDER_DT"<TRUNC(SYSDATE@!) AND "ORDER_DT">=TRUNC(SYSDATE@!-1))
  26. Example: List yesterday’s orders reverse chronologically
      2. Write query - exploit index order
         1. Lower & upper bounds: TRUNC(ORDER_DT) = TRUNC(sysdate)-1
         2. Order: ORDER BY ORDER_DT DESC
      ----------------------------------------------
      | Id | Operation                             |
      ----------------------------------------------
      |  0 | SELECT STATEMENT                      |
      |  1 |  SORT ORDER BY                        |
      |  2 |   TABLE ACCESS BY INDEX ROWID BATCHED |
      |* 3 |    INDEX RANGE SCAN                   |
      ----------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         3 - access("ORDERS"."SYS_NC00004$"=TRUNC(SYSDATE@!-1))
      Tradeoff: CPU, Memory, IO
  27. Example: List orders from last 24 hours
      1. Data-locality for the TRUNC variant
      * http://www.sqlfail.com/2014/05/05/oracle-can-now-use-function-based-indexes-in-queries-without-functions/
  28. Example: List orders from last 24 hours
      2. Write query using explicit range conditions
         1. Lower bound: ORDER_DT > sysdate - 1
         2. Upper bound: none (unbounded)
      To use the function-based index (FBI), Oracle adds (since 11.2.0.2*):
         TRUNC(ORDER_DT)>=TRUNC(sysdate-1)
      ----------------------------------------------
      | Id | Operation                             |
      ----------------------------------------------
      |  0 | SELECT STATEMENT                      |
      |* 1 |  TABLE ACCESS BY INDEX ROWID BATCHED  |
      |* 2 |   INDEX RANGE SCAN on TRUNC(ORDER_DT) |
      ----------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter("ORDER_DT">SYSDATE@!-1)
         2 - access("ORDERS"."SYS_NC00004$">=TRUNC(SYSDATE@!-1))
      * http://www.sqlfail.com/2014/05/05/oracle-can-now-use-function-based-indexes-in-queries-without-functions/
  29. Example: List orders from last 24 hours
      1. Maximize data-locality using straight index
  30. Example: List orders from last 24 hours
      2. Write query using explicit range conditions
         1. Lower bound: ORDER_DT > sysdate - 1
         2. Upper bound: none (unbounded)
      ---------------------------------------------
      | Id | Operation                            |
      ---------------------------------------------
      |  0 | SELECT STATEMENT                     |
      |  1 |  TABLE ACCESS BY INDEX ROWID BATCHED |
      |* 2 |   INDEX RANGE SCAN                   |
      ---------------------------------------------
      Predicate Information:
      ----------------------
         2 - access("ORDER_DT">SYSDATE@!-1)
  31. Example: List orders from last 24 hours
      Straight index: most efficient solution
      ---------------------------------------------
      | Id | Operation                            |
      ---------------------------------------------
      |  0 | SELECT STATEMENT                     |
      |  1 |  TABLE ACCESS BY INDEX ROWID BATCHED |
      |* 2 |   INDEX RANGE SCAN                   |
      ---------------------------------------------
      Predicate Information:
      ----------------------
         2 - access("ORDER_DT">SYSDATE@!-1)
      Function-based index: most efficient workaround
      ---------------------------------------------
      | Id | Operation                            |
      ---------------------------------------------
      |  0 | SELECT STATEMENT                     |
      |* 1 |  TABLE ACCESS BY INDEX ROWID BATCHED |
      |* 2 |   INDEX RANGE SCAN                   |
      ---------------------------------------------
      Predicate Information:
      ----------------------
         1 - filter("ORDER_DT">SYSDATE@!-1)
         2 - access("ORDERS"."SYS_NC00004$">=TRUNC(SYSDATE@!-1))
  32. It’s All About Matching Queries to Indexes
      Two steps to get the absolutely best access path:
      1. Maximize data-locality
         ‣ Plain old B-tree index is the #1 tool for that
         ‣ Partitions are greatly overrated
         ‣ Table clusters are slightly underrated
      2. Write the query to exploit it
         ‣ Use explicit range conditions
         ‣ Use top-n based termination
         ‣ Exploit index order
      Thinking in Ordered Sets
      Checkmarks shown: ✓ ✓ ✓ ✓
  33. Example: List 10 Most Recent Orders
      1. Maximize data-locality
  34. Example: List 10 Most Recent Orders
      2. Write query using explicit range conditions
         1. Lower bound...? After 10 rows...???
         2. Upper bound? sysdate? Unbounded!
  35. Example: List 10 Most Recent Orders
      2. Write query using top-n based termination
         1. Lower bound...? After 10 rows...???
         2. Upper bound? sysdate? Unbounded!
         3. Start with: most recent
            ORDER BY ORDER_DT DESC
         4. Stop after: 10 rows
            FETCH FIRST 10 ROWS ONLY (since 12c)
  36. Example: List 10 Most Recent Orders
      2. Write query using top-n based termination
         3. Start with: most recent
            ORDER BY ORDER_DT DESC
         4. Stop after: 10 rows
            FETCH FIRST 10 ROWS ONLY (since 12c)
      ----------------------------------------------------------
      | Id | Operation                      | A-Rows | Buffers |
      ----------------------------------------------------------
      |  0 | SELECT STATEMENT               |     10 |       8 |
      |* 1 |  VIEW                          |     10 |       8 |
      |* 2 |   WINDOW NOSORT STOPKEY        |     10 |       8 |
      |  3 |    TABLE ACCESS BY INDEX ROWID |     11 |       8 |
      |  4 |     INDEX FULL SCAN DESCENDING |     11 |       3 |
      ----------------------------------------------------------
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
         2 - filter(ROW_NUMBER() OVER (ORDER BY ORDER_DT DESC)<=10)
      Callout: ROW_NUMBER() OVER (ORDER BY ORDER_DT DESC)<=10
  37. Window-Functions for Top-N Termination
      SELECT orders.*
           , ROW_NUMBER() OVER (
               ORDER BY order_dt DESC
             ) rn
        FROM orders
  38. Window-Functions for Top-N Termination
      Select 10 rows:
      SELECT *
        FROM (
          SELECT orders.*
               , ROW_NUMBER() OVER (
                   ORDER BY order_dt DESC
                 ) rn
            FROM orders
        )
       WHERE rn <= 10
       ORDER BY order_dt DESC;
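The equivalence between the row-limiting clause and the `ROW_NUMBER()` formulation can be checked with SQLite as a stand-in. This is a sketch under assumptions: SQLite has no `FETCH FIRST`, so `LIMIT` is used in its place, window functions need SQLite 3.25 or later, and the 100 sample orders are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_dt TEXT NOT NULL)")
conn.execute("CREATE INDEX orders_dt ON orders (order_dt)")

# Hypothetical data: 100 orders, one per hour, all with distinct timestamps.
conn.executemany(
    "INSERT INTO orders (order_dt) VALUES (?)",
    [(f"2024-01-{1 + i // 24:02d} {i % 24:02d}:00:00",) for i in range(100)],
)

# Top-n termination: start with the most recent row, stop after 10 rows.
limit_sql = "SELECT id, order_dt FROM orders ORDER BY order_dt DESC LIMIT 10"

# The same result expressed with a window function, as on the slide.
window_sql = """
    SELECT id, order_dt
      FROM (
        SELECT orders.*
             , ROW_NUMBER() OVER (ORDER BY order_dt DESC) AS rn
          FROM orders
      )
     WHERE rn <= 10
     ORDER BY order_dt DESC
"""

top_limit = conn.execute(limit_sql).fetchall()
top_window = conn.execute(window_sql).fetchall()
print(top_limit == top_window, [row[0] for row in top_limit])
```

Whether the window variant also stops early (the `WINDOW NOSORT STOPKEY` step in the Oracle plan) depends on the database; the sketch only confirms that both formulations select the same 10 rows.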
  39. Window-Functions for Top-N Termination
      Select 1 group:
      SELECT *
        FROM (
          SELECT orders.*
               , DENSE_RANK() OVER (
                   ORDER BY TRUNC(order_dt) DESC
                 ) rn
            FROM orders
        )
       WHERE rn <= 1
       ORDER BY order_dt DESC;
  40. Window-Functions for Top-N Termination
      Useful to abort on edges:
      SELECT *
        FROM (
          SELECT orders.*
               , DENSE_RANK() OVER (
                   ORDER BY TRUNC(order_dt) DESC
                 ) rn
            FROM orders
        )
       WHERE rn <= 1
       ORDER BY order_dt DESC;
  41. Window-Functions for Top-N Termination
      SELECT *
        FROM (
          SELECT orders.*
               , DENSE_RANK() OVER (
                   ORDER BY TRUNC(order_dt) DESC
                 ) rn
            FROM orders
        )
       WHERE rn <= 1
       ORDER BY order_dt DESC;
      DENSE_RANK plan:
      ----------------------------------------------------------------------------
      | Id | Operation                       | E-Rows | A-Rows | Buffers | Reads |
      ----------------------------------------------------------------------------
      |  0 | SELECT STATEMENT                |        |   2057 |     695 |   695 |
      |  1 |  SORT ORDER BY                  |   100K |   2057 |     695 |   695 |
      |* 2 |   VIEW                          |   100K |   2057 |     695 |   695 |
      |* 3 |    WINDOW NOSORT STOPKEY        |   100K |   2057 |     695 |   695 |
      |  4 |     TABLE ACCESS BY INDEX ROWID |   100K |   2058 |     695 |   695 |
      |  5 |      INDEX FULL SCAN DESCENDING |   100K |   2058 |       8 |     8 |
      ----------------------------------------------------------------------------
  42. Window-Functions for Top-N Termination
      SELECT *
        FROM orders
       WHERE TRUNC(order_dt)
           = (SELECT TRUNC(MAX(order_dt))
                FROM orders
             )
       ORDER BY order_dt;
      DENSE_RANK plan (for comparison):
      ----------------------------------------------------------------------------
      | Id | Operation                       | E-Rows | A-Rows | Buffers | Reads |
      ----------------------------------------------------------------------------
      |  0 | SELECT STATEMENT                |        |   2057 |     695 |   695 |
      |  1 |  SORT ORDER BY                  |   100K |   2057 |     695 |   695 |
      |* 2 |   VIEW                          |   100K |   2057 |     695 |   695 |
      |* 3 |    WINDOW NOSORT STOPKEY        |   100K |   2057 |     695 |   695 |
      |  4 |     TABLE ACCESS BY INDEX ROWID |   100K |   2058 |     695 |   695 |
      |  5 |      INDEX FULL SCAN DESCENDING |   100K |   2058 |       8 |     8 |
      ----------------------------------------------------------------------------
  43. Window-Functions for Top-N Termination
      SUB-SELECT plan:
      ----------------------------------------------------------------------------------
      | Id | Operation                             | E-Rows | A-Rows | Buffers | Reads |
      ----------------------------------------------------------------------------------
      |  0 | SELECT STATEMENT                      |        |   2057 |    1038 |   694 |
      |  1 |  SORT ORDER BY                        |   3448 |   2057 |    1038 |   694 |
      |  2 |   TABLE ACCESS BY INDEX ROWID BATCHED |   3448 |   2057 |    1038 |   694 |
      |* 3 |    INDEX RANGE SCAN                   |   3448 |   2057 |      10 |     8 |
      |  4 |     SORT AGGREGATE                    |      1 |      1 |       2 |     2 |
      |  5 |      INDEX FULL SCAN (MIN/MAX)        |      1 |      1 |       2 |     2 |
      ----------------------------------------------------------------------------------
      DENSE_RANK plan:
      ----------------------------------------------------------------------------
      | Id | Operation                       | E-Rows | A-Rows | Buffers | Reads |
      ----------------------------------------------------------------------------
      |  0 | SELECT STATEMENT                |        |   2057 |     695 |   695 |
      |  1 |  SORT ORDER BY                  |   100K |   2057 |     695 |   695 |
      |* 2 |   VIEW                          |   100K |   2057 |     695 |   695 |
      |* 3 |    WINDOW NOSORT STOPKEY        |   100K |   2057 |     695 |   695 |
      |  4 |     TABLE ACCESS BY INDEX ROWID |   100K |   2058 |     695 |   695 |
      |  5 |      INDEX FULL SCAN DESCENDING |   100K |   2058 |       8 |     8 |
      ----------------------------------------------------------------------------
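The two formulations compared in these plans can be run side by side with SQLite as a stand-in. This is a sketch under assumptions: `date()` replaces `TRUNC()`, window functions need SQLite 3.25 or later, and the data is invented. It only shows that both queries select the same group of rows; it says nothing about Oracle's buffer or read counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_dt TEXT NOT NULL)")
conn.execute("CREATE INDEX orders_dt ON orders (order_dt)")

# Hypothetical data: 100 hourly orders; the most recent day holds only 4 of them.
conn.executemany(
    "INSERT INTO orders (order_dt) VALUES (?)",
    [(f"2024-01-{1 + i // 24:02d} {i % 24:02d}:00:00",) for i in range(100)],
)

# Variant 1: select the first group (the most recent day) with DENSE_RANK.
dense_rank_sql = """
    SELECT id
      FROM (
        SELECT orders.*
             , DENSE_RANK() OVER (ORDER BY date(order_dt) DESC) AS rn
          FROM orders
      )
     WHERE rn <= 1
     ORDER BY order_dt DESC
"""

# Variant 2: look up the most recent day with a MAX() sub-select.
sub_select_sql = """
    SELECT id
      FROM orders
     WHERE date(order_dt) = (SELECT date(MAX(order_dt)) FROM orders)
     ORDER BY order_dt DESC
"""

dense_rank_ids = [row[0] for row in conn.execute(dense_rank_sql)]
sub_select_ids = [row[0] for row in conn.execute(sub_select_sql)]
print(dense_rank_ids, sub_select_ids)
```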
  44. Top-N vs. Max()-Subquery
      Common mistakes:
      ‣ Breaking ties with sub-queries ☠ (bad)
        WHERE (a, b) = (select max(a), max(b) ...)
        ➡ max() values coming from different rows...
        ➡ No rows selected.
      ‣ Selecting Nth largest ☠ (bad)
        WHERE X < (SELECT MAX()...
                    WHERE X < (SELECT MAX()...))
        WHERE (N-1) = (SELECT COUNT(DISTINCT(DT))...
  45. It’s All About Matching Queries to Indexes
      Two steps to get the absolutely best execution plan:
      1. Maximize data-locality
         ‣ Plain old B-tree index is the #1 tool for that
         ‣ Partitions are greatly overrated
         ‣ Table clusters are slightly underrated
      2. Write the query to exploit it
         ‣ Use explicit range conditions
         ‣ Use top-n based termination
         ‣ Exploit index order
      Thinking in Ordered Sets
      Checkmarks shown: ✓ ✓ ✓ ✓ ✓
  46. Example: List next 10 orders
      1. Maximize data-locality
  47. Example: List next 10 orders
      2. Use explicit range condition & top-n abort
         1. Lower bound: unbounded (top-n)
         2. Upper bound: where we stopped
            WHERE ORDER_DT < :prev_dt
         3. ORDER BY ORDER_DT DESC
         4. FETCH FIRST 10 ROWS ONLY
      What about ties?
  48. Explicit range conditions: the general case
      Example: List next 10 orders
  49. Explicit range conditions: the general case
      Example: List next 10 orders
      1. Use definite sort order
      2. Use Row-Value filter to remove what we have seen before (SQL:92)
      3. Hit Enter
  50. Explicit range conditions: the general case
      Example: List next 10 orders
  51. Explicit range conditions: the general case
      Example: List next 10 orders
      (x,y) = (a,b)             ✓
      (x,y) IN ((a,b),(c,d))    ✓
      (x,y) < (a,b)             ✗ Oracle limitation
      (x,y) > (a,b)             ✗ Oracle limitation
  52. Explicit range conditions: the general case
      Example: List next 10 orders
      Oracle limitation: two semantically equivalent workarounds:
      ‣ X <= A AND NOT(X=A AND Y>=B)
      ‣ (X < A) OR (X = A AND Y < B)   ☠ No proper index use*
      * http://use-the-index-luke.com/sql/partial-results/fetch-next-page#sb-equivalent-logic
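The row-value filter and its two workarounds can be compared on a small data set with ties. This is a sketch under assumptions: SQLite (3.15 or later) is used because it accepts `(x, y) < (a, b)` directly, the `id` column is a hypothetical tie-breaker that makes the sort order definite, and `fetch_page` is a helper defined here. It checks logical equivalence only, not index usage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_dt TEXT NOT NULL)")
conn.execute("CREATE INDEX orders_dt_id ON orders (order_dt, id)")

# Hypothetical data with ties: three orders per day over four days.
conn.executemany(
    "INSERT INTO orders (order_dt) VALUES (?)",
    [(f"2024-01-{1 + i // 3:02d}",) for i in range(12)],
)

PAGE_SIZE = 4
ORDERING = f"ORDER BY order_dt DESC, id DESC LIMIT {PAGE_SIZE}"


def fetch_page(condition="1 = 1", params=()):
    """Fetch one page in a definite sort order, filtered by `condition`."""
    sql = f"SELECT order_dt, id FROM orders WHERE {condition} {ORDERING}"
    return conn.execute(sql, params).fetchall()


first_page = fetch_page()
prev = {"dt": first_page[-1][0], "id": first_page[-1][1]}  # where we stopped

# Filtering on the date alone skips the remaining rows of a tied date.
naive_page = fetch_page("order_dt < :dt", {"dt": prev["dt"]})

# Row-value filter and the two workarounds from the slide.
conditions = {
    "row value": "(order_dt, id) < (:dt, :id)",
    "AND NOT": "order_dt <= :dt AND NOT (order_dt = :dt AND id >= :id)",
    "OR": "(order_dt < :dt) OR (order_dt = :dt AND id < :id)",
}
next_pages = {name: fetch_page(cond, prev) for name, cond in conditions.items()}

print(first_page)
print(naive_page)
print(next_pages)
```

The first page ends in the middle of a day, so the date-only filter loses two rows, while all three complete conditions continue exactly where the first page stopped.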
  53. Using OFFSET to fetch next rows
      ‣ After SQL:2008 added FETCH FIRST ... ROWS ONLY, SQL:2011 introduced OFFSET to skip rows.
      ‣ Rows can be skipped with the ROWNUM pseudo column too (ROWNUM > :x)
      ‣ ROW_NUMBER() can do the trick too.
      It doesn’t matter how you write it, ...
  54. Using OFFSET to fetch next rows
      OFFSET = SLEEP
      The bigger the number, the slower the execution.
      Even worse: it eats up resources and yields drifting results.
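The "drifting results" can be reproduced in a few lines. This is a sketch under assumptions: SQLite (3.15 or later, for row values) stands in for Oracle, `LIMIT ... OFFSET` replaces the SQL:2011 syntax, and the ten sample orders plus the late-arriving eleventh are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_dt TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO orders (id, order_dt) VALUES (?, ?)",
    [(i, f"2024-01-{i:02d}") for i in range(1, 11)],
)

ORDERING = "ORDER BY order_dt DESC, id DESC LIMIT 3"

# Page 1, and the position where it stopped.
page1 = conn.execute(f"SELECT order_dt, id FROM orders {ORDERING}").fetchall()
last_dt, last_id = page1[-1]

# A new order arrives while the user is still looking at page 1.
conn.execute("INSERT INTO orders (id, order_dt) VALUES (11, '2024-01-11')")

# Page 2 via OFFSET: counts rows from the top again, so everything shifts by one.
offset_page2 = conn.execute(
    f"SELECT order_dt, id FROM orders {ORDERING} OFFSET 3"
).fetchall()

# Page 2 via explicit range condition: continues after the last row seen.
keyset_page2 = conn.execute(
    f"SELECT order_dt, id FROM orders WHERE (order_dt, id) < (?, ?) {ORDERING}",
    (last_dt, last_id),
).fetchall()

page1_ids = [row[1] for row in page1]
offset_ids = [row[1] for row in offset_page2]
keyset_ids = [row[1] for row in keyset_page2]
print(page1_ids, offset_ids, keyset_ids)
```

The OFFSET variant shows order 8 a second time because the new row pushed everything down by one position; the range condition is not affected by rows inserted above the point where the previous page stopped.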
  55. It’s All About Matching Queries to Indexes
      Two steps to get the absolutely best access path:
      1. Maximize data-locality
         ‣ Plain old B-tree index is the #1 tool for that
         ‣ Partitions are greatly overrated
         ‣ Table clusters are slightly underrated
      2. Write the query to exploit it
         ‣ Use explicit range conditions
         ‣ Use top-n based termination
         ‣ Exploit index order
      Thinking in Ordered Sets
      Checkmarks shown: ✓ ✓ ✓ ✓ ✓ ✓
  56. Index Smart, Not Hard
  57. About @MarkusWinand
      ‣ Training for Developers
        ‣ SQL Performance (Indexing)
        ‣ Modern SQL
        ‣ On-Site or Online
      ‣ SQL Tuning
        ‣ Index-Redesign
        ‣ Query Improvements
        ‣ On-Site or Online
      http://winand.at/
  58. About @MarkusWinand
      @ModernSQL: http://modern-sql.com
      @SQLPerfTips: http://use-the-index-luke.com
