The document discusses strategies for optimizing queries by shaping the optimizer's search space. It recommends:
1. Maximizing data locality, primarily with plain B-tree indexes; the slides call partitions greatly overrated and table clusters slightly underrated.
2. Writing queries to explicitly exploit indexes by using range conditions, ordering results to match the index order, and terminating scans after a specified number of rows.
3. Ordering columns in multi-column indexes to match the predicates in common queries, with equality conditions before range conditions.
2. The Optimizer’s Search-Space is Limited
“the query optimizer determines the most efficient execution plan*”
...most efficient? Out of what?
*http://docs.oracle.com/cd/E16655_01/server.121/e15857/pfgrf_perf_overview.htm#TGDBA94082
3. The Optimizer’s Search-Space is Limited
The Optimizer...
‣Considers existing indexes only
➡ Other indexes might give even better performance
‣Doesn’t de-obfuscate queries very well
➡ Writing it in simpler terms might improve performance
‣Has built-in limitations
➡ Some theoretically possible plans are never considered
4. Bring the Best Plan in the Search-Space
... it determines the most efficient
execution plan out of the remaining ones.
Before the optimizer can find the
absolutely best plan we must first
make sure it is within these boundaries.
7. It’s All About Matching Queries to Indexes
Two steps to get the absolutely best access path:
1. Maximize data-locality
‣ Plain old B-tree index is the #1 tool for that
‣ Partitions are greatly overrated
‣ Table clusters are slightly underrated
2. Write the query to exploit it
‣ Use explicit range conditions
‣ Use top-n based termination
‣ Exploit index order
Thinking in Ordered Sets
14. Using Indexes:
Column Order Defines Row-Locality
Simple-man’s guidelines (best in ~97% of the cases):
‣ Conjunctive equality conditions are king
Column order doesn’t affect data-locality
➡ Put them first into the index and choose the column
order so that other queries can use the index too.
‣ Conjunctive range conditions are tricky
Column order affects data-locality
➡ Put them after the equality columns. If there are
multiple range conditions, put the most-selective first.
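The guideline above can be sketched as follows. This is only an illustration: the ORDERS table and the CUSTOMER_ID and STATUS columns are assumptions made for the example (only ORDER_DT appears in the later slides), and the index name is hypothetical.

```sql
-- Hypothetical query: two equality conditions and one range condition.
--   SELECT * FROM orders
--    WHERE customer_id = :cust
--      AND status      = :status
--      AND order_dt   >= :since;

-- Equality columns first (their relative order can be chosen to suit
-- other queries), range column last.
CREATE INDEX orders_cust_status_dt
    ON orders (customer_id, status, order_dt);

-- Because CUSTOMER_ID leads, a second (hypothetical) query that filters
-- on CUSTOMER_ID alone can use the same index:
--   SELECT * FROM orders WHERE customer_id = :cust;
```

Putting ORDER_DT first instead would make the range condition the leading access predicate and push the equality columns toward filter predicates, which is the kind of plan the next slides flag as a bad smell.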
15. Using Indexes:
Column Order Defines Row-Locality
Common mistakes:
‣ Arbitrary column order ☠ (bad)
“Just put all columns from the where-clause in the index”
➡ Works only for all-conjunctive all-equality searches
➡ Doesn’t make the index useful for other queries
‣ Most-selective first ☠ (bad)
“Order the columns according to the selectivity”
➡ Only valid to prioritize among multiple range conditions
16. Using Indexes:
Finding Bad Index Row-Locality
------------------------------------
| Id | Operation |
------------------------------------
| 0 | SELECT STATEMENT |
| 1 | TABLE ACCESS BY INDEX ROWID|
|* 2 | INDEX SKIP SCAN |
------------------------------------
Predicate Information:
------------------------------------
2 - access("B"=20 AND "A">25)
filter("B"=20)
Index on (A, B)
------------------------------------
| Id | Operation |
------------------------------------
| 0 | SELECT STATEMENT |
| 1 | TABLE ACCESS BY INDEX ROWID|
|* 2 | INDEX RANGE SCAN |
------------------------------------
Predicate Information:
------------------------------------
2 - access("B"=20 AND "A">25)
Index on (B, A)
Index on (B, A): most efficient solution
Index on (A, B): most efficient workaround
‣ Index filter predicates are a “bad smell”
‣ Index Skip Scan is a “bad smell”
‣ Index Fast Full Scan is a “bad smell”
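To look for these smells, the plan has to be displayed together with its predicate section. A minimal sketch, assuming a hypothetical table TBL with columns A and B as in the plans above (the table and index names are not from the slides):

```sql
-- Show the plan including the "Predicate Information" section.
EXPLAIN PLAN FOR
SELECT *
  FROM tbl            -- hypothetical table with columns A and B
 WHERE b = 20
   AND a > 25;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- If the index step shows INDEX SKIP SCAN or a filter() predicate,
-- an index with the equality column first is the likely remedy:
CREATE INDEX tbl_b_a ON tbl (b, a);
```

After creating the second index, re-running the two statements should show whether the optimizer switches to an INDEX RANGE SCAN with both conditions as access predicates.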
19. Using Indexes:
Trailing-Columns to Avoid Table-Access
Add all needed columns to the index to avoid table access.
The so-called index-only scan.
‣ Useful to nullify a bad clustering factor
Consequently, not very useful if
➡ Clustering factor close to the number of table blocks or
➡ Selecting only a few rows
‣ A single non-indexed column breaks it
No matter where it is mentioned (SELECT, ORDER BY,...)
➡ All or nothing: no benefit from adding some SELECT
columns to the index.
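A minimal sketch of an index-only scan, assuming a hypothetical ORDERS table with columns CUSTOMER_ID, ORDER_DT, TOTAL and STATUS (all names except ORDER_DT are assumptions made for this example):

```sql
-- Hypothetical query that needs only two columns:
--   SELECT order_dt, total
--     FROM orders
--    WHERE customer_id = :cust;

-- CUSTOMER_ID serves the WHERE clause; ORDER_DT and TOTAL are trailing
-- columns added only so that the table need not be visited.
CREATE INDEX orders_cust_covering
    ON orders (customer_id, order_dt, total);

-- Selecting any column that is not in the index (here: STATUS) brings
-- the table access back:
--   SELECT order_dt, total, status FROM orders WHERE customer_id = :cust;
```

The second query illustrates the all-or-nothing point: covering two of the three selected columns saves nothing, because every matching row still needs a table lookup.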
20. Using Indexes:
Trailing-Columns to Avoid Table-Access
Common mistakes:
‣ Selecting unneeded columns* ☠ (bad)
SELECT * anybody? ORM-tools in use? Hooray!
➡ Adding many columns to many indexes is a no-no.
‣ Pushing too hard ☠ (bad)
➡ Index gets bigger, clustering factor (CF) gets worse
➡ Small benefit for low CF or if selecting a few rows only
➡ You’ll hit the hard limits (32 columns, 6398 bytes at 8k block size)
* http://use-the-index-luke.com/blog/2013-08/its-not-about-the-star-stupid
24. Example:
List yesterday’s orders
2. Write query using explicit range conditions
1. Lower bound:
ORDER_DT >= TRUNC(sysdate-1)
2. Upper bound:
ORDER_DT < TRUNC(sysdate)
----------------------------------------------
| Id | Operation |
----------------------------------------------
| 0 | SELECT STATEMENT |
|* 1 | FILTER |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED |
|* 3 | INDEX RANGE SCAN |
----------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TRUNC(SYSDATE@!)>TRUNC(SYSDATE@!-1))
3 - access("ORDER_DT">=TRUNC(SYSDATE@!-1)
AND "ORDER_DT"<TRUNC(SYSDATE@!))
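Written out as a full statement, the two bounds above might look like this. The sketch assumes an ORDERS table with a plain index on ORDER_DT; the index name is hypothetical.

```sql
-- Assumes a plain index on the untouched column:
CREATE INDEX orders_dt ON orders (order_dt);

SELECT *
  FROM orders
 WHERE order_dt >= TRUNC(SYSDATE - 1)   -- lower bound: yesterday 00:00
   AND order_dt <  TRUNC(SYSDATE);      -- upper bound: today 00:00 (exclusive)
```

Both bounds are applied to the bare column, so they can appear as access predicates on the index, as in the plan above.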
25. Example:
List yesterday’s orders
Common anti-pattern:
‣TRUNC(order_dt)=:yesterday ☠ (bad)
This is an “obfuscation” of the actual intention
➡ Requires function-based index
CREATE INDEX … (TRUNC(order_dt));
➡ Doesn’t support ordering by order_dt
WHERE TRUNC(order_dt) = :yesterday
ORDER BY order_dt DESC;
Index not ordered by that
28. Example:
List yesterday’s orders reverse chronologically
2. Write query - exploit index order
1. Lower & upper bounds:
TRUNC(ORDER_DT) = TRUNC(sysdate)-1
2. Order:
ORDER BY ORDER_DT DESC
----------------------------------------------
| Id | Operation |
----------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | SORT ORDER BY |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED |
|* 3 | INDEX RANGE SCAN |
----------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ORDERS"."SYS_NC00004$"=TRUNC(SYSDATE@!-1))
Tradeoff: CPU, Memory, IO
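For contrast with the TRUNC variant above, here is a sketch of the same request written with explicit range conditions. It assumes a plain (not function-based) index on ORDERS (ORDER_DT); whether the optimizer actually picks the plan described in the comment depends on statistics and version.

```sql
-- Assumes a plain index on ORDERS (ORDER_DT).
-- The index is ordered by ORDER_DT, so the range can be read backwards
-- (INDEX RANGE SCAN DESCENDING) and no SORT ORDER BY step is required.
SELECT *
  FROM orders
 WHERE order_dt >= TRUNC(SYSDATE - 1)
   AND order_dt <  TRUNC(SYSDATE)
 ORDER BY order_dt DESC;
```

The function-based index on TRUNC(ORDER_DT) cannot deliver this order, because all rows of one day share the same index key.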
29. Example:
List orders from last 24 hours
1. Data-locality for the TRUNC variant
* http://www.sqlfail.com/2014/05/05/oracle-can-now-use-function-based-indexes-in-queries-without-functions/
30. Example:
List orders from last 24 hours
2. Write query using explicit range conditions
1. Lower bound:
ORDER_DT > sysdate - 1
2. Upper bound: none (unbounded)
To use the function-based index (FBI), Oracle adds (since 11.2.0.2*):
TRUNC(ORDER_DT)>=TRUNC(sysdate-1)
-------------------------------------------------
| Id | Operation |
-------------------------------------------------
| 0 | SELECT STATEMENT |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED |
|* 2 | INDEX RANGE SCAN on TRUNC(ORDER_DT) |
-------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ORDER_DT">SYSDATE@!-1)
2 - access("ORDERS"."SYS_NC00004$">=TRUNC(SYSDATE@!-1))
* http://www.sqlfail.com/2014/05/05/oracle-can-now-use-function-based-indexes-in-queries-without-functions/
32. Example:
List orders from last 24 hours
2. Write query using explicit range conditions
1. Lower bound:
ORDER_DT > sysdate - 1
2. Upper bound: none (unbounded)
--------------------------------------------
| Id | Operation |
--------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED|
|* 2 | INDEX RANGE SCAN |
--------------------------------------------
Predicate Information:
----------------------
2 - access("ORDER_DT">SYSDATE@!-1)
33. Example:
List orders from last 24 hours
--------------------------------------------
| Id | Operation |
--------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED|
|* 2 | INDEX RANGE SCAN |
--------------------------------------------
Predicate Information:
----------------------
2 - access("ORDER_DT">SYSDATE@!-1)
--------------------------------------------
| Id | Operation |
--------------------------------------------
| 0 | SELECT STATEMENT |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED|
|* 2 | INDEX RANGE SCAN |
--------------------------------------------
Predicate Information:
----------------------
1 - filter("ORDER_DT">SYSDATE@!-1)
2 - access("ORDERS"."SYS_NC00004$">=TRUNC(SYSDATE@!-1))
First plan (index on ORDER_DT): most efficient solution
Second plan (function-based index on TRUNC(ORDER_DT)): most efficient workaround
36. Example:
List 10 Most Recent Orders
2. Write query using explicit range conditions
1. Lower bound...? After 10 rows...???
2. Upper bound? sysdate? Unbounded!
38. Example:
List 10 Most Recent Orders
2. Write query using top-n based termination
3. Start with: most recent
ORDER BY ORDER_DT DESC
4. Stop after: 10 rows
FETCH FIRST 10 ROWS ONLY (since 12c)
----------------------------------------------------------
| Id | Operation | A-Rows | Buffers |
----------------------------------------------------------
| 0 | SELECT STATEMENT | 10 | 8 |
|* 1 | VIEW | 10 | 8 |
|* 2 | WINDOW NOSORT STOPKEY | 10 | 8 |
| 3 | TABLE ACCESS BY INDEX ROWID| 11 | 8 |
| 4 | INDEX FULL SCAN DESCENDING| 11 | 3 |
----------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
2 - filter(ROW_NUMBER() OVER (ORDER BY ORDER_DT DESC)<=10)
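The two steps above (start with the most recent row, stop after 10) could be written as follows. The sketch assumes an ORDERS table with an index on ORDER_DT; the pre-12c variant is added for comparison and is not taken from the slides.

```sql
-- Oracle 12c and later: row limiting clause.
SELECT *
  FROM orders
 ORDER BY order_dt DESC
 FETCH FIRST 10 ROWS ONLY;

-- Before 12c: ROWNUM filter around an ordered subquery.
SELECT *
  FROM (SELECT *
          FROM orders
         ORDER BY order_dt DESC)
 WHERE ROWNUM <= 10;

-- To see runtime figures such as A-Rows and Buffers, run the query with
-- the GATHER_PLAN_STATISTICS hint and then display the last cursor:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST'));
```

In both variants the interesting part of the plan is the STOPKEY operation: it shows that the database stops reading the index once enough rows have been delivered.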
41. Window-Functions for Top-N Termination
SELECT *
  FROM (
       SELECT orders.*
            , DENSE_RANK() OVER (
                ORDER BY TRUNC(order_dt) DESC
              ) rn
         FROM orders
       )
 WHERE rn <= 1
 ORDER BY order_dt DESC;
Select 1 group
Useful to abort on edges
---------------------------------------------------------------------------
| Id | Operation | E-Rows | A-Rows | Buffers | Reads |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2057 | 695 | 695 |
| 1 | SORT ORDER BY | 100K| 2057 | 695 | 695 |
|* 2 | VIEW | 100K| 2057 | 695 | 695 |
|* 3 | WINDOW NOSORT STOPKEY | 100K| 2057 | 695 | 695 |
| 4 | TABLE ACCESS BY INDEX ROWID| 100K| 2058 | 695 | 695 |
| 5 | INDEX FULL SCAN DESCENDING| 100K| 2058 | 8 | 8 |
---------------------------------------------------------------------------
DENSE_RANK
44. Window-Functions for Top-N Termination
SELECT *
  FROM orders
 WHERE TRUNC(order_dt)
     = (SELECT TRUNC(MAX(order_dt))
          FROM orders
       )
 ORDER BY order_dt;
46. Top-N vs. Max()-Subquery
Common mistakes:
‣ Breaking ties with sub-queries ☠ (bad)
WHERE (a, b)= (select max(a), max(b) ...)
➡ max() values coming from different rows...
➡ No rows selected.
‣ Selecting Nth largest ☠ (bad)
WHERE X < (SELECT MAX()...
   WHERE X < (SELECT MAX()...))
WHERE (N-1) = (SELECT COUNT(DISTINCT(DT))...
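The first mistake can be reproduced with two rows. This is a sketch on a hypothetical table T(A, B); the table and its data are assumptions made for illustration, and the second statement uses the 12c row limiting clause.

```sql
-- Hypothetical table T(a, b) holding the rows (1, 9) and (2, 3).
-- MAX(a) = 2 and MAX(b) = 9 come from different rows, so no row matches:
SELECT *
  FROM t
 WHERE (a, b) = (SELECT MAX(a), MAX(b) FROM t);   -- no rows selected

-- Top-n with a definite order returns exactly one row, here (2, 3):
SELECT *
  FROM t
 ORDER BY a DESC, b DESC
 FETCH FIRST 1 ROW ONLY;
```

The top-n form states the intent directly (the first row in a given order), so ties are broken by the ORDER BY clause instead of by two unrelated aggregates.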
49. Example:
List next 10 orders
2. Use explicit range condition & top-n abort
1. Lower bound: unbounded (top-n)
2. Upper bound: where we stopped
WHERE ORDER_DT < :prev_dt
3. ORDER BY ORDER_DT DESC
4. FETCH FIRST 10 ROWS ONLY
What about ties?
51. Explicit range conditions: the general case
Example:
List next 10 orders
1. Use definite sort order
2. Use Row-Value filter to remove what we have seen before (SQL:92)
3. Hit Enter
53. Explicit range conditions: the general case
Example:
List next 10 orders
(x,y) = (a,b) ✓
(x,y) IN ((a,b),(c,d)) ✓
(x,y) < (a,b) ✗ (Oracle limitation)
(x,y) > (a,b) ✗ (Oracle limitation)
54. Explicit range conditions: the general case
Example:
List next 10 orders
Oracle limitation
Two semantically equivalent workarounds:
X <= A
AND NOT(X=A AND Y>=B)
(X < A)
OR (X = A AND Y < B) ☠ No proper index use*
* http://use-the-index-luke.com/sql/partial-results/fetch-next-page#sb-equivalent-logic
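Applied to the "next 10 orders" example, the first workaround might look like the sketch below. ORDER_ID is an assumed unique column used only to make the sort order definite, :prev_id is an assumed bind variable alongside :prev_dt, and the index on (ORDER_DT, ORDER_ID) is likewise an assumption.

```sql
-- ORDER_ID is an assumed unique column that breaks ties on ORDER_DT;
-- :prev_dt and :prev_id are the values of the last row already shown.
-- Assumes an index on ORDERS (ORDER_DT, ORDER_ID).
SELECT *
  FROM orders
 WHERE order_dt <= :prev_dt
   AND NOT (order_dt = :prev_dt AND order_id >= :prev_id)
 ORDER BY order_dt DESC, order_id DESC
 FETCH FIRST 10 ROWS ONLY;
```

The condition is the Oracle-compatible spelling of (order_dt, order_id) < (:prev_dt, :prev_id). Its first line gives the optimizer a plain range condition on the leading index column, and the NOT(...) part only removes the rows that were already seen.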
55. Using OFFSET to fetch next rows
‣ SQL:2008 added FETCH FIRST ... ROWS ONLY; SQL:2011 then introduced OFFSET to skip rows.
‣ Rows can be skipped with the ROWNUM pseudo column too (ROWNUM > :x, applied to an aliased ROWNUM in an outer query)
‣ ROW_NUMBER() can do the trick too.
It doesn’t matter how you write it, ...
56. Using OFFSET to fetch next rows
OFFSET = SLEEP
The bigger the number,
the slower the execution.
Even worse: it eats up resources
and yields drifting results.
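For reference, an OFFSET query in Oracle 12c syntax could look like this. It is a sketch of the pattern the slide warns against, assuming the ORDERS table from the earlier examples.

```sql
-- Oracle 12c syntax: skip 20 rows, then return 10.
-- The 20 skipped rows still have to be fetched and discarded first.
SELECT *
  FROM orders
 ORDER BY order_dt DESC
 OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;
```

The work grows with the offset because every skipped row is read before it is thrown away. Rows inserted between two page requests also shift all later pages, which is the drifting the slide mentions.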
57. It’s All About Matching Queries to Indexes
Two steps to get the absolutely best access path:
1. Maximize data-locality
‣ Plain old B-tree index is the #1 tool for that
‣ Partitions are greatly overrated
‣ Table clusters are slightly underrated
2. Write the query to exploit it
‣ Use explicit range conditions
‣ Use top-n based termination
‣ Exploit index order
Thinking in Ordered Sets