This session looks at when and how query optimization takes place, the resources it uses, the role of index statistics, and common application query problems (other than simple missing indexes) that lead DBAs to assume there is a query optimization issue.
1. ISUG-TECH 2015 Conference: The Science of DBMS Query Optimization
Jeff Tallman, SAP ASE Product Management
(c) 2015 Independent SAP Technical User Group, Annual Conference, 2015
2. Agenda
Intro & Optimization Basics
- Basic optimization cost factors
- Procedure Cache (ASE)
Query Processing & Optimization
- Internals of QP
- Impact of LOP tree
- Understanding optimization vs. execution
Optimization Costing
- Histograms & column densities
- IN() & OR clauses
- Out-of-range histograms
- Joins & multi-column densities
Controlling Optimization
- sp_chgattribute 'opt concurrency threshold'
- sp_modifystats
- Resource granularity
3. Some Caveats
Query optimization is very vendor proprietary/confidential
- You can buy books on generic optimization techniques…
- …but DBMS vendors hire PhDs to develop implementations
  - Query performance often depends on how good the optimization is
  - This is a key difference between open-source and COTS DBMS packages
    - The strength of the query optimizer is largely due to the $$$ invested in the skills of highly educated staff
As a result, this session will NOT explain the secrets of ASE’s optimizer
- However, it will explain how it works, what influences it, what resources it uses, etc.
- Additionally, most modern optimizers use the same lava tree model
  - Query optimization is based on an upside-down tree with data spewing out the top
4. Goal of This Session
The goal of this session
- Help you understand the intricacies of query optimization
- Use that knowledge to write queries that can be optimized better
- Understand how/when additional index statistics might be necessary
- Understand how to influence optimization
  - Other than the usual index forcing, AQP plan clauses, etc.
- Tell whether the optimizer messed up…or your SQL did
Assumptions for this session
5. Rules-Based Optimization
Rules-based optimization
- Index selection and join order processing are based on specific rules
- For example:
  - Index selection is based on the index whose leading columns are most covered by query predicates
  - Join order follows the left-to-right ordering in the FROM clause, which designates the driving tables
The good, bad & ugly
- Very good for extremely volatile data in which histogram statistics are often stale/impossible
- Good for insert-intensive monotonic sequences in which new values are out of range of histograms
- Not so good…in fact sometimes ugly…on data that has any sort of skew with highly repetitive values
- The really ugly part is if the SQL coders don’t know the “rules”
6. Cost-Based Optimization
Used by all mainstream DBMSs
- Oracle, IBM DB2 UDB, MS SQL, ASE
Attempts to find the cheapest method to perform the query
- Uses some factoring of IO, CPU and memory
- Formula for cost varies among DBMSs
The key to costing is index/column histograms
- In a sense, histograms attempt to report the relative skew of the data being queried
- The optimizer’s goal is to find the cheapest access path considering the data skew
- If it weren’t for the histogram reporting the skew…rules-based optimization would be the only choice
7. Simple Cost Factors (1)
Physical IO
- This is pretty obvious – disks are slow.
- But we also need to predict how many writes (and then re-reads) we may need to do for intermediate results
Logical IO
- This is where PhDs are made
- Remember, at query optimization time, we don’t know what pages we are after…
- However, we need to determine how many LIOs we expect based on
  - How much of a table is already in cache
  - How often we may revisit the same pages for multiple rows
8. Simple Cost Factors (2)
Memory
- Besides LIO, memory can be used to cache query intermediate results such as subquery results, hash tables for HJ, etc.
- In addition, memory can be used to avoid writes – e.g. in-memory sorts for order by, sort merge joins, etc.
CPU
- Again, fairly basic – but every LIO requires CPU
  - We need to do the data comparison for non-index key predicates
  - Again, though, we really don’t know how fast the CPU is that we are on…and how awful the data comparisons will be
    - We might apply some fuzzy logic on LIKE ‘%pattern%’ on large varchars or something…but…
- Also basic – sorts require CPU as well
  - Distinct processing, Order by processing, etc.
9. Procedure Cache & Optimization
Optimization: one of the consumers of proc cache
- Index statistics are loaded into proc cache for each query optimization
  - Visible with set option show long
- Temporary work plans are created in proc cache
- Reported via set statistics resource on
- Total consumption is not a lot (rule of thumb = #engines * 2MB for OLTP)
Two big problems
- There is no ‘sharing’ of index statistics in proc cache
- Index statistics don’t stay in cache
  - As soon as query optimization for that query is finished, the proc buffers are deallocated.
  - This means a TON of logical IOs on sysstatistics
    - Unless you use a lot of fully prepared statements or stored procedures
  - Hence you really want to ensure you have a dedicated systables cache
- This is largely due to historical aspects
  - Remember, in 1984, 1MB of memory was a lot
  - Today, the sum of the index statistics is likely 256MB or less
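A minimal sketch of the dedicated system-table cache suggested above. The cache name `systables_cache` and the 256M size are assumptions for illustration, and demo_db is simply the database used elsewhere in this deck. sp_cacheconfig and sp_bindcache are standard ASE procedures, but check the restrictions for your version: binding a system table generally requires the database to be in single-user mode.

```sql
-- Hypothetical cache name and size: size it to hold your index statistics.
sp_cacheconfig 'systables_cache', '256M'
go
-- Run from the database that owns the table; binding system tables
-- generally requires that database to be in single-user mode.
use demo_db
go
sp_bindcache 'systables_cache', 'demo_db', 'sysstatistics'
go
```

The point of the binding is only to keep the sysstatistics pages from competing with user data for cache, since the statistics are re-read for each optimization.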
10. Loading Stats & Proc Cache Usage
Creating Initial Statistics for table aqi_locations l
.....Done creating Initial Statistics for table aqi_locations l
Creating Initial Statistics for table aqi_samples s
.....Done creating Initial Statistics for table aqi_samples s
Creating Initial Statistics for index aqi_locations_PK
.....Done creating Initial Statistics for index aqi_locations_PK
…
Phase 2b initialization of OptBlock0 ...
... phase 2b done.
Start merging statistics for table aqi_locations l
..... Done merging statistics for table aqi_locations l
Start merging statistics for table aqi_samples s
..... Done merging statistics for table aqi_samples s
…
Total estimated I/O cost for statement 1 (at line 1): 33926.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=126),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14
proccache=23 proccache hwm=28 tempdb hwm=2)
Private buffer count: 48,Private HWM buffer count: 48
use demo_db
go
set statement_cache off
set switch on 3604
set option show long
set statistics time, io, resource, plancost on
set showplan on
go
select l.city, l.county, s.sample_date, s.air_temp
from aqi_locations l, aqi_samples s
where l.location_id=s.location_id
and s.sample_date = 'July 1 2000 12:00:00:000PM'
and l.state='PA'
and s.weather='Overcast'
and s.air_temp = 90
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Loading stats
Compile time proc cache usage for stats & work plans
126 proc pages * 2k memory page = 252KB
11. QUERY PROCESSING & OPTIMIZATION
Internals, LOP Trees & Execution
12. QP Phases
Receive buffer
SQL Parsing
Query Normalization
- Resolves object IDs
- Replaces system functions and functions on literals with literal values
- Rearranges AND/OR according to precedence
Pre-Processing
- Transforms subqueries
- Rearranges aggregates
- Creates Logical Operators (LOP)
Query Optimization
Query Execution
TDSLANG select * from table where due_dt =getdate() and recv_date is null
SELECT {column list}
FROM table
COND1 due_dt <=getdate()
COND2 (AND) recv_date is null
SELECT {column IDs & datatypes}
FROM objid=123456
COND1 col_id=3 (dt) >= (dt) 'Jan 1 2015'
COND2 (AND) col_id=4 (dt) IS NULL
Receive Buffer
SQL Parsing
Normalization
Pre-Processing
Query Optimization
Query Execution
Focus
13. Some Notes on Wait Events
Believe it or not…
- Until the execution phase, all the rest counts as ‘awaiting command’ in sp_who or WaitEventID=250 in monProcessWaits
- It kinda makes sense…until the query is executing…it isn’t executing…
- …but parsing, compiling & optimization can all use considerable CPU time
  - Sooo…that is why set statistics time on reports compile time separately
Sooo…if you see ‘awaiting command’ a lot…
- See if packets received are increasing
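The check described above can be sketched against the MDA tables. This assumes monitoring is enabled (for example the 'enable monitoring', 'wait event timing' and 'process wait events' configuration parameters) and that the login has mon_role. Verify the column names against your ASE version before relying on it.

```sql
-- Time each session has accumulated in WaitEventID 250 ('awaiting command'),
-- alongside its network packet counters.
select w.SPID, w.Waits, w.WaitTime, n.PacketsReceived, n.BytesReceived
from master..monProcessWaits w, master..monProcessNetIO n
where w.SPID = n.SPID
and w.KPID = n.KPID
and w.WaitEventID = 250
order by w.WaitTime desc
go
```

Run it twice a few seconds apart. If PacketsReceived keeps climbing for a SPID that shows as ‘awaiting command’, that suggests the session is sending work that is being parsed, compiled or optimized rather than sitting idle.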
14. Optimization Starts with LOP Tree
During the pre-processing phase, a LOP tree is created
- A high-level tree of the logical operations that represent the relations between the entities
- Often, the LOP tree is the first place where optimization starts to go wrong…due to bad query formation by developers
Use ‘set option show on’ to see the LOP tree
- It will be near the very top of the output
- You will need trace 3604 enabled
During execution, a physical operator (Pop) is used
- Lop Join
- Pop NLJoin
15. Example Query
use demo_db
go
set option show on
set switch on 3604
set statistics plancost, time, resource, io on
set showplan on
set statement_cache off -- avoid rerunning goofy plans from previous run
set nodata on -- don’t return results (avoids network time/scrolling of large results)
go
select l.county, avg(s.air_temp)
from aqi_locations l,
aqi_samples s
where l.location_id=s.location_id
and s.sample_date between 'July 1 2000 00:01am' and 'July 31 2000 23:59:59'
and state='PA'
group by l.county
go
set option show off
set switch off 3604
set statistics plancost, time, resource, io off
set showplan off
--set statement_cache off
go
16. Example LOP Tree
1> select l.county, avg(s.air_temp)
2> from aqi_locations l,
3> aqi_samples s
4> where l.location_id=s.location_id
5> and s.sample_date between 'July 1 2000 00:01am' and 'July 31 2000 23:59:59'
6> and state='PA'
7> group by l.county
The Lop tree:
( project
( group
( join
( scan aqi_locations
)
( scan aqi_samples
)
)
)
)
17. LOP Tree & OptBlocks
Each LOP tree level becomes an OptBlock
- Outermost block (0) is one below (project)
- Each block will generally have a relational operator
  - Join, group, scalar, etc.
  - Scan is only considered an operator if the query only has one entity and no other operators
Optimizer will determine an optimal plan for that block
- ASE set option show will print optimization for each optblock
- The optblock list is also printed at
The Lop tree:
( project
  ( group                      <- OptBlock0
    ( join                     <- OptBlock1
      ( scan aqi_locations
      )
      ( scan aqi_samples
      )
    )
  )
)
18. Example OptBlock
The Lop tree:
…
OptBlock1
The Lop tree:
( join
( scan aqi_locations
)
( scan aqi_samples
)
)
Generic Tables: ( Gtt1( aqi_locations l ) Gtt2( aqi_samples s ) Gti3( aqi_locations_PK ) …
Generic Columns: ( Gc0(aqi_locations l ,Rid) Gc1(aqi_locations l ,state) Gc2(aqi_locations l ,location_id) …
Predicates: ( { aqi_samples s.sample_date} >= "Jul 1 2000 12:01AM" tc:{5} …
Transitive Closures: ( Tc0 = { Gc0(aqi_locations l ,Rid)} …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gtg0 )
Generic Columns: ( Gc8(Gtg0 ,_gcelement_8) Gc9(Gtg0 ,_gcelement_9) Gc10(Gtg0 ,_gcelement_10) …
Predicates: ( )
Transitive Closures: ( Tc7 = { Gc8(Gtg0 ,_gcelement_8) Gc12(Gtg0 ,_virtualagg) …
19. If you have any doubts
If your index is being considered…
- It will be listed in Generic Tables with a Gti prefix
  - Format is <tablelist>, <indexlist>
- Example:
  - Generic Tables: ( Gtt1( aqi_locations l ) Gtt2( aqi_samples s ) Gti3( aqi_locations_PK ) Gti4( city_state_idx ) Gti5( county_state_idx ) Gti6( aqi_samples_PK ) Gti7( aqi_weather_date_idx ) )
If your where clause is being considered…
- It will be listed in Predicates
- Example:
  - Predicates: ( { aqi_samples s.sample_date} >= "Jul 1 2000 12:01AM" tc:{5} { aqi_samples s.sample_date} <= "Jul 31 2000 11:59PM" tc:{5} { aqi_locations l.state} = 'PA' tc:{1} )
20. To find optimization details
Look for optblock begin/end section markers in output
- Begin
******************************************************************************
BEGIN: Search Space Traversal for OptBlock1
******************************************************************************
- End
******************************************************************************
DONE: Search Space Traversal for OptBlock1
******************************************************************************
Any section could be fairly lengthy
- The key is to find the optblock where you think the problem is…
21. The LOP role…a tale of two queries
The setup:
select *
into tempdb..my_objects
from sybsystemprocs..sysobjects
create index type_date_idx
on tempdb..my_objects (type, crdate)
“Good” Query:
declare @type char(2)
select @type='P'
select @type, max(crdate)
from tempdb..my_objects
where type=@type
“Bad” Query:
declare @type char(2)
select @type='P'
select type, max(crdate)
from tempdb..my_objects
where type=@type
group by type
22. The showplans…and final IO costs
QUERY PLAN FOR STATEMENT 2 (at line 9).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |SCALAR AGGREGATE Operator (VA = 1)
| | Evaluate Ungrouped MAXIMUM AGGREGATE.
| | Scanning only up to the first qualifying row.
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | my_objects
| | | Index : type_date_idx
| | | Backward scan.
| | | Positioning by key.
| | | Index contains all needed columns. Base table will not be read.
| | | Keys are:
| | | type ASC
| | | Using I/O Size 4 Kbytes for index leaf pages.
| | | With LRU Buffer Replacement Strategy for index leaf pages.
Total estimated I/O cost for statement 2 (at line 9): 54.
…
Table: my_objects scan count 1, logical reads: (regular=2 apf=0 total=2),
physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 4.
“Good” Query Plan & Cost:
QUERY PLAN FOR STATEMENT 2 (at line 9).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
3 operator(s) under root
|ROOT:EMIT Operator (VA = 3)
|
| |RESTRICT Operator (VA = 2)(0)(0)(0)(4)(0)
| |
| | |GROUP SORTED Operator (VA = 1)
| | | Evaluate Grouped MAXIMUM AGGREGATE.
| | |
| | | |SCAN Operator (VA = 0)
| | | | FROM TABLE
| | | | my_objects
| | | | Index : type_date_idx
| | | | Forward Scan.
| | | | Positioning by key.
| | | | Index contains all needed columns. Base table will not be read.
| | | | Keys are:
| | | | type ASC
| | | | Using I/O Size 4 Kbytes for index leaf pages.
| | | | With LRU Buffer Replacement Strategy for index leaf pages.
Total estimated I/O cost for statement 2 (at line 9): 360.
…
Table: my_objects scan count 1, logical reads: (regular=4 apf=0 total=4),
physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 8.
“Bad” Query Plan & Cost:
24. The actual LOP trees
The Lop tree:
( project
( scalar
( scan my_objects
)
)
)
OptBlock1
The Lop tree:
( scan my_objects
)
OptBlock0
The Lop tree:
( pseudoscan
)
“Good” Query LOP tree:
The Lop tree:
( project
( group
( scan my_objects
)
)
)
OptBlock1
The Lop tree:
( scan my_objects
)
OptBlock0
The Lop tree:
( pseudoscan
)
“Bad” Query LOP tree:
25. The Lesson
The LOP can influence optimization and final costs
- Try to use operators that are lighter weight (e.g. scalar vs. group by)
- In this case, we knew the @type up front…
  - Re-selecting it in the ‘group by’ variant is duplicative/redundant
  - Literals and @vars are scalars, whereas group by is a vector
Execution can play a role as well
- We saw in this example, in the scalar variant, that the optimizer can limit the rows to be scanned
| |SCALAR AGGREGATE Operator (VA = 1)
| | Evaluate Ungrouped MAXIMUM AGGREGATE.
| | Scanning only up to the first qualifying row.
- Execution can also short-circuit based on certain
26. Optimization vs. Execution (1)
Optimizer gets a lot of blame for things it is not involved in
Example:
- Customer on SCN whines about table scan due to optimizer ‘bug’ on the following example query
Select * from sysobjects
Where id=8 OR 1=2
- Customer “thinks” optimizer should simply use the index
What do you think the real problem is and why???
27. Let’s start simple (1)
1> select count(*) from sysobjects plan '(t_scan sysobjects)'
QUERY PLAN FOR STATEMENT 1 (at line 1).
Optimized using Serial Mode
Optimized using the Abstract Plan in the PLAN clause.
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |SCALAR AGGREGATE Operator (VA = 1)
| | Evaluate Ungrouped COUNT AGGREGATE.
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | sysobjects
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 32 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
Total estimated I/O cost for statement 1 (at line 1): 414.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
-----------
702
Let’s force a table scan just to see how many LIOs it takes
28. Let’s start simple (2)
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=57),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=6 proccache=7 proccache hwm=7 tempdb hwm=0)
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 1)
r:1 er:1
cpu: 0
/
TableScan
sysobjects
(VA = 0)
r:702 er:702
l:26 el:26
p:0 ep:4
============================================================
Table: sysobjects scan count 1, logical reads: (regular=26 apf=0 total=26), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 52.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
The answer is 26…remember that
29. A simple false expression (1)
1> select * from sysobjects where 1=2
QUERY PLAN FOR STATEMENT 1 (at line 1).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |RESTRICT Operator (VA = 1)(4)(0)(0)(0)(0)
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | sysobjects
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 4 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
Total estimated I/O cost for statement 1 (at line 1): 237.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
We are still going to do a table scan…
30. A simple false expression (2)
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=69),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14 proccache=15 proccache hwm=15 tempdb hwm=0)
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:0 er:702
cpu: 0
/
Restrict
(4)(0)(0)(0)(0)
(VA = 1)
r:0 er:702
/
TableScan
sysobjects
(VA = 0)
r:0 er:702
l:0 el:1
p:0 ep:1
============================================================
Table: sysobjects scan count 0, logical reads: (regular=0 apf=0 total=0), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 0.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
(0 rows affected)
What happened to our 26 IOs???
31. Digging a Bit Deeper (1)
1> select * from sysobjects where 1=2
2>
The Lop tree:
( project
( scan sysobjects
)
)
OptBlock0
The Lop tree:
( scan sysobjects
)
Generic Tables: ( Gtt0( sysobjects ) )
Generic Columns: …
Predicates: ( 1=2)
Transitive Closures: …
We do see the expression…but notice there is no index listed in Generic Tables…
…and notice that the predicate listed doesn’t have a condition number (tc:{#})…
32. Digging a Bit Deeper (2)
******************************************************************************
BEGIN: Search Space Traversal for OptBlock0
******************************************************************************
Scan plans selected for this optblock:
Statistics for rows returned to client...
Estimated rows :702 Estimated row width :239.5
Estimated client cost is :132.95
Estimating selectivity for table 'sysobjects'
Table scan cost is 702 rows, 21 pages,
Cost adjusted for Fastfirstrow goal, Adjustment ratio 0.001424501
Adjusted Table scan cost is 1 rows, 21 pages,
The table (Datarows) has 702 rows, 21 pages,
Data Page Cluster Ratio 0.9999900
Search argument selectivity is 1.
using table prefetch (size 32K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
in data cache 'default data cache' (cacheid 0) with LRU replacement
OptBlock0 Eqc{0} -> Pops added:
( PopTabScan sysobjects ) cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) order: none
The best plan found in OptBlock0 :
( PopTabScan cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) props: [{}] Gtt0( sysobjects ) )
cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) order: none
Hmmm…no indexes looked at…
33. Let’s Try Something Close (1)
1> select * from sysobjects where id=8 and 1=2
QUERY PLAN FOR STATEMENT 1 (at line 1).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |RESTRICT Operator (VA = 1)(4)(0)(0)(0)(0)
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | sysobjects
| | | Using Clustered Index.
| | | Index : csysobjects
| | | Forward Scan.
| | | Positioning by key.
| | | Keys are:
| | | id ASC
| | | Using I/O Size 4 Kbytes for index leaf pages.
| | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | Using I/O Size 4 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
Total estimated I/O cost for statement 1 (at line 1): 81.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
Heyyy!!!! We used an index…even with a FALSE expression…
34. Let’s Try Something Close (2)
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=69),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14 proccache=17 proccache hwm=17 tempdb hwm=0)
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:0 er:71
cpu: 0
/
Restrict
(4)(0)(0)(0)(0)
(VA = 1)
r:0 er:71
/
IndexScan
csysobjects
(VA = 0)
r:0 er:71
l:0 el:3
p:0 ep:3
============================================================
Table: sysobjects scan count 0, logical reads: (regular=0 apf=0 total=0), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 0.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
(0 rows affected)
…but we *STILL* didn’t do any LIOs…how is that???
35. Let’s Try Something Close (3)
1> select * from sysobjects where id=8 and 1=2
2>
3>
The Lop tree:
( project
( scan sysobjects
)
)
OptBlock0
The Lop tree:
( scan sysobjects
)
Generic Tables: ( Gtt0( sysobjects ) Gti1( csysobjects ) )
Generic Columns: …
Predicates: ( { sysobjects.id } = 8 tc:{25} 1=2)
Transitive Closures: …
…We now have an index to look at as well as a predicate with a tc:{#}…the label applies to the condition that precedes it.
36. Let’s Try Something Close (4)
******************************************************************************
BEGIN: Search Space Traversal for OptBlock0
******************************************************************************
Scan plans selected for this optblock:
Statistics for rows returned to client...
Estimated rows :70.2 Estimated row width :239.5
Estimated client cost is :14.7343
Scan on table sysobjects skipped because table scan less than concurrency threshold
Scan on table sysobjects skipped because table scan less than concurrency threshold
Beginning selection of qualifying indexes for table 'sysobjects',
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 8
Estimated selectivity for id,
selectivity = 0.1,
scan selectivity 0.001424501, filter selectivity 0.001424501
restricted selectivity 0.1
Cost adjusted for Fastfirstrow goal, Adjustment ratio 0.01424501
unique index with all keys, one row scans
1 rows, 1 pages
Adjustment ratio 0.01424501 applied gives 0.01424501 rows, 1 pages
Data Row Cluster Ratio 0.06314244
Index Page Cluster Ratio 0.99999
Data Page Cluster Ratio 0.2469512
using no index prefetch (size 4K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
Yep, we evaluated the index
37. Let’s Try Something Close (5)
******************************************************************************
BEGIN: Search Space Traversal for OptBlock0
******************************************************************************
…
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'csysobjects' on table 'sysobjects' = 1
OptBlock0 Eqc{0} -> Pops added:
( PopRidJoin ( PopIndScan csysobjects sysobjects ) ) cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) order: none
The best plan found in OptBlock0 :
( PopRidJoin cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) props: [{}] ( PopIndScan cost:54.09999 T(L2,P2,C1) O(L2,P2,C1)
props: [{}] Gti1( csysobjects ) Gtt0( sysobjects ) ) cost:54.09999 T(L2,P2,C1) O(L2,P2,C1) order: none
) cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) order: none
******************************************************************************
DONE: Search Space Traversal for OptBlock0
******************************************************************************
…and that was about it….so we go with the index
38. Understanding what happened
Query optimizer optimizes…not executes
- Expression evaluation happens during execution time
- Soooo…1=2 is not even looked at by the optimizer
  - Both sides are literals, and the optimizer skips this as a literal expression that cannot be optimized
Query execution can ‘short circuit’
- Obviously false expressions
- N-ary Nested Loop Joins
- …
39. Soo…What about Our Query?
Our Example:
Select * from sysobjects
Where id=8 OR 1=2
What happens
- Optimizer evaluates index on id=8
- Optimizer sees OR clause
  - …opposite side of the OR clause is an unoptimizable expression which could be *anything* (e.g. an unindexed param like type='U')
  - Since it could be anything, the OR clause means table scan
- Since we have to table scan for the OR’d condition…
  - No sense in using the index for id=8…we will just hit those rows on the way by doing the OR clause
40. Why did I bring that up???
Have you ever done this in a stored proc???
Select…
from tableA, …
where …
and (((@var1=1) and (colA='value'))
or ((@var1=2) and (colB='value'))
)
Or worse yet…
Select…
from tableA, …
where …
and (((@var1=1) and (colA='value'))
or ((@var1=2) and (colB='value'))
)
I have…ooops…
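One common way to avoid this pattern is to branch in the procedure so each statement carries only the predicate it needs. The sketch below reuses the placeholder names from the slide (tableA, colA, colB, @var1); the procedure name is hypothetical, and this is an illustration rather than necessarily the fix the session goes on to recommend.

```sql
create procedure get_rows_by_mode   -- hypothetical procedure name
    @var1 int
as
begin
    if @var1 = 1
        -- Only the colA predicate reaches the optimizer, so an index on colA can be costed.
        select * from tableA where colA = 'value'
    else if @var1 = 2
        -- Likewise for colB; there is no OR against an unoptimizable expression.
        select * from tableA where colB = 'value'
end
go
```

The @var1 test moves out of the WHERE clause and into procedural logic, so neither statement contains an OR whose other side the optimizer cannot evaluate.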
41. A more complicated example
INSERT INTO #temp (...)
SELECT DISTINCT ...
FROM
MYDBNAME..TABLE_A A
, MYDBNAME..TABLE_B B
, MYDBNAME..TABLE_C C
, MYDBNAME..TABLE_D D
, MYDBNAME..TABLE_E E
, MYDBNAME..TABLE_F F
, MYDBNAME..TABLE_G G
, MYDBNAME..TABLE_H H
WHERE
A.COLUMN_1 = @VARIABLE_1
AND A.COLUMN_2 = @VARIABLE_2
AND A.COLUMN_3 = IsNull(@VARIABLE_3,A.COLUMN_3)
AND A.COLUMN_4 = IsNull(@VARIABLE_4,A.COLUMN_4)
AND A.COLUMN_5 = IsNull(@VARIABLE_5,A.COLUMN_5)
...
AND A.COLUMN_6 BETWEEN @VARIABLE_6 AND @VARIABLE_7
...
ORDER BY ...
Customer is trying to avoid writing IF/ELSE logic for different conditions/variables being passed in…if @VARIABLE_3 through @VARIABLE_5 are set, the intent would be that they would be used as SARGs…but if not set, then the predicate is a no-op as the column is compared to itself…
42. Simplifying (1)
use demo_db
go
set statement_cache off
set switch on 3604
set option show on
set statistics time, io, resource, plancost on
set showplan on
go
declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime
select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
--select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select count(*)
from aqi_samples
where sample_date between @bDate and @eDate
and air_temp=isnull(@air_temp,air_temp)
and weather=isnull(@weather,weather)
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Table has 168M rows with an index on {sample_date, air_temp, weather}
…first run with nulls for the last two index keys
43. Simplifying (2)
The Lop tree:
( project
( scalar
( scan aqi_samples
)
)
)
OptBlock1
The Lop tree:
( scan aqi_samples
)
Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) )
Generic Columns: …
Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3} )
Transitive Closures: …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gta0 )
Generic Columns: …
Predicates: ( )
Transitive Closures: …
The BETWEEN clause is the only predicate passed to the optimizer…
not much of a surprise since, with the NULLs, we are
expecting no-ops on air_temp and weather.
Note that since we don’t know the values of the @vars at
compile time, we use a default date (Jan 1 1900) here
44. Simplifying (3)
Total estimated I/O cost for statement 3 (at line 4): 17133977.
==================== Lava Operator Tree ====================
Emit
(VA = 3)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 2)
r:1 er:1
cpu: 400
/
Restrict
(0)(0)(0)(11)(0)
(VA = 1)
r:1.303e+006 er:4.202e+007
/
IndexScan
aqi_weather_date
(VA = 0)
r:1.303e+006 er:4.202e+007
l:1969 el:63590
p:251 ep:8005
============================================================
Table: aqi_samples scan count 1, logical reads: (regular=1969 apf=0 total=1969), physical reads: (regular=8 apf=243 total=251), apf IOs used=243
Total actual I/O cost for this command: 10213.
Total writes for this command: 0
Execution Time 4.
Adaptive Server cpu time: 417 ms. Adaptive Server elapsed time: 417 ms.
Our total IO estimate is 17M+….Our estimated
rows (from IndexScan: er 4.202e+007 vs. r 1.303e+006)
are off by ~30x….which is bad…
45. Simplifying - Rerun (1)
use demo_db
go
set statement_cache off
set switch on 3604
set option show on
set statistics time, io, resource, plancost on
set showplan on
go
declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime
--select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select count(*)
from aqi_samples
where sample_date between @bDate and @eDate
and air_temp=isnull(@air_temp,air_temp)
and weather=isnull(@weather,weather)
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Table has 168M rows with an index on
{sample_date, air_temp, weather}
…second run with values for the last two index keys
46. Simplifying - Rerun (2)
The Lop tree:
( project
( scalar
( scan aqi_samples
)
)
)
OptBlock1
The Lop tree:
( scan aqi_samples
)
Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) )
Generic Columns: …
Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3} )
Transitive Closures: …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gta0 )
Generic Columns: …
Predicates: ( )
Transitive Closures: …
The BETWEEN clause is still the only predicate passed
to the optimizer… which means this fails as a coding
style
47. Simplifying - Rerun (3)
Total estimated I/O cost for statement 3 (at line 4): 17133977.
==================== Lava Operator Tree ====================
Emit
(VA = 3)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 2)
r:1 er:1
cpu: 300
/
Restrict
(0)(0)(0)(11)(0)
(VA = 1)
r:0 er:4.202e+007
/
IndexScan
aqi_weather_date
(VA = 0)
r:1.303e+006 er:4.202e+007
l:1969 el:63590
p:0 ep:8005
============================================================
Table: aqi_samples scan count 1, logical reads: (regular=1969 apf=0 total=1969), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 3938.
Total writes for this command: 0
Execution Time 3.
Adaptive Server cpu time: 309 ms. Adaptive Server elapsed time: 309 ms.
We get the same estimates for total IO (17M)
and in the bottom node, but the Restrict filters
out non-qualifying rows – so we get 0….and
finish ~100ms faster…the faster execution might
make a developer think it worked. However, we
do the same amount of work (1969 LIOs), so the
faster execution is just the reduction in
ScalarAgg work (which it is: cpu 300 vs. 400)
due to fewer rows to count.
48. Simplifying - Correct (1)
use demo_db
go
set statement_cache off
set switch on 3604
set option show on
set statistics time, io, resource, plancost on
set showplan on
go
declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime
--select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select count(*)
from aqi_samples
where sample_date between @bDate and @eDate
and air_temp=@air_temp
and weather=@weather
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Table has 168M rows with an index on
{sample_date, air_temp, weather}
…third run with the way it should be…
49. Simplifying - Correct (2)
The Lop tree:
( project
( scalar
( scan aqi_samples
)
)
)
OptBlock1
The Lop tree:
( scan aqi_samples
)
Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) )
Generic Columns: …
Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3}
{ aqi_samples.air_temp} = 0 tc:{2} { aqi_samples.weather} = ' tc:{1} )
Transitive Closures: …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gta0 )
Generic Columns: …
Predicates: ( )
Transitive Closures: …
We now have all 3 predicates…since we still
have @vars with unknown values, we substitute
a 0 for int/smallint and '' (empty string) for
varchar/char
50. Simplifying - Correct (3)
Total estimated I/O cost for statement 3 (at line 4): 227844.
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 1)
r:1 er:1
cpu: 0
/
IndexScan
aqi_weather_date
(VA = 0)
r:0 er:450006
l:306 el:1307
p:0 ep:165
============================================================
Table: aqi_samples scan count 1, logical reads: (regular=306 apf=0 total=306), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 612.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 1 ms. Adaptive Server elapsed time: 1 ms.
Total estimated IO is 228K (vs. 17M) and the
estimated rowcount is TONS less (450K vs. 42M)…still off, but
likely due to data skew and not knowing the values
of the @vars…. And we only do ~300 LIO (306) vs.
1969….and we finish ~300x faster (1 ms vs. 309 ms)
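One way to keep the explicit-predicate form of the third run without hand-writing an IF/ELSE branch per combination is to have the application add a predicate only when a value was actually supplied. The Python sketch below is an illustration and not part of the original slides: `build_aqi_count_query` is a hypothetical helper, and the `?` parameter markers are an assumption (use whatever marker style your driver expects).

```python
def build_aqi_count_query(b_date, e_date, air_temp=None, weather=None):
    """Return (sql, params) with an equality predicate only for supplied values.

    Hypothetical helper: optional filters that are None are left out of the
    WHERE clause entirely instead of being wrapped in isnull(@var, column).
    """
    clauses = ["sample_date between ? and ?"]
    params = [b_date, e_date]
    if air_temp is not None:
        clauses.append("air_temp = ?")
        params.append(air_temp)
    if weather is not None:
        clauses.append("weather = ?")
        params.append(weather)
    sql = "select count(*) from aqi_samples where " + " and ".join(clauses)
    return sql, params


if __name__ == "__main__":
    # Date-only search, then the fully specified search from the third run.
    print(build_aqi_count_query("July 1 2000 00:00:01", "July 31 2000 23:59:59"))
    print(build_aqi_count_query("July 1 2000 00:00:01", "July 31 2000 23:59:59",
                                air_temp=80, weather="sunny"))
```

Each combination of supplied filters produces its own statement text, so the optimizer should see plain equality predicates whenever a value is present, as in the "Correct" run.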
51. Index Keys: The Query
SELECT SUM( T_00 ."MBGBTR" )
FROM "COEP" T_00
INNER JOIN "COBK" T_01
ON T_01 ."KOKRS" = ?
AND T_01 ."BELNR" = T_00 ."BELNR"
WHERE T_00 ."MANDT" = ?
AND T_00 ."LEDNR" = ?
AND T_00 ."OBJNR" = ?
AND ( T_00 ."KSTAR" BETWEEN ? AND ? OR T_00 ."KSTAR" IN ( ? , ? , ? , ? ) )
AND T_01 ."AWTYP" = ?
/* R3:ZVDESR121:558 T:COEP M:400 */
index_name   index_keys                                                        index_description
COEP~0       MANDT, KOKRS, BELNR, BUZEI                                        nonclustered, unique
COEP~1       MANDT, LEDNR, OBJNR, GJAHR, WRTTP, VERSN, KSTAR, HRKFT, PERIO,
             VRGNG, PAROB, USPOB, VBUND, PARGB, BEKNZ, TWAER                   nonclustered
COEP~Z02     MANDT, KOKRS, BUKRS, OBJNR                                        nonclustered
COEP_BDLS0   MANDT, LOGSYSO                                                    nonclustered
COEP~4       MANDT, TIMESTMP, OBJNR                                            nonclustered
COEP~Z03     MANDT, LEDNR, OBJNR, KSTAR                                        nonclustered
COEP~Z05     MANDT, OBJNR, KSTAR, GJAHR, PERIO, PAROB1, WRTTP                  nonclustered
COEP~Zt1     MANDT, LEDNR, OBJNR, KSTAR                                        nonclustered
52. Index Keys - Bad Index Access
|ROOT:EMIT Operator (VA = 5)
|
| |SCALAR AGGREGATE Operator (VA = 4)
| | Evaluate Ungrouped SUM OR AVERAGE AGGREGATE.
| |
| | |NESTED LOOP JOIN Operator (VA = 3) (Join Type: Inner Join)
| | |
| | | |RESTRICT Operator (VA = 1)(0)(0)(0)(4)(0)
| | | |
| | | | |SCAN Operator (VA = 0)
| | | | | FROM TABLE
| | | | | COEP
| | | | | T_00
| | | | | Index : COEP~4
| | | | | Forward Scan.
| | | | | Positioning by key.
| | | | | Keys are:
| | | | | MANDT ASC
| | | | | Using I/O Size 128 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | | Using I/O Size 128 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| | |
| | | |SCAN Operator (VA = 2)
| | | | FROM TABLE
| | | | COBK
| | | | T_01
| | | | Index : COBK~Zt1
| | | | Forward Scan.
| | | | Positioning at index start.
| | | | Index contains all needed columns. Base table will not be read.
| | | | Using I/O Size 16 Kbytes for index leaf pages.
| | | | With LRU Buffer Replacement Strategy for index leaf pages.
53. OPTIMIZATION COSTING (PART 1)
Histograms, Column Densities, IN(), Out of Range Histograms…
54. Histograms
The key to cost-based optimization
q Really is a distribution of data skew
ü If data was evenly distributed, we wouldn’t need histograms at all
q Mostly used for range scans
q Can be used for equisargs if data is highly skewed…as most is
The basics
q Frequency cells
q Range cells
Statistics for column: "type"
Last update of column statistics: Feb 15 2015 9:18:32:850PM
Range cell density: 0.0053191489361702
Total density: 0.4216274332277049
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0053191489361702
Unique total values: 0.2000000000000000
Average column width: default used (2.00)
Rows scanned: 188.0000000000000000
Statistics version: 4
Histogram for column: "type"
Column datatype: char(2)
Requested step count: 20
Actual step count: 9
Sampling Percent: 0
Tuning Factor: 20
Out of range Histogram Adjustment is DEFAULT.
Low Domain Hashing.
Step Weight Value
1 0.00000000 <= "EJ"
2 0.00531915 < "P "
3 0.10638298 = "P "
4 0.00000000 < "S "
5 0.30319148 = "S "
6 0.00000000 < "U "
7 0.56382978 = "U "
8 0.00000000 < "V "
9 0.02127660 = "V "
(Callouts on the histogram: the "<" / "<=" steps are Range Cells; the "=" steps are Frequency Cells)
55. How Many Steps Do We Need
Fewer = better for resource usage and time to find steps
More = better for optimization accuracy
q Ideally, you want most range scans to be in a single cell
ü Multiple cells means aggregating stats…may be accurate, but takes longer
ü For example, for datetime columns, see if cells cover the common query range (week, month, year, ….)
Hard to near impossible to control to semantic boundaries
q Increasing the number of steps may give better estimates with high skew
56. Example Date Histogram
Histogram for column: "sample_date"
Column datatype: datetime
Requested step count: 100
Actual step count: 103
Sampling Percent: 0
Tuning Factor: 20
Out of range Histogram Adjustment is DEFAULT.
Sticky step count.
Sticky hashing.
Step Weight Value
1 0.00000000 <= "Jan 1 1993 11:59:59:996AM"
2 0.01017933 <= "Feb 13 1993 12:00:00:000PM"
3 0.00763450 <= "Mar 18 1993 12:00:00:000PM"
4 0.01018039 <= "May 1 1993 12:00:00:000PM"
5 0.00766925 <= "Jun 3 1993 12:00:00:000PM"
6 0.00777507 <= "Jul 6 1993 12:00:00:000PM"
7 0.00825124 <= "Aug 8 1993 12:00:00:000PM"
8 0.00816318 <= "Sep 10 1993 12:00:00:000PM"
9 0.00796063 <= "Oct 13 1993 12:00:00:000PM"
10 0.00795876 <= "Nov 15 1993 12:00:00:000PM"
11 0.00795651 <= "Dec 18 1993 12:00:00:000PM"
12 0.00788510 <= "Jan 19 1994 12:00:00:000PM"
13 0.01000150 <= "Feb 28 1994 12:00:00:000PM"
14 0.01000150 <= "Apr 9 1994 12:00:00:000PM"
…
~1.5 month spread…. Problem is that for some months the step boundary falls
mid-month, so a range scan for that month would need 3 cells. If
concerned, likely need to double or triple the number of steps
57. Histograms & Steps
                          Default no HTF  Defaults        40 steps        100 steps       500 steps
Default number of steps   20              20              20              20              20
Histogram tuning factor   1               20              20              20              20
Requested steps           20              20              40              100             500
Actual steps              20              195             509             1550            7580
(Index statistics for combined city,state)
Range cell density        0.00328457      0.00121356      0.00022722      0.00010744      0.00003560
Total density             0.00328457      0.00328457      0.00328457      0.00328457      0.00328457
Unique range values       0.00011547      0.00008212      0.00006416      0.00004897      0.00002615
Unique total values       0.00011547      0.00011547      0.00011547      0.00011547      0.00011547
Impact on estimates for Washington DC & San Francisco CA
DC Cell                   <= Washington   <= Washington   = Washington    = Washington    = Washington
DC Selectivity            0.05184000      0.02155000      0.02063000      0.02063000      0.02063000
DC Row Estimates          5184            2155            2063            2063            2063
SF Cell                   <= Somerset     <= San Jacint   = San Franci    = San Franci    = San Franci
SF Selectivity            0.04875000      0.00678000      0.00634000      0.00634000      0.00634000
SF Row Estimates          4875            678             634             634             634
Statistics from an index on {city,state} for a 100,000 row table with ~6,200 distinct city names
58. Column Densities
Single column densities
q Range cell density/unique range values
ü Tells maximum uniqueness…
ü Min(weight)!=0 from range cells
q Total density
ü Relative skewness of the data
ü Total density approaching 1.0 is extremely skewed
ü Sum(weights^2)
q Unique total values
ü The number of distinct values in column
ü 1.0/select count(distinct column)
Multiple column densities
q Automatically created on index
Statistics for column: "type"
Last update of column statistics: Feb 15 2015 9:18:32:850PM
Range cell density: 0.0053191489361702
Total density: 0.4216274332277049
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0053191489361702
Unique total values: 0.2000000000000000
Average column width: default used (2.00)
Rows scanned: 188.0000000000000000
Statistics version: 4
Statistics for column group: "sample_date", "air_temp", "weather"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051075008894
Total density: 0.0000051075008894
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016297687032
Unique total values: 0.0000016297687032
Average column width: 8.5268955638740458
Rows scanned: 168066824.0000000000000000
Statistics version: 4
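The two formulas on this slide can be checked against the optdiag output for the "type" column. The Python sketch below is only a worked example: it applies Sum(weights^2) to the non-zero step weights from the "type" histogram shown earlier in the deck and 1.0/count(distinct) to its five values. Treating every non-zero step as a single value is an assumption that happens to hold here, because the one range cell carries a weight of about 1/188, i.e. a single row.

```python
# Non-zero step weights from the "type" histogram shown earlier in the deck
# (one range cell holding a single row, plus four frequency cells).
TYPE_WEIGHTS = [0.00531915, 0.10638298, 0.30319148, 0.56382978, 0.02127660]
DISTINCT_VALUES = 5  # assumption: one distinct value per non-zero step above


def total_density(weights):
    """Sum of squared step weights, per the slide's Sum(weights^2)."""
    return sum(w * w for w in weights)


def unique_total_values(distinct_count):
    """1.0 / count(distinct column), per the slide."""
    return 1.0 / distinct_count


if __name__ == "__main__":
    # Compare with optdiag: Total density 0.4216274332277049
    print(total_density(TYPE_WEIGHTS))
    # Compare with optdiag: Unique total values 0.2000000000000000
    print(unique_total_values(DISTINCT_VALUES))
```

The small difference from optdiag's total density comes from the step weights being printed to eight decimal places.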
59. Using Column Densities
If the column value is known and…
q …value falls in a histogram cell ….estimate will be that cell’s weight
ü Whether range or frequency cell
If the column value is not known
q Optimized with a literal placeholder (0, '', Jan 1 1900, etc.)
q Selectivity is total density
60. Column Selectivity vs. Density (1)
Statistics for column: "id"
Last update of column statistics: Feb 16 2015 4:47:23:956PM
Range cell density: 0.0092592412744228
Total density: 0.0113194187537711
Unique range values: 0.0041383133267069
Unique total values: 0.0055248618784530
Step Weight Value
1 0.00000000 < 1
2 0.01093356 = 1
3 0.01387721 <= 2
4 0.01261564 <= 3
5 0.00714886 <= 4
6 0.00294365 <= 5
7 0.00462574 <= 6
8 0.00210261 <= 8
9 0.00336417 <= 9
10 0.00336417 <= 11
11 0.00378469 <= 12
12 0.00925147 <= 13
13 0.00210261 <= 15
14 0.01808242 <= 16
15 0.00252313 <= 17
16 0.00252313 <= 18
17 0.00168209 <= 19
18 0.00000000 < 21
19 0.00630782 = 21
20 0.00252313 <= 22
21 0.01429773 <= 23
22 0.03868797 <= 24
23 0.00378469 <= 25
1> declare @id int
2> select @id=8
3> select * from syscolumns where id=@id
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 0
Estimated selectivity for id,
selectivity = 0.01131942,
scan selectivity 0.01131942, filter selectivity 0.01131942
26.91758 rows, 1 pages
Value unknown at optimization time: selectivity = total density
1> select * from syscolumns where id=8
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 8
Estimated selectivity for id,
selectivity = 0.002102607,
scan selectivity 0.002102607, filter selectivity 0.002102607
5 rows, 1 pages
Weight < range cell density: selectivity = weight
61. Column Selectivity vs. Density (2)
Statistics for column: "id"
Last update of column statistics: Feb 16 2015 4:47:23:956PM
Range cell density: 0.0092592412744228
Total density: 0.0113194187537711
Unique range values: 0.0041383133267069
Unique total values: 0.0055248618784530
Step Weight Value
1 0.00000000 < 1
2 0.01093356 = 1
3 0.01387721 <= 2
4 0.01261564 <= 3
5 0.00714886 <= 4
6 0.00294365 <= 5
7 0.00462574 <= 6
8 0.00210261 <= 8
9 0.00336417 <= 9
10 0.00336417 <= 11
11 0.00378469 <= 12
12 0.00925147 <= 13
13 0.00210261 <= 15
14 0.01808242 <= 16
15 0.00252313 <= 17
16 0.00252313 <= 18
17 0.00168209 <= 19
18 0.00000000 < 21
19 0.00630782 = 21
20 0.00252313 <= 22
21 0.01429773 <= 23
22 0.03868797 <= 24
23 0.00378469 <= 25
1> select * from syscolumns where id=21
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 21
Estimated selectivity for id,
selectivity = 0.006307822,
scan selectivity 0.006307822, filter selectivity 0.006307822
15 rows, 1 pages
Frequency cell: selectivity = weight
1> select * from syscolumns where id=24
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 24
Estimated selectivity for id,
selectivity = 0.03868797,
scan selectivity 0.03868797, filter selectivity 0.03868797
92 rows, 1 pages
Weight > range cell density: selectivity = weight
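The equality cases on these slides reduce to a simple lookup: a known value takes the weight of the histogram cell it falls in, and an unknown value (a local variable) falls back to total density. The Python sketch below reproduces only what the traces above show for syscolumns.id; it is not ASE's actual costing code, and `equality_selectivity` is a hypothetical name.

```python
# (weight, operator, bound) for each step of the "id" histogram above.
ID_HISTOGRAM = [
    (0.00000000, "<", 1), (0.01093356, "=", 1), (0.01387721, "<=", 2),
    (0.01261564, "<=", 3), (0.00714886, "<=", 4), (0.00294365, "<=", 5),
    (0.00462574, "<=", 6), (0.00210261, "<=", 8), (0.00336417, "<=", 9),
    (0.00336417, "<=", 11), (0.00378469, "<=", 12), (0.00925147, "<=", 13),
    (0.00210261, "<=", 15), (0.01808242, "<=", 16), (0.00252313, "<=", 17),
    (0.00252313, "<=", 18), (0.00168209, "<=", 19), (0.00000000, "<", 21),
    (0.00630782, "=", 21), (0.00252313, "<=", 22), (0.01429773, "<=", 23),
    (0.03868797, "<=", 24), (0.00378469, "<=", 25),
]
TOTAL_DENSITY = 0.0113194187537711  # "Total density" from optdiag


def equality_selectivity(value=None):
    """Selectivity for `id = value`; value=None models an unknown @variable."""
    if value is None:
        return TOTAL_DENSITY
    # Steps are in ascending order, so the first step that admits the value
    # is the cell that contains it.
    for weight, op, bound in ID_HISTOGRAM:
        if ((op == "<" and value < bound)
                or (op == "<=" and value <= bound)
                or (op == "=" and value == bound)):
            return weight
    return 0.0  # past the last step; out-of-range handling is not modeled


if __name__ == "__main__":
    for v in (None, 8, 21, 24):
        print(v, equality_selectivity(v))
```

The four calls correspond to the four traces on slides 60 and 61 (unknown @id, id=8, id=21, id=24).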
62. Column Selectivity vs. Density (3)
Statistics for column: "id"
Last update of column statistics: Feb 16 2015 4:47:23:956PM
Range cell density: 0.0092592412744228
Total density: 0.0113194187537711
Unique range values: 0.0041383133267069
Unique total values: 0.0055248618784530
Step Weight Value
1 0.00000000 < 1
2 0.01093356 = 1
3 0.01387721 <= 2
4 0.01261564 <= 3
5 0.00714886 <= 4
6 0.00294365 <= 5
7 0.00462574 <= 6
8 0.00210261 <= 8
9 0.00336417 <= 9
10 0.00336417 <= 11
11 0.00378469 <= 12
12 0.00925147 <= 13
13 0.00210261 <= 15
14 0.01808242 <= 16
15 0.00252313 <= 17
16 0.00252313 <= 18
17 0.00168209 <= 19
18 0.00000000 < 21
19 0.00630782 = 21
20 0.00252313 <= 22
21 0.01429773 <= 23
22 0.03868797 <= 24
23 0.00378469 <= 25
1> select * from syscolumns where id between 5 and 10
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id >= 5
id <= 10
Estimated selectivity for id,
selectivity = 0.01471826,
scan selectivity 0.01471826, filter selectivity 0.01471826
35.00002 rows, 1 pages
Range query
Note that the sum of steps 6-10 is 0.01640034. However, since
we are only using a portion of step 10 and the distribution is 2 values
per step, we use the formula:
Sum(step6..step9) + step10/2.0 = 0.01471826
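The formula above can be read as linear interpolation inside the partially covered cell. The Python sketch below assumes integer keys spread evenly within each range cell (an assumption taken from the slide's "2 values per step" remark, not a statement of how ASE interpolates in general) and recomputes the estimate for `id between 5 and 10`; `between_selectivity` is a hypothetical name.

```python
# (upper bound, weight) for the "<=" range cells around the query range,
# copied from steps 5 to 11 of the "id" histogram above.
RANGE_CELLS = [(4, 0.00714886), (5, 0.00294365), (6, 0.00462574),
               (8, 0.00210261), (9, 0.00336417), (11, 0.00336417),
               (12, 0.00378469)]
FIRST_LOWER_BOUND = 3  # upper bound of step 4, the cell before RANGE_CELLS[0]


def between_selectivity(lo, hi):
    """Estimate selectivity of `id between lo and hi` over RANGE_CELLS."""
    total = 0.0
    prev = FIRST_LOWER_BOUND
    for bound, weight in RANGE_CELLS:
        width = bound - prev  # the cell covers integers prev+1 .. bound
        overlap = min(hi, bound) - max(lo, prev + 1) + 1
        if overlap > 0:
            # Take the covered fraction of the cell's weight.
            total += weight * overlap / width
        prev = bound
    return total


if __name__ == "__main__":
    # Compare with the optimizer trace: selectivity = 0.01471826
    print(between_selectivity(5, 10))
```

Steps 6 through 9 are fully covered, while step 10 (values 10 and 11) contributes half its weight because only the value 10 is inside the range.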
63. Debugging Selectivity
You’ve probably noticed….
q You need to have ‘set option show’ and optdiag output
Find the index you thought it should have used
q Look at the selectivity for each predicate
q Check out the optdiag to see if it was a really skewed value
But sometimes you just have to look at the query
q …your expectation may be due to knowledge you infer
ü But optimizer doesn’t know
ü ….such as the relationship between two columns
q …and sometimes the indexing doesn’t support the query
64. Unbounded Date Range
create table jobs (
job_number numeric(30,0),
…
job_category varchar(20), -- 10 distinct values
job_priority tinyint, -- 100 distinct values
job_begindate datetime,
job_enddate datetime,
job_status char(1), -- 6 distinct values
…,
primary key (job_number)
)
Consider the above table for each of the scenarios on the following slides. Note the key
columns of job dates and those that have some distinct values listed.
65. Scenario #1
Consider the index:
create index job_begin_idx on jobs (job_begindate)
…and the typical query
Select * from jobs
Where job_begindate >= $begin_date
and job_enddate <= $end_date
Why is LIO sometimes high and sometimes low?
66. Scenario #1: The Problems
Because the index only has begin date
q On very recent dates, it can go near the end of the index and scan to the end…
q But on dates in the past – even a few months ago
ü It positions to the $begin_date
ü Scans to end of index
ü For each leaf row, it does a LIO to the data page to compare $end_date
ü Some quick math….assume 50 rows per index leaf page
100 leaf pages = 5000 data page LIO’s ≈ 1 sec CPU (@5LIO/ms)
1000 leaf pages = 50000 data page LIO’s ≈ 10 sec CPU
10000 leaf pages = 500000 data page LIO’s ≈ 100 sec CPU
100000 leaf pages = 5000000 data page LIO’s ≈ 1000 sec CPU (16m40s)
Soooo….
q For dates not very recent, we get an index leaf scan to end of index
q Plus a datapage lookup for every leaf row
[Figure: index on job_begindate spanning 2010 to 2014, with example scan start points "> 01Mar2011", "> 01Nov2012" and "> 01Jan2014"]
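The quick math on this slide is easy to reproduce for other index sizes. The Python sketch below uses the slide's own assumptions (50 rows per index leaf page, roughly 5 LIOs per millisecond of CPU); both figures are illustrative and not measurements, and `data_page_lio_cost` is a hypothetical name.

```python
ROWS_PER_LEAF_PAGE = 50  # slide's assumption
LIO_PER_MS = 5           # slide's assumption (@5LIO/ms)


def data_page_lio_cost(leaf_pages):
    """Return (data page LIOs, approximate CPU seconds) for a leaf scan."""
    lios = leaf_pages * ROWS_PER_LEAF_PAGE  # one data page LIO per leaf row
    cpu_seconds = lios / LIO_PER_MS / 1000.0
    return lios, cpu_seconds


if __name__ == "__main__":
    for pages in (100, 1000, 10000, 100000):
        lios, secs = data_page_lio_cost(pages)
        print(pages, lios, secs)
```

The cost grows linearly with how far back $begin_date reaches, which is why the same query looks cheap for recent dates and expensive for older ones.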
67. Scenario #1: The Solutions
Solution #1: Add job_enddate to index
create index job_date_idx
on jobs (job_begindate, job_enddate)
Solution #2: Add implied boundary to date query
Select * from jobs
Where job_begindate between $begin_date and $end_date
and job_enddate between $begin_date and $end_date
Why both???
q Wouldn’t fixing the index be enough – why bother the coders and try to teach them better coding style???
68. Scenario #2
Consider the index:
create index job_begin_idx
on jobs (job_category, job_begindate)
…and the typical query
Select * from jobs
Where job_begindate >= $begin_date
and job_enddate <= $end_date
Why does it sometimes use the index and other times not?
69. Scenario #2: The Problem
The problem is we are missing a predicate on leading index columns
q A similar situation occurs when we have intermediate index keys for which we have no valid SARGs
To handle this, ASE does a bit of a trick
q It looks at cardinality of unknown keys
ü If low, it considers an ORScan for each value
ü If high, it considers an index leaf scan
q Then it considers the selectivity of the known predicates
Sooo…as a result
q If we pick a date that is fairly recent (index is more selective), then we will likely do an ORScan and then an index leaf scan from the begin date until the next job_category
q If we pick a date that isn’t very selective, then the ORScan becomes too expensive due to a leaf scan per ORScan value, and we compare the multiple index leaf scans vs. a single table scan
70. Scenario #2: The Solution
Solution: Add implied boundary to date query
Select * from jobs
Where job_begindate between $begin_date and $end_date
and job_enddate between $begin_date and $end_date
…and this is why we fix both the index and the query
q In the above case, considering the index in scenario #2, as long as the range is fairly selective, we likely will do the ORScan
71. OrScan in Lava Tree
==================== Lava Operator Tree ====================
Emit
(VA = 4)
r:5 er:1
cpu: 0
/
NestLoopJoin
Inner Join
(VA = 3)
r:5 er:1
l:0 el:8
p:0 ep:8
/
OrScan Restrict
Max Rows: 2 (0)(0)(0)(4)(0)
(VA = 0) (VA = 2)
r:2 er:-1 r:5 er:1
l:0 el:-1
p:0 ep:-1
/
IndexScan
TBTCO~7
(VA = 1)
r:9 er:1
l:28 el:8
p:0 ep:8
============================================================
72. OrScan in Show Plan
|ROOT:EMIT Operator (VA = 6)
|
| |NESTED LOOP JOIN Operator (VA = 5) (Join Type: Inner Join)
| |
| | |NESTED LOOP JOIN Operator (VA = 3) (Join Type: Inner Join)
| | |
| | | |SCAN Operator (VA = 0)
| | | | FROM OR List
| | | | OR List has up to 12 rows of OR/IN values.
| | |
| | | |RESTRICT Operator (VA = 2)(0)(0)(0)(13)(0)
| | | |
| | | | |SCAN Operator (VA = 1)
| | | | | FROM TABLE
| | | | | SAPSR3.MSEG
| | | | | T_01
| | | | | Index : MSEG~1
| | | | | Forward Scan.
| | | | | Positioning by key.
| | | | | Keys are:
| | | | | MANDT ASC
| | | | | MATNR ASC
| | | | | Using I/O Size 128 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | | Using I/O Size 128 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| |
73. Scenario #3
Consider the following index
create index job_begin_idx
on jobs (job_category, job_status, job_begindate,
job_enddate)
…and the typical query
Select * from jobs
Where job_category = 'night batch'
and job_status in ('U', 'A', 'E')
and job_begindate >= $begin_date
and job_enddate <= $end_date
Why might we only position by job_category, job_status?
74. Scenario #3: The Problem
The problem is we don’t have multi-density stats
q And creating them might be a bit of a nightmare
As a result, ASE does the following
q It weighs each selectivity individually:
ü 'night batch' + 'U' + $begin_date
ü 'night batch' + 'A' + $begin_date
ü 'night batch' + 'E' + $begin_date
q Then aggregates
Here’s the problem….assume we only have 20 steps
q Let’s pick a begin date 3 or more steps from the end
ü …and assume end_date is in the same step
ü …but remember, we have an unbounded range on both ….so
…effectively it will think it will be 3 steps for each $begin_date….not 1
…and it will think $end_date is atrocious as it is 17 steps’ worth (from the beginning)
q If we aggregate, then we will have 3x….so 9 steps….40% of table is 8 steps….we might table scan or look for a different index
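The step arithmetic above can be written out as follows. This Python sketch only restates the slide's numbers (20 histogram steps, three IN() values, a begin date 3 steps from the end, and 8 steps standing for 40% of the table); it is not a model of ASE's actual aggregation, and `in_list_steps` is a hypothetical name.

```python
TOTAL_STEPS = 20      # slide's assumption
TABLESCAN_STEPS = 8   # slide's figure: 40% of the table is 8 steps


def in_list_steps(steps_per_value, in_values):
    """Return (aggregated steps, fraction of table) across the IN() values."""
    steps = steps_per_value * in_values
    return steps, steps / TOTAL_STEPS


if __name__ == "__main__":
    # Unbounded range: each IN() value is costed as 3 steps of begin dates.
    unbounded = in_list_steps(3, 3)
    # Bounded range: begin and end dates fall in the same single step.
    bounded = in_list_steps(1, 3)
    print(unbounded, unbounded[0] > TABLESCAN_STEPS)
    print(bounded, bounded[0] > TABLESCAN_STEPS)
```

With the unbounded predicate the aggregate crosses the 8-step mark, while the bounded form stays well under it.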
75. Scenario #3: The Solution
Update column stats for distinctive columns
q Use 100 steps or similar large value
ü update statistics jobs (job_begindate) using 100 values
q Result is that each step has a much lower selectivity value
Add the bounded range into the query
q This means we aggregate only across the exact range of dates we want…which reduces the impact of the IN() clause
76. ASE’s OR Strategy
If the query contains an OR clause on different columns
q ASE will (and can) use two different indexes
ü One index for predicates on one side of OR
ü …and a different index for predicates on other side of OR
ü This would be similar to splitting the query in two with union
q However, if one side of OR drives a tablescan – ASE will tablescan
ü Remember, we saw this with the id=8 OR 1=2 example
Common issues
q One side of OR not indexed well….drives tablescan
q Developer attempted to use 1 index to cover both
77. An Example of Indexing vs. OR
Consider the following query:
SELECT "VBELV" ,"POSNV" ,"VBELN" ,"POSNN" ,"VBTYP_N" ,"RFMNG" ,"MEINS" ,"VBTYP_V"
,"ERDAT" ,"ERZET" ,"AEDAT" ,"STUFE" ,"VRKME"
FROM "VBFA"
WHERE "MANDT" = ? AND ( "ERDAT" = ? OR "AEDAT" = ? )
/* R3:SAPLZFEDWS1:767 T:VBFA M:430 */
Now, consider the indexes:
index_name   index_keys
----------   ------------------------------------------
VBFA~0       MANDT, VBELV, POSNV, VBELN, POSNN, VBTYP_N
VBFA~Z01     MANDT, VBELN
VBFA~Z02     ERDAT, BWART
VBFA~Z04     MANDT, ERDAT, AEDAT
VBFA~Z99     MANDT, LOGSYS
Issue is that the query seems to drive a tablescan….
q …it seems obvious that VBFA~Z04 should be used…..
q ….or is it???
78. Let’s look a little closer
Looking at systabstats
ColumnName  ColumnID  Row_Count   RequestedSteps  ActualSteps  ApproxDistincts  DistinctsPerStep
----------  --------  ----------  --------------  -----------  ---------------  ----------------
AEDAT       22        1255008198  50              50           1625             33.0
BWART       17        1255008198  50              29           64               2.0
ERDAT       14        1255008198  50              245          4674             19.0
LOGSYS      38        1255008198  50              2            1                1.0
MANDT       1         1255008198  50              2            1                1.0
POSNN       5         1255008198  50              573          93300            163.0
POSNV       3         1255008198  50              231          12649            55.0
VBELN       4         1255008198  50              38           85330918         2245550.0
VBELV       2         1255008198  50              38           31223216         821664.0
VBTYP_N     6         1255008198  50              31           25               1.0
Hmmmm….not very good query criteria
q MANDT is useless as always
q AEDAT and ERDAT are not very distinct….1625 and 4674 values respectively
ü Which means each distinct value will return ~250K to ~1M rows on average
79. AEDAT Stats….from optdiag
Statistics for column: AEDAT
Last update of column statistics: Jan 10 2014 7:21:35:026PM
Range cell density: 0.0000017268359901
Total density: 0.9986527756879466
…
Unique range values: 0.0000004149259654
Unique total values: 0.0006153846153846
…
Histogram for column: AEDAT
Column datatype: varchar(24)
…
Statistics step count sticky
Statistics hashing sticky
Statistics hashing low domain used
Step Weight Value (only 255 bytes used)
1 0.00000000 < '00000000'
2 0.99932617 = '00000000'
3 0.00001720 <= '20080724'
4 0.00001430 <= '20080826'
5 0.00001409 <= '20081030'
6 0.00001545 <= '20081113'
7 0.00001415 <= '20081216'
8 0.00001419 <= '20090310'
9 0.00001468 <= '20090331'
10 0.00002772 <= '20090615'
…
OUCH!!!!!
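To see why this histogram hurts, compare the average rows per AEDAT value with what the '00000000' frequency cell alone represents. The Python sketch below just multiplies out figures already shown in the systabstats listing and the optdiag output above; the function names are hypothetical.

```python
ROW_COUNT = 1255008198         # Row_Count from the systabstats listing
AEDAT_DISTINCTS = 1625         # ApproxDistincts for AEDAT
WEIGHT_ALL_ZEROS = 0.99932617  # frequency cell weight for '00000000'


def average_rows_per_value(rows, distincts):
    """Average rows returned per distinct value if data were evenly spread."""
    return rows / distincts


def rows_for_cell(rows, weight):
    """Rows represented by a histogram cell with the given weight."""
    return rows * weight


if __name__ == "__main__":
    print(round(average_rows_per_value(ROW_COUNT, AEDAT_DISTINCTS)))
    print(round(rows_for_cell(ROW_COUNT, WEIGHT_ALL_ZEROS)))
```

The average suggests under a million rows per value, while the '00000000' cell by itself accounts for more than 1.25 billion rows, which is what a total density of 0.9987 is signalling.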
80. ERDAT Stats….from optdiag
Statistics for column: ERDAT
Last update of column statistics: Jan 10 2014 7:21:35:026PM
Range cell density: 0.0005738551548958
Total density: 0.0006834762135235
…
Unique range values: 0.0001879716956084
Unique total values: 0.0002139495079161
…
Requested step count: 50
Actual step count: 245
…
Statistics step count sticky
Statistics hashing sticky
Statistics hashing low domain used
Step Weight Value (only 255 bytes used)
1 0.00000000 < '00000000'
2 0.00004201 = '00000000'
3 0.01879592 <= '20030624'
4 0.01879998 <= '20040316'
5 0.01888011 <= '20041015'
6 0.01887963 <= '20050502'
7 0.01878721 <= '20051031'
8 0.01888958 <= '20060420'
9 0.01879898 <= '20061014'
10 0.01882141 <= '20070417'
BETTER!!!!
81. To understand, let’s simplify things
Assume we have a table of customer transactions…
q with 1 billion rows
q PKEY is transaction_id (not that it matters…..)
q Has an index (IDX~1) on {purchase_date, ship_date}
ü Both purchase_date and ship_date are not very distinct
ü think about it …only 365 in a year….~3600 in 10 years…not very distinctive out of 1 billion row table
Now consider the query:
Select * from cust_transactions
where purchase_date='Jan 1 2014' OR ship_date='Jan 1 2014'
See the problem?.... Think about it….
82. The Problem
The problem query:
Select * from cust_transactions
where purchase_date='Jan 1 2014' OR ship_date='Jan 1 2014'
The problems…
q We can use the index IDX~1 for the purchase_date case, depending
of course on the selectivity of the value provided
q …but the OR clause means that we also need to look for the ship date
ü individually and not in combination with purchase date – remember, a
composite index works on COMBINING cols
q …using IDX~1 for that is of little use, as we can’t position on the leading
purchase_date column because the OR clause is disjunctive… the query
really could be expressed as:
select * from cust_transactions where purchase_date='Jan 1 2014'
union
select * from cust_transactions where ship_date='Jan 1 2014'
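The equivalence between the OR form and the UNION form can be illustrated with a small sketch. The rows below are hypothetical, and the set operations only mimic the logical result, not how ASE executes either query. The equivalence relies on UNION removing duplicates and on each row being unique (here through the transaction id).

```python
from datetime import date

# Hypothetical rows: (transaction_id, purchase_date, ship_date)
rows = [
    (1, date(2014, 1, 1), date(2014, 1, 3)),
    (2, date(2013, 12, 30), date(2014, 1, 1)),
    (3, date(2014, 1, 1), date(2014, 1, 1)),
    (4, date(2014, 1, 2), date(2014, 1, 5)),
]
target = date(2014, 1, 1)

# Single pass with a disjunctive (OR) predicate.
or_result = {r for r in rows if r[1] == target or r[2] == target}

# Two independent lookups, one per column, merged with duplicate removal (UNION).
by_purchase = {r for r in rows if r[1] == target}
by_ship = {r for r in rows if r[2] == target}
union_result = by_purchase | by_ship

print(sorted(r[0] for r in or_result))
print(or_result == union_result)
```

Row 3 matches both predicates but appears only once in either result, which is why the rewrite uses UNION rather than UNION ALL.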
83. Remember special OR strategy?
When an OR condition exists:
q ASE can use multiple indexes – a different index for each side of
the OR
q This ‘special OR strategy’ is also known as ‘index union’
When looking at the query & index
q ASE says index is probably okay for purchase_date…
q …but says it will need to tablescan for ship_date
q Why the tablescan?
ü Remember, this is a DOL table and the index keys are sorted by
purchase_date, then ship_date
ü ….so we would have to scan ALL the leaf pages to find that
ship_date
ü ….only to find out that 1/4000th of the table qualifies
ü ….and they are scattered around due to purchase date,
so….LIO exceeds cost of tablescan so we do tablescan
ü ….especially if we have an OR value of ‘00000000’….which is
99% of the table.
84. What about IN()?
If you were watching closely… you already know the answer
If you think about it….
q …an IN() is like an OR list…
q ….in fact ASE flattens into one
So, all we do is:
q Cost each one individually
q Aggregate them into a final cost
85. A Simple IN() example
1> select * from sysobjects where id in (2,4,6,8,10,12,14,16)
The Lop tree:
( project
( scan sysobjects
)
)
OptBlock0
The Lop tree:
( scan sysobjects
)
Generic Tables: ( Gtt0( sysobjects ) Gti1( csysobjects ) )
Generic Columns: …
Predicates: ( ( { sysobjects.id } = 16 tc:{25} OR{ sysobjects.id } = 14 tc:{25}
OR { sysobjects.id } = 12 tc:{25} OR{ sysobjects.id } = 10 tc:{25}
OR { sysobjects.id } = 8 tc:{25} OR{ sysobjects.id } = 6 tc:{25}
OR { sysobjects.id } = 4 tc:{25} OR{ sysobjects.id } = 2 tc:{25} ) tc:{25} )
Transitive Closures: …)
IN() clause is expanded to OR’s….note that all
have the same transitive closure id (tc:{25})
86. Individual OR term selectivity
BEGIN GENERAL OR ANALYSIS OF all types of indices FOR sysobjects
ANALYZING OR TERM 1
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 16
Estimated selectivity for id,
selectivity = 0.1,
scan selectivity 0.02272727, filter selectivity 0.02272727
restricted selectivity 0.1
unique index with all keys, one row scans
1 rows, 1 pages
…
ANALYZING OR TERM 2
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 14
…
ANALYZING OR TERM 3
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 12
…
ANALYZING OR TERM 4
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 10
…
==================== Lava Operator Tree ====================
Emit
(VA = 3)
r:8 er:5
cpu: 0
/
NestLoopJoin
Inner Join
(VA = 2)
r:8 er:5
l:0 el:5
p:0 ep:4
/
OrScan IndexScan
Max Rows: 8 csysobjects
(VA = 0) (VA = 1)
r:8 er:-1 r:8 er:5
l:0 el:-1 l:12 el:5
p:0 ep:-1 p:0 ep:4
============================================================
87. Aggregating Selectivity for OR
END GENERAL OR ANALYSIS FOR all types of indices - INDICES FOUND FOR ALL OR TERMS
Scan on table sysobjects skipped because table scan less than concurrency threshold
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
Estimated selectivity for id,
selectivity = 0.8,
scan selectivity 0.8, filter selectivity 0.8
restricted selectivity 1
special or terms 8
35.2 rows, 1 pages
Data Row Cluster Ratio 0.99999
Index Page Cluster Ratio 1
Data Page Cluster Ratio 1
using no index prefetch (size 4K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'csysobjects' on table 'sysobjects' = 1.600336
Whoa!!! Prediction is 80% of the table…which had
44 rows….thankfully in *this* case, it still was only 1
page
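The 80% figure is consistent with simply adding up the per-term selectivities. The sketch below reproduces the arithmetic using values from the trace. The cap at 1.0 is an assumption added for illustration, not something the trace shows.

```python
# Values copied from the trace output above.
per_term_selectivity = 0.1   # "selectivity = 0.1" for each analyzed OR term
or_terms = 8                 # "special or terms 8"
table_rows = 44              # row count quoted in the slide's callout

# Assumption: terms are simply summed, capped at 1.0 (100% of the table).
aggregated = min(1.0, per_term_selectivity * or_terms)
estimated_rows = aggregated * table_rows

print(f"aggregated selectivity: {aggregated:.1f}")
print(f"estimated rows: {estimated_rows:.1f}")
```

The result lines up with the "selectivity = 0.8" and "35.2 rows" lines in the trace, even though an eight-value IN() list on a unique key can return at most eight rows.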
88. Aggregating IN()
Aggregation is unintelligent
q It doesn’t check how many are from same range cell
The result is that the aggregated value is often over-inflated
TIP: Make sure you have histogram steps > largest IN() list
q For SAP systems, this will be 100
89. Out of range histograms
Originally added to ASE 15.0 for monotonic sequences
q For example, sequential numbers, datetime (e.g. current
datetime)
q Oftentimes, if stats were only updated every week, a large portion of
the new data values were higher than the histogram range
ü As a result, the optimizer would estimate 0 values and select
an index based on that reduced cost estimate whereas in
reality there could be millions of rows
q With out of range histograms, several factors are used to
estimate how many data values exist beyond the last
histogram cell and cost is adjusted higher
Usually in such cases, out of range histograms are a sign of stale stats
q ….but for high insert/append use cases, you may be updating or
re-reading a row that was just inserted – e.g. reporting on
today’s sales
90. Low Cardinality Examples
Histogram tuning may be a bad thing for short duration “STATUS” columns
q Most of the values in the histogram will be “C” for complete
q Unless there is a “permanent” status higher than “U” for
unprocessed, it is unlikely that update stats will catch a “U”
value
ü During migration, the system is likely quiesced with nothing
incomplete
ü Post-migration, if stats are run during quiet period, likely no
incomplete values exist
q Out of range histogram throws off optimizer….0 would have been
better estimate
ü Running update stats on weekends or nights when quiet simply
causes same problem…as jobs are likely all complete
q Spotted with ‘set option show on’
May also happen with very low cardinality “TYPE” columns
q Or any very low cardinality column: in reality, it happens when the value in the
predicate has an extremely low occurrence in a very low cardinality
column and is higher than the more common value(s)
91. Example Histogram
Histogram for column: "ENTRY_TYPE"
…
Out of range Histogram Adjustment is DEFAULT.
Sticky step count.
Sticky partial_hashing.
Step Weight Value
1 0.00000000 < "C"
2 1.00000000 = "C"
Histogram for column: "STATUS"
…
Out of range Histogram Adjustment is DEFAULT.
Low Domain Hashing.
Sticky step count.
Sticky partial_hashing.
Step Weight Value
1 0.00000000 < "C"
2 0.98791176 = "C"
3 0.00084806 < "T"
4 0.01124019 = "T"
92. Example ‘set option show’ output
Estimating selectivity of index 'SAPSR3.ESH_EX_CPOINTER.ESH_EX_CPOINTER~ST', indid 3
STATUS = 'U'
ENTRY_TYPE = 'P'
Estimated selectivity for ENTRY_TYPE,
Out of range histogram adjustment,
selectivity = 0.3333333,
Estimated selectivity for STATUS,
Out of range histogram adjustment,
selectivity = 0.2,
scan selectivity 0.2, filter selectivity 0.2
60412.2 rows, 34.2 pages
Data Row Cluster Ratio 0.9924527
Index Page Cluster Ratio 0.218543
Data Page Cluster Ratio 0.02202437
using index prefetch (size 128K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
in index cache 'default data cache' (cacheid 0) with LRU replacement
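Why both predicates trigger the adjustment can be seen by comparing each predicate value with the highest histogram boundary for its column. This is a minimal sketch using the boundaries from the example histograms. It assumes plain string ordering, whereas the server's configured sort order decides the real comparison.

```python
# Highest histogram boundary per column, from the optdiag output on the previous slide.
histogram_max = {"ENTRY_TYPE": "C", "STATUS": "T"}

# Predicate values from the trace above.
predicates = {"STATUS": "U", "ENTRY_TYPE": "P"}

def is_out_of_range(column: str, value: str) -> bool:
    """True when the value sorts above the last histogram step for the column."""
    return value > histogram_max[column]

for column, value in predicates.items():
    print(column, value, is_out_of_range(column, value))
```

Both 'U' and 'P' sort above the last step of their histograms, so neither falls in a known cell and the optimizer falls back to the out of range adjustment.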
93. To prevent out of range histograms
Turn off for update statistics
q Turn off for columns – not a whole table or specific index
q Syntax
update statistics table_name
[[partition data_partition_name]
[ (column1, column2, …) | (column1), (column2), …] |
index_name [partition index_partition_name]]
[using step values | [out_of_range [on | off| default]]]
[with consumers = consumers][, sampling=N percent]
[, no_hashing | partial_hashing | hashing]
[, max_resource_granularity = N [percent]]
[, histogram_tuning_factor = int ]
[, print_progress = int]
q Example
Update statistics SAPSR3.ESH_EX_CPOINTER (ENTRY_TYPE) out_of_range off
Update statistics SAPSR3.ESH_EX_CPOINTER (STATUS) out_of_range off
Out of range histogram is “sticky”
q Just like the number of steps, setting this once causes it to be used as
the default for all future update statistics commands that do not specify a
value.
94. OPTIMIZATION COSTING (PART 2)
Multi-Column Densities & Joins…
95. Multi-Column Densities
An underused secret weapon
q Useful any time multiple predicates exist
q Think of it this way:
ü Two sample predicates
Col_A = ‘5’
Col_B = ‘GREEN’
ü Assume both have a selectivity of 0.1
Combination could still be 0.1 if all Col_A=5 and Col_B=‘GREEN’ are same rows
Combination could be 0.01 (or less) if only a single row had the combination
When does it matter?
q Joins, distinct, subquery (caching), sort estimations, ….
q Anyplace where the estimated number of rows returning
could change the query plan (and tip costs towards an
alternative ‘bad’ plan)
q Especially since we don’t have composite column histograms
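The two extremes described above can be shown with a toy data set. The 100-row tables below are hypothetical. In both, each predicate on its own matches 10 rows (selectivity 0.1), yet the combined selectivity differs by a factor of ten.

```python
# Hypothetical 100-row table; each predicate alone matches 10 rows (selectivity 0.1).
N = 100

def selectivity(rows):
    """Fraction of rows where Col_A = 5 AND Col_B = 'GREEN'."""
    hits = sum(1 for a, b in rows if a == 5 and b == "GREEN")
    return hits / len(rows)

# Case 1: fully correlated, the same 10 rows satisfy both predicates.
correlated = [(5, "GREEN") if i < 10 else (0, "RED") for i in range(N)]

# Case 2: barely overlapping, only row 9 satisfies both predicates.
sparse = [(5 if i < 10 else 0, "GREEN" if 9 <= i < 19 else "RED") for i in range(N)]

print(selectivity(correlated))
print(selectivity(sparse))
```

Single-column histograms cannot tell these two tables apart. Only a statistic computed over the column combination can.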
96. Multi-Column Density (Index)
Statistics for index: "aqi_weather_date_idx" (nonclustered)
Index column list: "sample_date", "air_temp", "weather"
Leaf count: 254345
Data page CR count: 167946797.0000000000000000
Index page CR count: 32018.0000000000000000
Data row CR count: 168066295.0000000000000000
Leaf row size: 6.1150672008890936
Index height: 3
Statistics for column group: "sample_date", "air_temp"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051768562637
Total density: 0.0000051768562637
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016563476210
Unique total values: 0.0000016563476210
Average column width: default used (2.00)
Rows scanned: 168066824.0000000000000000
Statistics version: 4
Statistics for column group: "sample_date", "air_temp", "weather"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051075008894
Total density: 0.0000051075008894
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016297687032
Unique total values: 0.0000016297687032
Average column width: 8.5268955638740458
Rows scanned: 168066824.0000000000000000
Statistics version: 4
This is the cost of a covered query (less
any portion of index not needed)
The ‘weather’ column must not be very distinct as it doesn’t
alter the table total density or range density by very much
If the IO cost of the index is ~page count and the IO cost for
the table is near the leaf count – it is doing an index scan
and then following each leaf…. Often not a good strategy
unless only a few rows
Any NL join using this index would need to traverse
the index tree this many times per outer row
(Note: Index cluster ratios removed due to space)
97. Using a Multi-Column Density
Remember, we don’t have composite histograms
First we consider the selectivity of each of the columns individually
q This gives us an idea of how many rows there could be
q For example, col_A has 2 rows & col_B has 5 rows….
ü Total range is between 2 & 10 rows
ü Probability is likely closer to 2…but depends on
reality….
Then we look at multi-column density
q This is our flavor of reality to temper probability
q We use the above with a proprietary formula to compute
the selectivity
ü The more selective each column, the closer to the
multi-column density
98. Example: Multi-Column Density
Statistics for column group: "sample_date", "air_temp", "weather"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051075008894
Total density: 0.0000051075008894
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016297687032
Unique total values: 0.0000016297687032
Average column width: 8.5268955638740458
Rows scanned: 168066824.0000000000000000
Statistics version: 4
1> select l.city, l.county, s.sample_date, s.air_temp
2> from aqi_locations l, aqi_samples s
3> where l.location_id=s.location_id
4> and s.sample_date = 'July 1 2000 12:00:00:000PM'
5> and l.state='PA'
6> and s.weather='Overcast'
7> and s.air_temp = 90
Estimating selectivity of index 'aqi_samples.aqi_weather_date_idx', indid 3
sample_date= Jul 1 2000 12:00:00:000PM
weather = 'Overcast'
air_temp = 90
Estimated selectivity for sample_date,
selectivity = 0.0002490077,
Estimated selectivity for air_temp,
selectivity = 0.01104084,
Estimated selectivity for weather,
selectivity = 0.002359544,
scan selectivity 5.11258e-006, filter selectivity 5.11258e-006
859.2551 rows, 1.300359 pages
Data Row Cluster Ratio 3.186365e-006
Index Page Cluster Ratio 0.9989935
Data Page Cluster Ratio 0.0007121012
using no index prefetch (size 4K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'aqi_weather_date_idx' on table 'aqi_samples' = 859.2551
Selectivity based on a single histogram cell
for sample_date
Selectivity based on a single histogram cell for air_temp
Selectivity based on a single histogram cell for weather
Selectivity estimate based on numbers of values for
the above combined with multi-column density. Since
there are only a few values for each, the selectivity is close to
the multi-column density
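The numbers on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only constants from the optdiag and trace output. Treating "Rows scanned" as the table row count is an assumption, and the independence product is shown only for contrast, not as something the optimizer computes here.

```python
# All constants are copied from the optdiag and trace output on this slide.
rows_scanned = 168_066_824              # "Rows scanned" (assumed to equal the table row count)
multi_col_density = 0.0000051075008894  # "Total density" for the 3-column group
scan_selectivity = 5.11258e-06          # "scan selectivity" from the trace
individual = [0.0002490077, 0.01104084, 0.002359544]  # sample_date, air_temp, weather

# Naive estimate if the three predicates were treated as independent.
independent = 1.0
for s in individual:
    independent *= s

print(f"rows if independent:         {independent * rows_scanned:.2f}")
print(f"rows from multi-col density: {multi_col_density * rows_scanned:.2f}")
print(f"rows from scan selectivity:  {scan_selectivity * rows_scanned:.2f}")
```

The scan selectivity reproduces the 859.2551 rows in the trace and sits very close to the multi-column density, while multiplying the three single-column selectivities would have predicted roughly one row.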
99. Problem – Large Estimates
In some cases, we can’t use multi-column densities
q For example, columns involved may have ranges of
values
q The total estimates of rows could then be astronomical
ü Perhaps even higher than the real rowcount
In such cases, we compute a ‘smart’ density
q We know the best case is the most selective column
q We then simply apply a formula to derive a selectivity
ü Some cite sum(cell weight**2)
ü Others use W1*W2 + W1*W2*W3 …
100. Example: Multi-Column Estimate
1> select l.city, l.county, s.sample_date, s.air_temp
2> from aqi_locations l, aqi_samples s
3> where l.location_id=s.location_id
4> and s.sample_date between 'July 1 2000 00:00:01' and 'July 31 2000 23:59:59'
5> and l.state='PA'
6> and s.weather='Overcast'
7> and s.air_temp < 85
Estimating selectivity of index 'aqi_samples.aqi_weather_date_idx', indid 3
sample_date>= Jul 1 2000 12:00:01:000AM
sample_date <= Jul 31 2000 11:59:59:000PM
weather = 'Overcast'
air_temp < 85
Estimated selectivity for sample_date,
selectivity = 0.007751161,
Estimated selectivity for air_temp,
selectivity = 0.7523476,
Estimated selectivity for weather,
selectivity = 0.002359544,
Intelligent Scan selectivity reduction from 0.007751161 to 0.005852389
scan selectivity 0.005852389, filter selectivity 1.375984e-005
restricted selectivity 0.007751161
983592.5 rows, 1488.526 pages
Data Row Cluster Ratio 3.186365e-006
Index Page Cluster Ratio 0.9989935
Data Page Cluster Ratio 0.0007121012
using index prefetch (size 32K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
in index cache 'default data cache' (cacheid 0) with LRU replacement
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'aqi_weather_date_idx' on table 'aqi_samples' = 2312.572
Selectivity based on aggregating all
the dates in the range
Selectivity based on all temps in the unbounded range
Selectivity based on single cell density for weather
The worst case projection is the most selective of
the above
A better estimate: we use a formula to derive a new
value we think is more accurate for the scan selectivity
(estimate of index rows & leaf pages)…loosely it is
sum(W1*W2…) – e.g. W1*W2+W1*W2*W3
The filter selectivity (estimate of data pages) is the
product of the weights (e.g. W1*W2*W3 or
0.007751161 * 0.7523476 * 0.002359544 =
0.0000137598)
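The figures in this trace can be re-derived as a sanity check. The sketch below uses the selectivities from the trace and the "Rows scanned" value from the earlier optdiag output, which is assumed to be the table row count. The "loose" scan formula is the one quoted on the slide and is not claimed to be the exact internal formula.

```python
# Selectivities from the trace: sample_date (W1), air_temp (W2), weather (W3).
w1, w2, w3 = 0.007751161, 0.7523476, 0.002359544
rows_scanned = 168_066_824       # from the earlier optdiag output (assumed table row count)
scan_selectivity = 0.005852389   # reported after the "Intelligent Scan" reduction

filter_selectivity = w1 * w2 * w3     # product of the weights
loose_scan = w1 * w2 + w1 * w2 * w3   # the slide's loose formula, for comparison only

index_rows = scan_selectivity * rows_scanned
data_page_lio = filter_selectivity * rows_scanned

print(f"filter selectivity: {filter_selectivity:.6e}")
print(f"loose scan formula: {loose_scan:.6f} (trace reports {scan_selectivity})")
print(f"index rows:         {index_rows:.1f}")
print(f"data page LIO:      {data_page_lio:.3f}")
```

The product matches the reported filter selectivity, and the derived row and LIO counts match the "983592.5 rows" and "Data Page LIO ... = 2312.572" lines. The loose formula lands within about 1% of the reported scan selectivity but does not reproduce it exactly.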
101. When to create (multi-)column stats
Okay – we know they are automatically created for index keys
q …and used for joins
When do/ought we create our own?
q On the 2nd through nth index keys (or a subset)
ü ASE creates stats on {A}, {A,B},{A,B,C}, {A,B,C,D}
ü Might be useful to have {B,C,D} or {B,C}
Help trip ORScans if leading column frequently not a predicate
Help with joins when leading column is specified as literal/lateral join (ala SAP)
q On low cardinality columns we don’t want to index
ü …but frequently used as predicates (such as gender)
ü Especially if often used in queries with joins (help
inner/out table decision)
Not automatically maintained with ‘update index stats’
q You need to manually run update stats on each column
density you create
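As a rough illustration of the bookkeeping involved, the sketch below lists the prefix column groups that come with an index on {A, B, C, D} and builds update statistics statements for the extra groups suggested above. The table name my_table and both helper functions are hypothetical. The statement shape follows the update statistics syntax shown earlier in this deck.

```python
def prefix_groups(index_keys):
    """Column groups covered automatically: every leading prefix of the index key."""
    return [tuple(index_keys[:n]) for n in range(1, len(index_keys) + 1)]

def update_stats_sql(table, columns):
    """Build an 'update statistics table (col, ...)' statement for one column group."""
    return f"update statistics {table} ({', '.join(columns)})"

index_keys = ["A", "B", "C", "D"]
automatic = prefix_groups(index_keys)

# Candidate groups that skip the leading column, as suggested in the slide.
manual = [("B", "C", "D"), ("B", "C")]

print(automatic)
for group in manual:
    # Each manually created group needs its own statement in the stats maintenance job.
    print(update_stats_sql("my_table", group))
```

Keeping the manual groups in a list like this makes it harder to forget them when the regular statistics job only refreshes index statistics.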
102. Joins
Traditional Logic @ Driving Table
q Put the table that seems to ‘drive’ the join as the outer table
q Typically, this will be the ‘smaller’ table (or smaller rowset)
q The developer may know the driving table (e.g. #temp)
q …but optimizer has to figure it out
ü Estimate rowsets from each table using index selectivity
ü Estimate joined rows from joining with each table in list
Reducing joined rows by applying index selectivity as filter
But remember, this is a guess at optimization time
Alternative Logic: Pin smaller in cache
q Put larger rowset table as outer and scan once
q Inner (smaller) table can be pinned in cache
ü Avoid higher PIO
In both cases, the multi-column stats on join columns are key to rowset estimates
103. Join Strategies
Remember, we have 3 types of joins
q Nested Loop Joins
q Merge Joins (including Sort Merge Joins)
q Hash Joins
Optimizer needs to figure out which one is best
q For indexed joins, typically an NLJ will be best …
ü ….but this assumes M:N ratio is reasonably small (e.g. 1:10)
q A merge join is great for high cardinality joins
ü M:N is high (e.g. 1:1000+)
ü Especially if inner table is sorted in join key sequence
q A hash join works best when join keys are not predicates but
predicates eliminate a lot of rows on both sides of join
ü Outer table is filtered by predicates and join keys hashed into
build table
ü Inner table is filtered by predicates, join key hashed and probed
for in build table
104. This is why stats are sooo… critical
We use them to estimate
q cardinality of the join
q Rows that qualify from predicates (unjoined)
If the estimates are off by a lot
q We likely predict it is a high cardinality join
ü Remember, with 4 join keys, if we don’t have stats on
the other 3 columns, we use magic values of 0.1
q With very high row counts projected from inner table….
ü If we consider 3 levels of indexing and 10M rows,
that’s 40M LIO
ü Sorting 10M rows may only take 20M LIO’s…
ü ….so we degrade into a Sort Merge Join (SMJ)
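The 40M figure follows if each inner-row lookup costs one read per index level plus one data page read. That breakdown is an assumption, since the slide only gives the totals. A minimal sketch of the comparison:

```python
inner_rows = 10_000_000   # projected rows from the inner table (slide's figure)
index_levels = 3          # index pages traversed per lookup (slide's figure)
data_page_reads = 1       # assumption: one data page read per qualifying row

nlj_lio = inner_rows * (index_levels + data_page_reads)
sort_lio = 20_000_000     # slide's rough figure for sorting 10M rows

strategy = "sort merge join" if sort_lio < nlj_lio else "nested loop join"
print(nlj_lio, sort_lio, strategy)
```

With an inflated row estimate the sort looks cheaper than the repeated index traversals, which is how a bad estimate tips the plan toward a sort merge join.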
105. Join Keys: The Query
SELECT TOP 1 T_00."PRRBA"
FROM SAPSR3."/PXY/ACTUAL_DEP" T_00
INNER JOIN SAPSR3."/PXY/SCD" T_01
ON T_01."MANDT" = ?
AND T_01."RBARE" = T_00."PRRBA"
AND T_01."SCNA" = T_00."PRSCNA"
AND T_01."EXECNO" = T_00."PREXEC"
AND T_01."STEP" = T_00."PRST"
WHERE T_00."MANDT" = ?
AND T_00."SCNA" = ?
AND T_00."EXECNO" = ?
AND T_00."STEP" = ?
AND T_00."RBARE" = ?
AND T_01."STATUS" <> ?
AND T_01."STATUS" <> ?
/* R3:/PXY/SAPLRB:72334 T:/PXY/ACTUAL_DEP M:430 */
create unique nonclustered index "/PXY/ACTUAL_DEP~0"
on SAPSR3."/PXY/ACTUAL_DEP"(MANDT, SCNA, EXECNO, STEP, RBARE, PRSCNA, PREXEC, PRST, PRRBA)
create nonclustered index "/PXY/ACTUAL_DEP~00"
on SAPSR3."/PXY/ACTUAL_DEP"(MANDT, PRSCNA, PREXEC, PRST, PRRBA, SCNA, EXECNO, STEP, RBARE)
create unique nonclustered index "/PXY/SCD~0"
on SAPSR3."/PXY/SCD"(MANDT, RBARE, SCNA, EXECNO, STEP)
create nonclustered index "/PXY/SCD~ID1"
on SAPSR3."/PXY/SCD"(MANDT, SCNA, EXECNO, RBARE)
Notice the lateral join on MANDT = <value>.
Knowing that ASE has issues with literals at the
beginning of the join, we will see if adding multi-
column stats on {RBARE, SCNA, EXECNO, STEP}
helps NLJoin costing
106. Join Keys – Bad Index Usage
| |TOP Operator (VA = 4)
| | Top Limit: 1
| | |MERGE JOIN Operator (Join Type: Inner Join) (VA = 3)
| | | Using Worktable2 for internal storage.
| | | Key Count: 4
| | | Key Ordering: ASC ASC ASC ASC
| | | |SORT Operator (VA = 1)
| | | | Using Worktable1 for internal storage.
| | | | |SCAN Operator (VA = 0)
| | | | | FROM TABLE
| | | | | SAPSR3./PXY/ACTUAL_DEP
| | | | | T_00
| | | | | Index : /PXY/ACTUAL_DEP~0
| | | | | Forward Scan.
| | | | | Positioning by key.
| | | | | Index contains all needed columns. Base table will not be read.
| | | | | Keys are:
| | | | | MANDT ASC
| | | | | SCNA ASC
| | | | | EXECNO ASC
| | | | | STEP ASC
| | | | | RBARE ASC
| | | | | Using I/O Size 16 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | |SCAN Operator (VA = 2)
| | | | FROM TABLE
| | | | SAPSR3./PXY/SCD
| | | | T_01
| | | | Index : /PXY/SCD~0
| | | | Forward Scan.
| | | | Positioning by key.
| | | | Keys are:
| | | | MANDT ASC
| | | | Using I/O Size 16 Kbytes for index leaf pages.
| | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | Using I/O Size 16 Kbytes for data pages.
| | | | With LRU Buffer Replacement Strategy for data pages.
109. Join Permutation Costing (3)
Eqc competition ...
initial old Pops:
( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
initial new Pops:
...
pruned new against total 0
pruned new against old 5
pruned old against new 1
kept old Pops:
( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
kept new Pops:
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
... done Eqc competition.
... done join visit.
Join plans selected for this permutation:
OptBlock0 Eqc{0,1} -> Pops added for the join Eqc{0} - Eqc{1}:
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
move greedy pops to new list
( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
... done move greedy pops to new list.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
DONE: Complete join order evaluation (perm #1)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
“old Pops” = 12.5 style optimization – note that the cost is >2000
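The outcome of this competition can be summarized by comparing the two total costs in the trace. The sketch below assumes the cheapest candidate is the one selected for the permutation, which is a simplification of the pruning shown above.

```python
# Total costs copied from the "Eqc competition" trace above.
candidates = {
    "PopNlJoin": 2070.783,
    "PopMergeJoin": 735.8733,
}

# Assumption: the plan chosen for the permutation is the one with the lowest total cost.
winner = min(candidates, key=candidates.get)
ratio = candidates["PopNlJoin"] / candidates["PopMergeJoin"]

print(winner, round(ratio, 2))
```

The nested loop plan is costed at nearly three times the merge join, so the merge join is what appears under "Join plans selected for this permutation".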
110. Join Permutation Costing (4)
** Costing set up for RowLimit optimization **
TopLogProps0( SAPSR3./PXY/ACTUAL_DEP T_00 ) - TopPred: [Tc{} Pe{0,1,2,3,4}] TopSubst: {1,2,3,4,5,6,7,8,9,17}
TopLogProps0( SAPSR3./PXY/SCD T_01 ) - TopPred: [Tc{} Pe{5,6,7}] TopSubst: {11,12,13,14,15,16}
Statistics for rows returned to client...
Estimated rows :14073.64 Estimated row width :7.002473
Estimated client cost is :78.59161
Estimating selectivity of index 'SAPSR3./PXY/SCD./PXY/SCD~0', indid 2
MANDT = '430'
Estimated selectivity for MANDT,
selectivity = 1,
scan selectivity 1, filter selectivity 1
Cost adjusted for RowLimit optimization, Adjustment ratio 7.105484e-05
2503626 rows, 6283 pages
Adjustment ratio 7.105484e-05 applied gives 177.8947 rows, 1 pages
Data Row Cluster Ratio 0.9107559
Index Page Cluster Ratio 0.9874477
Data Page Cluster Ratio 0.242736
using index prefetch (size 128K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
Adjustment using index prefetch (size 128K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
using table prefetch (size 128K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
Adjustment using table prefetch (size 128K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for '/PXY/SCD~0' on table 'SAPSR3./PXY/SCD' = 17.83115
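The RowLimit adjustment in this trace can be checked numerically. The sketch below assumes the adjustment ratio is the TOP limit divided by the estimated rows returned to the client. That relationship is inferred from the numbers in the trace, not taken from documentation.

```python
top_limit = 1                  # SELECT TOP 1 in the query
estimated_rows = 14073.64      # "Estimated rows" returned to client
index_rows = 2503626           # rows estimated for /PXY/SCD~0 before adjustment
reported_ratio = 7.105484e-05  # "Adjustment ratio" in the trace

# Assumption: the adjustment ratio is the row limit divided by the estimated result rows.
derived_ratio = top_limit / estimated_rows
adjusted_rows = index_rows * reported_ratio

print(f"derived ratio: {derived_ratio:.6e}")
print(f"adjusted rows: {adjusted_rows:.4f}")
```

The derived ratio agrees with the reported one to about six significant digits, and applying it to the 2503626 rows gives the 177.8947 rows shown in the trace.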