Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group
ISUG-TECH 2015 Conference
The Science of DBMS Query Optimization
Jeff Tallman, SAP ASE Product Management
Agenda
Intro & Optimization Basics
- Basic optimization cost factors
- Procedure Cache (ASE)
Query Processing & Optimization
- Internals of QP
- Impact of LOP-tree
- Understanding optimization vs. execution
Optimization Costing
- Histograms & column densities
- IN() & OR clauses
- Out of range histograms
- Joins & Multi-column densities
Controlling optimization
- sp_chgattribute ‘opt concurrency threshold’
- sp_modifystats
- Resource Granularity
Some Caveats
Query optimization is very vendor proprietary/confidential
- You can buy books on generic optimization techniques…
- …but DBMS vendors hire PhDs to develop implementations
  - Query performance often depends on how good the optimization is
  - This is a key difference between open source and COTS DBMS packages
    - The strength of the query optimizer is largely due to the $$$ invested in the skills of highly educated staff
As a result, this session will NOT explain the secrets of ASE’s optimizer
- However, it will explain how it works, what influences it, what resources it uses, etc.
- Additionally, most modern optimizers use the same lava tree model
  - Query optimization is based on an upside-down tree with data spewing out the top
Goal of This Session
The goal of this session
- Help you understand the intricacies of query optimization
- Use that knowledge to write queries that can be optimized better
- Understand how/when additional index statistics might be necessary
- Understand how to influence optimization
  - Other than the usual index forcing, AQP plan clauses, etc.
- Differentiate when the optimizer is messing up…or your SQL did
Assumptions for this session
Rules Based Optimization
Rules based optimization
- Index selection and join order processing are based on specific rules
- For example:
  - Index selection is based on the index whose leading columns are most covered by query predicates
  - Join order follows the left-to-right ordering in the FROM clause, which designates the driving tables/join order
The good, bad & ugly
- Very good for extremely volatile data in which histogram statistics are often stale/impossible
- Good for insert-intensive monotonic sequences in which new values are out of range of histograms
- Not so good…in fact sometimes ugly…on data that has any sort of skew with highly repetitive values
- The really ugly part is if the SQL coders don’t know the “rules”
Cost Based Optimization
Used by all mainstream DBMSs
- Oracle, IBM DB2 UDB, MS SQL, ASE
Attempts to find the cheapest method to perform a query
- Uses some factoring of IO, CPU and memory
- Formula for cost varies among DBMSs
The key to costing is index/column histograms
- In a sense, histograms attempt to report the relative skew of the data being queried
- The optimizer’s goal is to find the cheapest access path considering the data skew
- If it weren’t for the histogram reporting the skew…rules based optimization would be the only choice
Simple Cost Factors (1)
Physical IO
- This is pretty obvious – disks are slow.
- But we also need to predict how many writes (and then re-reads) we may need to do for intermediate results
Logical IO
- This is where PhDs are made
- Remember, at query optimization time, we don’t know what pages we are after…
- However, we need to determine how many LIOs we expect based on
  - How much of a table is already in cache
  - How often we may revisit the same pages for multiple rows
Simple Cost Factors (2)
Memory
- Besides LIO, memory can be used to cache query intermediate results such as subquery results, hash tables for HJ, etc.
- In addition, memory can be used to avoid writes – e.g. in-memory sorts for order by, sort merge joins, etc.
CPU
- Again, fairly basic – but every LIO requires CPU
  - We need to do the data comparison for non-index key predicates
  - Again, though, we really don’t know how fast the CPU is that we are on…and how awful the data comparisons will be
    - We might apply some fuzzy logic on LIKE ‘%pattern%’ on large varchars or something…but…
- Also basic – sorts require CPU as well
  - Distinct processing, Order by processing, etc.
Procedure Cache & Optimization
Optimization: one of the consumers of proc cache
- Index statistics are loaded into proc cache for each query optimization
  - Visible with set option show long
- Temporary work plans are created in proc cache
- Reported via set statistics resource on
- Total consumption not a lot (rule of thumb = #engines * 2MB for OLTP)
Two big problems
- There is no ‘sharing’ of index statistics in proc cache
- Index statistics don’t stay in cache
  - As soon as query optimization for that query is finished, the proc buffers are deallocated.
  - This means a TON of logical IOs on sysstatistics
    - Unless you use a lot of fully prepared statements or stored procedures
  - Hence you really want to ensure you have a dedicated systables cache
- This is largely due to historical aspects
  - Remember, in 1984, 1MB of memory was a lot
  - Today, the sum of the index statistics is likely 256MB or less
Loading Stats & Proc Cache Usage
Creating Initial Statistics for table aqi_locations l
.....Done creating Initial Statistics for table aqi_locations l
Creating Initial Statistics for table aqi_samples s
.....Done creating Initial Statistics for table aqi_samples s
Creating Initial Statistics for index aqi_locations_PK
.....Done creating Initial Statistics for index aqi_locations_PK
…
Phase 2b initialization of OptBlock0 ...
... phase 2b done.
Start merging statistics for table aqi_locations l
..... Done merging statistics for table aqi_locations l
Start merging statistics for table aqi_samples s
..... Done merging statistics for table aqi_samples s
…
Total estimated I/O cost for statement 1 (at line 1): 33926.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=126),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14
proccache=23 proccache hwm=28 tempdb hwm=2)
Private buffer count: 48,Private HWM buffer count: 48
use demo_db
go
set statement_cache off
set switch on 3604
set option show long
set statistics time, io, resource, plancost on
set showplan on
go
select l.city, l.county, s.sample_date, s.air_temp
from aqi_locations l, aqi_samples s
where l.location_id=s.location_id
and s.sample_date = 'July 1 2000 12:00:00:000PM'
and l.state='PA'
and s.weather='Overcast'
and s.air_temp = 90
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Loading stats
Compile time proc cache usage for stats & work plans
126 proc pages * 2k memory page = 252KB
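The arithmetic in the callout above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, the 2 KB proc cache page size is taken from the callout, and the "#engines * 2MB" figure is the rule of thumb quoted on the Procedure Cache slide rather than a documented limit.

```python
# Hypothetical helpers; the 2 KB page size and the "engines * 2 MB" rule of
# thumb both come from the slides, not from ASE documentation.
PROC_PAGE_KB = 2


def proccache_kb(pages: int) -> int:
    """Convert a proccache page count (from 'set statistics resource') to KB."""
    return pages * PROC_PAGE_KB


def oltp_optimization_budget_mb(engines: int) -> int:
    """Rule-of-thumb proc cache consumed by optimization on an OLTP server."""
    return engines * 2


print(proccache_kb(126))               # compile-time usage in the example, in KB
print(oltp_optimization_budget_mb(8))  # assumed example: a server with 8 engines
```

Feeding in the compile-time `proccache=126` value from the output above reproduces the 252KB figure.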
QUERY PROCESSING & OPTIMIZATION
Internals, LOP Trees & Execution
QP Phases
Receive buffer
SQL Parsing
Query Normalization
- Resolves object IDs
- Replaces system functions/functions with literals with literal values
- Rearranges AND/OR according to precedence
Pre-Processing
- Transforms subqueries
- Rearranges aggregates
- Creates Logical Operators (LOP)
Query Optimization
Query Execution
TDS LANG: select * from table where due_dt =getdate() and recv_date is null
SELECT {column list}
FROM table
COND1 due_dt <=getdate()
COND2 (AND) recv_date is null
SELECT {column id’s & datatypes}
FROM objid=123456
COND1 col_id=3 (dt) >= (dt) ‘Jan 1 2015’
COND2 (AND) col_id=4 (dt) IS NULL
[Diagram: Receive Buffer, SQL Parsing, Normalization, Pre-Processing, Query Optimization, Query Execution; marked "Focus"]
Some Notes on WaitEvents
Believe it or not…
- Until the execution phase, all the rest counts as ‘awaiting command’ in sp_who or WaitEvent ID=250 in monProcessWaits
- It kinda makes sense…until the query is executing…it isn’t executing…
- …but parsing, compiling & optimization can all use considerable CPU time
  - Sooo…that is why set statistics time on reports compile time separately
Sooo…if ‘awaiting command’ a lot…
- See if packets received are increasing
Optimization Starts with LOP Tree
During the pre-processing phase, a LOP tree is created
- A high-level tree of the logical operations that represent the relations between the entities
- Often, the LOP tree is the first place where optimization starts to go wrong…due to bad query formation by developers
Use ‘set option show on’ to see the LOP tree
- It will be near the very top of the output
- You will need trace 3604 enabled
During execution, a physical operator (Pop) is used
- Lop Join
- Pop NLJoin
Example Query
use demo_db
go
set option show on
set switch on 3604
set statistics plancost, time, resource, io on
set showplan on
set statement_cache off -- avoid rerunning goofy plans from previous run
set nodata on -- don’t return results (avoids network time/scrolling of large results)
go
select l.county, avg(s.air_temp)
from aqi_locations l,
aqi_samples s
where l.location_id=s.location_id
and s.sample_date between 'July 1 2000 00:01am' and 'July 31 2000 23:59:59'
and state='PA'
group by l.county
go
set option show off
set switch off 3604
set statistics plancost, time, resource, io off
set showplan off
--set statement_cache off
go
Example LOP Tree
1> select l.county, avg(s.air_temp)
2> from aqi_locations l,
3> aqi_samples s
4> where l.location_id=s.location_id
5> and s.sample_date between 'July 1 2000 00:01am' and 'July 31 2000 23:59:59'
6> and state='PA'
7> group by l.county
The Lop tree:
( project
( group
( join
( scan aqi_locations
)
( scan aqi_samples
)
)
)
)
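The nesting of the LOP tree printed above is easier to inspect once it is parsed. The following is a minimal sketch, assuming the tree text always has the shape `( operator [object] children... )` as in this example; `parse_lop_tree` and `operators` are hypothetical helper names, not part of any ASE tooling.

```python
def parse_lop_tree(text):
    """Parse '( op [name] children... )' into (op, name, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        op = tokens[pos]
        pos += 1
        name = None
        if tokens[pos] not in ("(", ")"):  # optional object name, e.g. a table
            name = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] == "(":
            children.append(node())
        if tokens[pos] != ")":
            raise ValueError("expected ')'")
        pos += 1
        return (op, name, children)

    return node()


def operators(tree, depth=0):
    """Flatten the tree into (depth, op, name) rows, root first."""
    op, name, children = tree
    rows = [(depth, op, name)]
    for child in children:
        rows.extend(operators(child, depth + 1))
    return rows


# The LOP tree text copied from the 'set option show' output above.
LOP_TEXT = """
( project
( group
( join
( scan aqi_locations
)
( scan aqi_samples
)
)
)
)
"""

tree = parse_lop_tree(LOP_TEXT)
for depth, op, name in operators(tree):
    print("  " * depth + op + (f" {name}" if name else ""))
```

The indented listing makes the levels explicit: project, then group, then join, with the two scans as the join's children.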
LOP Tree & OptBlocks
Each LOP tree level becomes an OptBlock
- Outermost block (0) is one below (project)
- Each block will generally have a relational operator
  - Join, group, scalar, etc.
  - Scan is only considered an operator if the query only has one entity and no other operators
Optimizer will determine an optimal plan for that block
- ASE set option show will print optimization for each optblock
- The optblock list is also printed at
The Lop tree:
( project
( group
( join
( scan aqi_locations
)
( scan aqi_samples
)
)
)
)
OptBlock1
OptBlock0
Example OptBlock
The Lop tree:
…
OptBlock1
The Lop tree:
( join
( scan aqi_locations
)
( scan aqi_samples
)
)
Generic Tables: ( Gtt1( aqi_locations l ) Gtt2( aqi_samples s ) Gti3( aqi_locations_PK ) …
Generic Columns: ( Gc0(aqi_locations l ,Rid) Gc1(aqi_locations l ,state) Gc2(aqi_locations l ,location_id) …
Predicates: ( { aqi_samples s.sample_date} >= "Jul 1 2000 12:01AM" tc:{5} …
Transitive Closures: ( Tc0 = { Gc0(aqi_locations l ,Rid)} …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gtg0 )
Generic Columns: ( Gc8(Gtg0 ,_gcelement_8) Gc9(Gtg0 ,_gcelement_9) Gc10(Gtg0 ,_gcelement_10) …
Predicates: ( )
Transitive Closures: ( Tc7 = { Gc8(Gtg0 ,_gcelement_8) Gc12(Gtg0 ,_virtualagg) …
If you have any doubts
If your index is being considered…
- It will be listed in Generic Tables with a Gti prefix (tables are listed with Gtt)
  - Format is <tablelist>, <indexlist>
- Example:
  - Generic Tables: ( Gtt1( aqi_locations l ) Gtt2( aqi_samples s ) Gti3( aqi_locations_PK ) Gti4( city_state_idx ) Gti5( county_state_idx ) Gti6( aqi_samples_PK ) Gti7( aqi_weather_date_idx ) )
If your where clause is being considered…
- It will be listed in Predicates
- Example:
  - Predicates: ( { aqi_samples s.sample_date} >= "Jul 1 2000 12:01AM" tc:{5} { aqi_samples s.sample_date} <= "Jul 31 2000 11:59PM" tc:{5} { aqi_locations l.state} = 'PA' tc:{1} )
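When the trace output is long, the check described above can be scripted. This is a rough sketch that assumes the `Generic Tables:` line keeps the `GttN( name alias )` / `GtiN( index )` layout seen in the example; `considered_objects` is a hypothetical helper name.

```python
import re


def considered_objects(generic_tables_line):
    """Return (tables, indexes) named on a 'Generic Tables:' trace line."""
    tables = re.findall(r"Gtt\d+\(\s*([^)]+?)\s*\)", generic_tables_line)
    indexes = re.findall(r"Gti\d+\(\s*([^)\s]+)\s*\)", generic_tables_line)
    return tables, indexes


# The example line from the slide above.
line = (
    "Generic Tables: ( Gtt1( aqi_locations l ) Gtt2( aqi_samples s ) "
    "Gti3( aqi_locations_PK ) Gti4( city_state_idx ) Gti5( county_state_idx ) "
    "Gti6( aqi_samples_PK ) Gti7( aqi_weather_date_idx ) )"
)

tables, indexes = considered_objects(line)
print(tables)
print("aqi_weather_date_idx" in indexes)  # was my index considered?
```

If an index you expected is missing from the list, the optimizer never costed it for that optblock.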
To find optimization details
Look for optblock begin/end section markers in output
- Begin
    ******************************************************************************
    BEGIN: Search Space Traversal for OptBlock1
    ******************************************************************************
- End
    ******************************************************************************
    DONE: Search Space Traversal for OptBlock1
    ******************************************************************************
Any section could be fairly lengthy
- The key is to find the optblock where you think the problem is…
The LOP role…a tale of two queries
The setup:
select *
into tempdb..my_objects
from sybsystemprocs..sysobjects
create index type_date_idx
on tempdb..my_objects (type, crdate)
“Good” Query:
declare @type char(2)
select @type='P'
select @type, max(crdate)
from tempdb..my_objects
where type=@type
“Bad” Query:
declare @type char(2)
select @type='P'
select type, max(crdate)
from tempdb..my_objects
where type=@type
group by type
The showplans…and final IO costs
QUERY PLAN FOR STATEMENT 2 (at line 9).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |SCALAR AGGREGATE Operator (VA = 1)
| | Evaluate Ungrouped MAXIMUM AGGREGATE.
| | Scanning only up to the first qualifying row.
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | my_objects
| | | Index : type_date_idx
| | | Backward scan.
| | | Positioning by key.
| | | Index contains all needed columns. Base table will not be read.
| | | Keys are:
| | | type ASC
| | | Using I/O Size 4 Kbytes for index leaf pages.
| | | With LRU Buffer Replacement Strategy for index leaf pages.
Total estimated I/O cost for statement 2 (at line 9): 54.
…
Table: my_objects scan count 1, logical reads: (regular=2 apf=0 total=2),
physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 4.
“Good” Query Plan & Cost:
QUERY PLAN FOR STATEMENT 2 (at line 9).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
3 operator(s) under root
|ROOT:EMIT Operator (VA = 3)
|
| |RESTRICT Operator (VA = 2)(0)(0)(0)(4)(0)
| |
| | |GROUP SORTED Operator (VA = 1)
| | | Evaluate Grouped MAXIMUM AGGREGATE.
| | |
| | | |SCAN Operator (VA = 0)
| | | | FROM TABLE
| | | | my_objects
| | | | Index : type_date_idx
| | | | Forward Scan.
| | | | Positioning by key.
| | | | Index contains all needed columns. Base table will not be read.
| | | | Keys are:
| | | | type ASC
| | | | Using I/O Size 4 Kbytes for index leaf pages.
| | | | With LRU Buffer Replacement Strategy for index leaf pages.
Total estimated I/O cost for statement 2 (at line 9): 360.
…
Table: my_objects scan count 1, logical reads: (regular=4 apf=0 total=4),
physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 8.
“Bad” Query Plan & Cost:
A first clue…the plancost
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:1 er:1
cpu: 0
/
ScalarAgg
Max
(VA = 1)
r:1 er:1
cpu: 0
/
IndexScan
type_date_idx
(VA = 0)
r:1 er:1
l:2 el:2
p:0 ep:2
============================================================
“Good” Query LOP Plancost:
==================== Lava Operator Tree ====================
Emit
(VA = 3)
r:1 er:6
cpu: 0
/
Restrict
(0)(0)(0)(4)(0)
(VA = 2)
r:1 er:6
/
GroupSorted
Grouping
(VA = 1)
r:1 er:6
/
IndexScan
type_date_idx
(VA = 0)
r:647 er:598
l:4 el:4
p:0 ep:4
============================================================
“Bad” Query LOP Plancost:
The actual LOP trees
The Lop tree:
( project
( scalar
( scan my_objects
)
)
)
OptBlock1
The Lop tree:
( scan my_objects
)
OptBlock0
The Lop tree:
( pseudoscan
)
“Good” Query LOP tree:
The Lop tree:
( project
( group
( scan my_objects
)
)
)
OptBlock1
The Lop tree:
( scan my_objects
)
OptBlock0
The Lop tree:
( pseudoscan
)
“Bad” Query LOP tree:
The Lesson
The LOP can influence optimization and final costs
- Try to use operators that are lighter weight (e.g. scalar vs. group by)
- In this case, we knew the @type up front…
  - Re-selecting it in the ‘group by’ variant is redundant
  - Literals and @vars are scalars whereas group by is a vector
Execution can play a role as well
- We saw in this example, in the scalar variant, that the optimizer can limit the rows to be scanned
    | |SCALAR AGGREGATE Operator (VA = 1)
    | | Evaluate Ungrouped MAXIMUM AGGREGATE.
    | | Scanning only up to the first qualifying row.
- Execution can also short-circuit based on certain
Optimization vs. Execution (1)
The optimizer gets a lot of blame for things it is not involved in
Example:
- Customer on SCN whines about table scan due to optimizer ‘bug’ on the following example query
    Select * from sysobjects
    Where id=8 OR 1=2
- Customer “thinks” optimizer should simply use the index
What do you think the real problem is and why???
Let’s start simple (1)
1> select count(*) from sysobjects plan '(t_scan sysobjects)'
QUERY PLAN FOR STATEMENT 1 (at line 1).
Optimized using Serial Mode
Optimized using the Abstract Plan in the PLAN clause.
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |SCALAR AGGREGATE Operator (VA = 1)
| | Evaluate Ungrouped COUNT AGGREGATE.
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | sysobjects
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 32 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
Total estimated I/O cost for statement 1 (at line 1): 414.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
-----------
702
Let’s force a table scan just to
see how many LIO’s it takes
Let’s start simple (2)
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=57),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=6 proccache=7 proccache hwm=7 tempdb hwm=0)
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 1)
r:1 er:1
cpu: 0
/
TableScan
sysobjects
(VA = 0)
r:702 er:702
l:26 el:26
p:0 ep:4
============================================================
Table: sysobjects scan count 1, logical reads: (regular=26 apf=0 total=26), physical reads: (regular=0 apf=0 total=0), apf IOs
used=0
Total actual I/O cost for this command: 52.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
The answer is 26…remember
that
A simple false expression (1)
1> select * from sysobjects where 1=2
QUERY PLAN FOR STATEMENT 1 (at line 1).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |RESTRICT Operator (VA = 1)(4)(0)(0)(0)(0)
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | sysobjects
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 4 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
Total estimated I/O cost for statement 1 (at line 1): 237.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
We are still going to do a
table scan…
A simple false expression (2)
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=69),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14 proccache=15 proccache hwm=15 tempdb hwm=0)
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:0 er:702
cpu: 0
/
Restrict
(4)(0)(0)(0)(0)
(VA = 1)
r:0 er:702
/
TableScan
sysobjects
(VA = 0)
r:0 er:702
l:0 el:1
p:0 ep:1
============================================================
Table: sysobjects scan count 0, logical reads: (regular=0 apf=0 total=0), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 0.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
(0 rows affected)
What happened to our 26
IO’s???
Digging a Bit Deeper (1)
1> select * from sysobjects where 1=2
2>
The Lop tree:
( project
( scan sysobjects
)
)
OptBlock0
The Lop tree:
( scan sysobjects
)
Generic Tables: ( Gtt0( sysobjects ) )
Generic Columns: …
Predicates: ( 1=2)
Transitive Closures: …
We do see the expression…but notice
there is no index listed in Generic Tables…
….and notice that the predicate listed
doesn’t have a condition number (tc{#})…
Digging a Bit Deeper (2)
******************************************************************************
BEGIN: Search Space Traversal for OptBlock0
******************************************************************************
Scan plans selected for this optblock:
Statistics for rows returned to client...
Estimated rows :702 Estimated row width :239.5
Estimated client cost is :132.95
Estimating selectivity for table 'sysobjects'
Table scan cost is 702 rows, 21 pages,
Cost adjusted for Fastfirstrow goal, Adjustment ratio0.001424501
Adjusted Table scan cost is 1 rows, 21 pages,
The table (Datarows) has 702 rows, 21 pages,
Data Page Cluster Ratio 0.9999900
Search argument selectivity is 1.
using table prefetch (size 32K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
in data cache 'default data cache' (cacheid 0) with LRU replacement
OptBlock0 Eqc{0} -> Pops added:
( PopTabScan sysobjects ) cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) order: none
The best plan found in OptBlock0 :
( PopTabScan cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) props: [{}] Gtt0( sysobjects ) )
cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) order: none
Hmmm….no indexes looked
at…
Let’s Try Something Close (1)
1> select * from sysobjects where id=8 and 1=2
QUERY PLAN FOR STATEMENT 1 (at line 1).
Optimized using Serial Mode
STEP 1
The type of query is SELECT.
2 operator(s) under root
|ROOT:EMIT Operator (VA = 2)
|
| |RESTRICT Operator (VA = 1)(4)(0)(0)(0)(0)
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | sysobjects
| | | Using Clustered Index.
| | | Index : csysobjects
| | | Forward Scan.
| | | Positioning by key.
| | | Keys are:
| | | id ASC
| | | Using I/O Size 4 Kbytes for index leaf pages.
| | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | Using I/O Size 4 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
Total estimated I/O cost for statement 1 (at line 1): 81.
Parse and Compile Time 0.
Adaptive Server cpu time: 0 ms.
Heyyy!!!! We used an
index…even with a FALSE
expression….
Let’s Try Something Close (2)
Statement: 1 Compile time resource usage: (est worker processes=0 proccache=69),
Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14 proccache=17 proccache hwm=17 tempdb hwm=0)
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:0 er:71
cpu: 0
/
Restrict
(4)(0)(0)(0)(0)
(VA = 1)
r:0 er:71
/
IndexScan
csysobjects
(VA = 0)
r:0 er:71
l:0 el:3
p:0 ep:3
============================================================
Table: sysobjects scan count 0, logical reads: (regular=0 apf=0 total=0), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 0.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
(0 rows affected)
…but we *STILL* didn’t do any
LIO’s….how is that???
Let’s Try Something Close (3)
1> select * from sysobjects where id=8 and 1=2
2>
3>
The Lop tree:
( project
( scan sysobjects
)
)
OptBlock0
The Lop tree:
( scan sysobjects
)
Generic Tables: ( Gtt0( sysobjects ) Gti1( csysobjects ) )
Generic Columns: …
Predicates: ( { sysobjects.id } = 8 tc:{25} 1=2)
Transitive Closures: …
…We now have an index to look
at as well as a predicate with a
tc{#}….it applies to the condition
before the label.
Let’s Try Something Close (4)
******************************************************************************
BEGIN: Search Space Traversal for OptBlock0
******************************************************************************
Scan plans selected for this optblock:
Statistics for rows returned to client...
Estimated rows :70.2 Estimated row width :239.5
Estimated client cost is :14.7343
Scan on table sysobjects skipped because table scan less than concurrency threshold
Scan on table sysobjects skipped because table scan less than concurrency threshold
Beginning selection of qualifying indexes for table 'sysobjects',
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 8
Estimated selectivity for id,
selectivity = 0.1,
scan selectivity 0.001424501, filter selectivity 0.001424501
restricted selectivity 0.1
Cost adjusted for Fastfirstrow goal, Adjustment ratio 0.01424501
unique index with all keys, one row scans
1 rows, 1 pages
Adjustment ratio 0.01424501 applied gives 0.01424501 rows, 1 pages
Data Row Cluster Ratio 0.06314244
Index Page Cluster Ratio 0.99999
Data Page Cluster Ratio 0.2469512
using no index prefetch (size 4K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
Yep, we evaluated the
index
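The numbers in the two traces (the `where 1=2` run and this `id=8 and 1=2` run) can be cross-checked with a little arithmetic. This is an observation about these outputs only: the Fastfirstrow adjustment ratios happen to equal 1 divided by the estimated row count, which is not confirmed here as ASE's actual formula. The function name is hypothetical.

```python
TABLE_ROWS = 702  # sysobjects row count reported in the trace


def first_row_ratio(estimated_rows):
    """Assumed relationship: ratio needed to scale the estimate down to one row."""
    return 1.0 / estimated_rows


# No usable search argument: selectivity 1, so all 702 rows are estimated.
rows_no_sarg = TABLE_ROWS * 1.0
# id = 8 was costed with selectivity 0.1, giving the 70.2 estimated rows.
rows_id_eq_8 = TABLE_ROWS * 0.1

print(f"{rows_id_eq_8:.1f}")                   # 70.2
print(f"{first_row_ratio(rows_no_sarg):.9f}")  # 0.001424501
print(f"{first_row_ratio(rows_id_eq_8):.8f}")  # 0.01424501
```

Both printed ratios match the "Adjustment ratio" values in the traces, which is a quick sanity check that you are reading the right estimate.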
Let’s Try Something Close (5)
******************************************************************************
BEGIN: Search Space Traversal for OptBlock0
******************************************************************************
…
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'csysobjects' on table 'sysobjects' = 1
OptBlock0 Eqc{0} -> Pops added:
( PopRidJoin ( PopIndScan csysobjects sysobjects ) ) cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) order: none
The best plan found in OptBlock0 :
( PopRidJoin cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) props: [{}] ( PopIndScan cost:54.09999 T(L2,P2,C1) O(L2,P2,C1)
props: [{}] Gti1( csysobjects ) Gtt0( sysobjects ) ) cost:54.09999 T(L2,P2,C1) O(L2,P2,C1) order: none
) cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) order: none
******************************************************************************
DONE: Search Space Traversal for OptBlock0
******************************************************************************
…and that was about it….so we go with the index
Understanding what happened
Query optimizer optimizes…not executes
- Expression evaluation happens during execution time
- Soooo… 1=2 is not even looked at by the optimizer
  - Both are literals and the optimizer skips this as a literal expression that cannot be optimized
Query execution can ‘short circuit’
- Obviously false expressions
- N-ary Nested Loop Joins
- …
Soo…What about Our Query?
Our Example:
    Select * from sysobjects
    Where id=8 OR 1=2
What happens
- Optimizer evaluates index on id=8
- Optimizer sees OR clause
  - …opposite side of OR clause is an unoptimizable expression which could be *anything* (e.g. an unindexed param like type=‘U’)
  - Since it could be anything, the OR clause means table scan
- Since we have to table scan the OR’d condition…
  - No sense in using the index for id=8…we will just hit those rows on the way by doing the OR clause
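The difference between the AND and OR cases can be mimicked with a toy executor. This is a deliberately simplified model written for illustration, not how ASE's execution engine is implemented: it only counts rows read, and `run_query` is a hypothetical name.

```python
def run_query(rows, column_pred, constant_pred, connective):
    """Evaluate 'column_pred <connective> constant_pred'.

    Returns (matching_rows, rows_read). 'constant_pred' stands in for a
    literal expression such as 1=2 that is only evaluated at execution time.
    """
    if connective == "and" and not constant_pred:
        return [], 0  # short-circuit: no row can qualify, so nothing is read
    matches, rows_read = [], 0
    for row in rows:
        rows_read += 1
        if connective == "and":
            qualifies = column_pred(row)  # constant side is known to be true
        else:
            qualifies = column_pred(row) or constant_pred
        if qualifies:
            matches.append(row)
    return matches, rows_read


# 702 rows, matching the sysobjects row count in the traces above.
rows = [{"id": i} for i in range(1, 703)]


def id_is_8(row):
    return row["id"] == 8


print(run_query(rows, id_is_8, 1 == 2, "and"))  # no rows, no reads
print(run_query(rows, id_is_8, 1 == 2, "or"))   # one row, every row visited
```

In this model the AND form reads nothing because the false literal ends the work immediately, while the OR form has to visit every row because either side could qualify it.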
Why did I bring that up???
Have you ever done this in a stored proc???
    Select…
    from tableA, …
    where …
    and (((@var1=1) and (colA=‘value’))
    or ((@var1=2) and (colB=‘value’))
    )
Or worse yet…
    Select…
    from tableA, …
    where …
    and (((@var1=1) and (colA=‘value’))
    or ((@var1=2) and (colB=‘value’))
    )
I have…ooops…
A more complicated example
INSERT INTO #temp (...)
SELECT DISTINCT ...
FROM
MYDBNAME..TABLE_A A
, MYDBNAME..TABLE_B B
, MYDBNAME..TABLE_C C
, MYDBNAME..TABLE_D D
, MYDBNAME..TABLE_E E
, MYDBNAME..TABLE_F F
, MYDBNAME..TABLE_G G
, MYDBNAME..TABLE_H H
WHERE
A.COLUMN_1 = @VARIABLE_1
AND A.COLUMN_2 = @VARIABLE_2
AND A.COLUMN_3 = IsNull(@VARIABLE_3,A.COLUMN_3)
AND A.COLUMN_4 = IsNull(@VARIABLE_4,A.COLUMN_4)
AND A.COLUMN_5 = IsNull(@VARIABLE_5,A.COLUMN_5)
...
AND A.COLUMN_6 BETWEEN @VARIABLE_6 AND @VARIABLE_7
...
ORDER BY ...
Customer is trying to avoid writing IF/ELSE logic
for different conditions/variables being passed
in…if @VAR3-5 are set, the intent would be that
they would be used as SARGs….but if not set,
then the predicate is a no-op as column is
compared to itself….
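One alternative to the catch-all `IsNull(@var, column)` pattern is to emit only the predicates whose parameters were actually supplied. The sketch below shows that idea in client-side code; it is an assumption about how a caller might build the statement, not something the slides prescribe, and `build_where` is a hypothetical helper that reuses the column and variable names from the example above.

```python
def build_where(var3=None, var4=None, var5=None):
    """Build a WHERE clause containing only predicates for supplied values.

    Returns (where_clause, params); params maps variable names to values.
    """
    predicates = ["A.COLUMN_1 = @VARIABLE_1", "A.COLUMN_2 = @VARIABLE_2"]
    params = {}
    optional = (
        ("A.COLUMN_3", "@VARIABLE_3", var3),
        ("A.COLUMN_4", "@VARIABLE_4", var4),
        ("A.COLUMN_5", "@VARIABLE_5", var5),
    )
    for column, variable, value in optional:
        if value is not None:  # skip unset filters instead of comparing col = col
            predicates.append(f"{column} = {variable}")
            params[variable] = value
    return "WHERE " + "\n  AND ".join(predicates), params


where_clause, params = build_where(var4=10)
print(where_clause)
print(params)
```

Every predicate that survives is a plain column-to-variable comparison, so none of them compares a column to itself.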
Simplifying (1)
use demo_db
go
set statement_cache off
set switch on 3604
set option show on
set statistics time, io, resource, plancost on
set showplan on
go
declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime
select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
--select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select count(*)
from aqi_samples
where sample_date between @bDate and @eDate
and air_temp=isnull(@air_temp,air_temp)
and weather=isnull(@weather,weather)
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Table has 168M rows with an index on
{sample_date, air_temp, weather}
…first run with nulls for second 2 index keys
Simplifying (2)
The Lop tree:
( project
( scalar
( scan aqi_samples
)
)
)
OptBlock1
The Lop tree:
( scan aqi_samples
)
Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) )
Generic Columns: …
Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3} )
Transitive Closures: …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gta0 )
Generic Columns: …
Predicates: ( )
Transitive Closures: …
The between clause is the only one passed to the optimizer…
not much of a surprise since, with the NULLs, we are
expecting no-ops on air_temp and weather.
Note that since we don’t know the values of the @vars at
compile time, we use the default date here
Simplifying (3)
Total estimated I/O cost for statement 3 (at line 4): 17133977.
==================== Lava Operator Tree ====================
Emit
(VA = 3)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 2)
r:1 er:1
cpu: 400
/
Restrict
(0)(0)(0)(11)(0)
(VA = 1)
r:1.303e+006 er:4.202e+007
/
IndexScan
aqi_weather_date
(VA = 0)
r:1.303e+006 er:4.202e+007
l:1969 el:63590
p:251 ep:8005
============================================================
Table: aqi_samples scan count 1, logical reads: (regular=1969 apf=0 total=1969), physical reads: (regular=8 apf=243 total=251), apf IOs used=243
Total actual I/O cost for this command: 10213.
Total writes for this command: 0
Execution Time 4.
Adaptive Server cpu time: 417 ms. Adaptive Server elapsed time: 417 ms.
Our total IO estimate is 17M+….Our estimated
rows (from IndexScan) are off by 30x….which is
bad…
Simplifying – Rerun (1)
use demo_db
go
set statement_cache off
set switch on 3604
set option show on
set statistics time, io, resource, plancost on
set showplan on
go
declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime
--select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select count(*)
from aqi_samples
where sample_date between @bDate and @eDate
and air_temp=isnull(@air_temp,air_temp)
and weather=isnull(@weather,weather)
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Table has 168M rows with an index on
{sample_date, air_temp, weather}
…second run with values for second 2 index keys
Simplifying - Rerun (2)
The Lop tree:
( project
( scalar
( scan aqi_samples
)
)
)
OptBlock1
The Lop tree:
( scan aqi_samples
)
Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) )
Generic Columns: …
Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3} )
Transitive Closures: …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gta0 )
Generic Columns: …
Predicates: ( )
Transitive Closures: …
The between clause is still the only one passed
to optimizer… which means this fails as a coding
style
Simplifying - Rerun (3)
Total estimated I/O cost for statement 3 (at line 4): 17133977.
==================== Lava Operator Tree ====================
Emit
(VA = 3)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 2)
r:1 er:1
cpu: 300
/
Restrict
(0)(0)(0)(11)(0)
(VA = 1)
r:0 er:4.202e+007
/
IndexScan
aqi_weather_date
(VA = 0)
r:1.303e+006 er:4.202e+007
l:1969 el:63590
p:0 ep:8005
============================================================
Table: aqi_samples scan count 1, logical reads: (regular=1969 apf=0 total=1969), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 3938.
Total writes for this command: 0
Execution Time 3.
Adaptive Server cpu time: 309 ms. Adaptive Server elapsed time: 309 ms.
We get the same estimates for total IO (17M)
and in the bottom node, but the Restrict filters
out non-qualifying rows, so we get 0….and
finish 100ms faster…the faster execution might
make a developer think it worked. However, we
do the same amount of work (1969 LIOs), so the
faster exec is likely just the reduction in
ScalarAgg CPU (300 vs. 400) due to fewer rows to
count.
Simplifying – Correct (1)
use demo_db
go
set statement_cache off
set switch on 3604
set option show on
set statistics time, io, resource, plancost on
set showplan on
go
declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime
--select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59'
select count(*)
from aqi_samples
where sample_date between @bDate and @eDate
and air_temp=@air_temp
and weather=@weather
go
set switch off 3604
set option show off
set statistics time, io, resource, plancost off
set showplan off
go
Table has 168M rows with an index on
{sample_date, air_temp, weather}
…third run with the way it should be…
Simplifying - Correct (2)
The Lop tree:
( project
( scalar
( scan aqi_samples
)
)
)
OptBlock1
The Lop tree:
( scan aqi_samples
)
Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) )
Generic Columns: …
Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3}
{ aqi_samples.air_temp} = 0 tc:{2} { aqi_samples.weather} = ' tc:{1} )
Transitive Closures: …
OptBlock0
The Lop tree:
( pseudoscan
)
Generic Tables: ( Gta0 )
Generic Columns: …
Predicates: ( )
Transitive Closures: …
We now have all 3 predicates…since we still
have @vars with unknown values, we substitute
a 0 for int/smallint and an empty string for
varchar/char
Simplifying - Correct (3)
Total estimated I/O cost for statement 3 (at line 4): 227844.
==================== Lava Operator Tree ====================
Emit
(VA = 2)
r:1 er:1
cpu: 0
/
ScalarAgg
Count
(VA = 1)
r:1 er:1
cpu: 0
/
IndexScan
aqi_weather_date
(VA = 0)
r:0 er:450006
l:306 el:1307
p:0 ep:165
============================================================
Table: aqi_samples scan count 1, logical reads: (regular=306 apf=0 total=306), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total actual I/O cost for this command: 612.
Total writes for this command: 0
Execution Time 0.
Adaptive Server cpu time: 1 ms. Adaptive Server elapsed time: 1 ms.
Total estimated IO is 228K (vs. 17M) and
estimated rowcount is TONS less…still off, but
likely due to data skew and not knowing values
of @vars…. And we only do 300 LIO vs.
1969….and we finish 300x faster
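One way to get the "correct" form without hand-writing every IF/ELSE branch is to assemble the statement so that only the predicates with real values appear in it. The sketch below is an assumption layered on the slides, not something shown in the talk: the function name `build_aqi_query` and the `?` placeholder style are hypothetical, and the table and column names are the ones from the demo query.

```python
def build_aqi_query(b_date, e_date, air_temp=None, weather=None):
    """Return (sql, params) containing only predicates that have values.

    Unset filters are left out entirely instead of being written as
    col = isnull(@var, col), so every predicate that remains is a
    plain SARG the optimizer can cost.
    """
    where = ["sample_date between ? and ?"]
    params = [b_date, e_date]
    if air_temp is not None:
        where.append("air_temp = ?")
        params.append(air_temp)
    if weather is not None:
        where.append("weather = ?")
        params.append(weather)
    sql = "select count(*) from aqi_samples where " + " and ".join(where)
    return sql, params


sql_all, params_all = build_aqi_query(
    "July 1 2000 00:00:01", "July 31 2000 23:59:59", air_temp=80, weather="sunny"
)
sql_dates, params_dates = build_aqi_query(
    "July 1 2000 00:00:01", "July 31 2000 23:59:59"
)
```

Inside a stored procedure the same idea is the IF/ELSE branching the customer was trying to avoid: one statement per combination of supplied variables.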
Index Keys: The Query
SELECT SUM( T_00 ."MBGBTR" )
FROM "COEP" T_00
INNER JOIN "COBK" T_01
ON T_01 ."KOKRS" = ?
AND T_01 ."BELNR" = T_00 ."BELNR"
WHERE T_00 ."MANDT" = ?
AND T_00 ."LEDNR" = ?
AND T_00 ."OBJNR" = ?
AND ( T_00 ."KSTAR" BETWEEN ? AND ? OR T_00 ."KSTAR" IN ( ? , ? , ? , ? ) )
AND T_01 ."AWTYP" = ?
/* R3:ZVDESR121:558 T:COEP M:400 */
index_name   index_keys                                                index_description
COEP~0       MANDT, KOKRS, BELNR, BUZEI                                nonclustered, unique
COEP~1       MANDT, LEDNR, OBJNR, GJAHR, WRTTP, VERSN, KSTAR, HRKFT,   nonclustered
             PERIO, VRGNG, PAROB, USPOB, VBUND, PARGB, BEKNZ, TWAER
COEP~Z02     MANDT, KOKRS, BUKRS, OBJNR                                nonclustered
COEP_BDLS0   MANDT, LOGSYSO                                            nonclustered
COEP~4       MANDT, TIMESTMP, OBJNR                                    nonclustered
COEP~Z03     MANDT, LEDNR, OBJNR, KSTAR                                nonclustered
COEP~Z05     MANDT, OBJNR, KSTAR, GJAHR, PERIO, PAROB1, WRTTP          nonclustered
COEP~Zt1     MANDT, LEDNR, OBJNR, KSTAR                                nonclustered
Index Keys – Bad Index Access
|ROOT:EMIT Operator (VA = 5)
|
| |SCALAR AGGREGATE Operator (VA = 4)
| | Evaluate Ungrouped SUM OR AVERAGE AGGREGATE.
| |
| | |NESTED LOOP JOIN Operator (VA = 3) (Join Type: Inner Join)
| | |
| | | |RESTRICT Operator (VA = 1)(0)(0)(0)(4)(0)
| | | |
| | | | |SCAN Operator (VA = 0)
| | | | | FROM TABLE
| | | | | COEP
| | | | | T_00
| | | | | Index : COEP~4
| | | | | Forward Scan.
| | | | | Positioning by key.
| | | | | Keys are:
| | | | | MANDT ASC
| | | | | Using I/O Size 128 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | | Using I/O Size 128 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| | |
| | | |SCAN Operator (VA = 2)
| | | | FROM TABLE
| | | | COBK
| | | | T_01
| | | | Index : COBK~Zt1
| | | | Forward Scan.
| | | | Positioning at index start.
| | | | Index contains all needed columns. Base table will not be read.
| | | | Using I/O Size 16 Kbytes for index leaf pages.
| | | | With LRU Buffer Replacement Strategy for index leaf pages.
OPTIMIZATION COSTING (PART 1)
Histograms, Column Densities, IN(), Out of Range Histograms…
Histograms
The key to cost-based optimization
q Really is a distribution of data skew
ü If data was evenly distributed, we wouldn’t need histograms at all
q Mostly used for range scans
q Can be used for equisargs if data is highly skewed…as most is
The basics
q Frequency cells
q Range cells
Statistics for column: "type"
Last update of column statistics: Feb 15 2015 9:18:32:850PM
Range cell density: 0.0053191489361702
Total density: 0.4216274332277049
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0053191489361702
Unique total values: 0.2000000000000000
Average column width: default used (2.00)
Rows scanned: 188.0000000000000000
Statistics version: 4
Histogram for column: "type"
Column datatype: char(2)
Requested step count: 20
Actual step count: 9
Sampling Percent: 0
Tuning Factor: 20
Out of range Histogram Adjustment is DEFAULT.
Low Domain Hashing.
Step Weight Value
1 0.00000000 <= "EJ"
2 0.00531915 < "P "
3 0.10638298 = "P "
4 0.00000000 < "S "
5 0.30319148 = "S "
6 0.00000000 < "U "
7 0.56382978 = "U "
8 0.00000000 < "V "
9 0.02127660 = "V "
Callouts: steps marked < or <= are range cells; steps marked = are frequency cells
How Many Steps Do We Need
Fewer = better for resource usage and time to find steps
More = better for optimization accuracy
q Ideally, you want most range scans to be in a single cell
ü Multiple cells means aggregating stats…may be accurate, but takes longer
ü For example, for datetime columns, see if cells cover the common query range (week, month, year, ….)
 Hard to near impossible to control to semantic boundaries
q Increasing the number of steps may give better estimates with high skew
Example Date Histogram
Histogram for column: "sample_date"
Column datatype: datetime
Requested step count: 100
Actual step count: 103
Sampling Percent: 0
Tuning Factor: 20
Out of range Histogram Adjustment is DEFAULT.
Sticky step count.
Sticky hashing.
Step Weight Value
1 0.00000000 <= "Jan 1 1993 11:59:59:996AM"
2 0.01017933 <= "Feb 13 1993 12:00:00:000PM"
3 0.00763450 <= "Mar 18 1993 12:00:00:000PM"
4 0.01018039 <= "May 1 1993 12:00:00:000PM"
5 0.00766925 <= "Jun 3 1993 12:00:00:000PM"
6 0.00777507 <= "Jul 6 1993 12:00:00:000PM"
7 0.00825124 <= "Aug 8 1993 12:00:00:000PM"
8 0.00816318 <= "Sep 10 1993 12:00:00:000PM"
9 0.00796063 <= "Oct 13 1993 12:00:00:000PM"
10 0.00795876 <= "Nov 15 1993 12:00:00:000PM"
11 0.00795651 <= "Dec 18 1993 12:00:00:000PM"
12 0.00788510 <= "Jan 19 1994 12:00:00:000PM"
13 0.01000150 <= "Feb 28 1994 12:00:00:000PM"
14 0.01000150 <= "Apr 9 1994 12:00:00:000PM"
…
~1.5 month spread…. Problem is that on some months it is mid-
month, so a range scan for that month would need 3 cells. If
concerned, likely need to double or triple stats
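To check how many cells a common query range would touch, the step boundaries from optdiag can be searched directly. The sketch below is illustrative only: `cells_touched` is a hypothetical helper, the boundaries are the 14 steps listed above, and it assumes cell i covers values greater than the previous boundary up to and including its own boundary.

```python
from bisect import bisect_left
from datetime import datetime

# Upper bounds of the 14 histogram steps listed above (12:00PM = noon).
STEP_BOUNDS = [
    datetime(1993, 1, 1, 11, 59, 59, 996000),
    datetime(1993, 2, 13, 12), datetime(1993, 3, 18, 12),
    datetime(1993, 5, 1, 12), datetime(1993, 6, 3, 12),
    datetime(1993, 7, 6, 12), datetime(1993, 8, 8, 12),
    datetime(1993, 9, 10, 12), datetime(1993, 10, 13, 12),
    datetime(1993, 11, 15, 12), datetime(1993, 12, 18, 12),
    datetime(1994, 1, 19, 12), datetime(1994, 2, 28, 12),
    datetime(1994, 4, 9, 12),
]


def cells_touched(bounds, lo, hi):
    """Count histogram cells overlapped by the range scan [lo, hi].

    Assumes lo <= hi and that both fall inside the listed boundaries;
    positions past the last boundary are clamped to the last cell.
    """
    last_cell = len(bounds) - 1
    first = min(bisect_left(bounds, lo), last_cell)
    last = min(bisect_left(bounds, hi), last_cell)
    return last - first + 1


feb_1993 = cells_touched(STEP_BOUNDS, datetime(1993, 2, 1),
                         datetime(1993, 2, 28, 23, 59, 59))
apr_1993 = cells_touched(STEP_BOUNDS, datetime(1993, 4, 1),
                         datetime(1993, 4, 30, 23, 59, 59))
feb_1994 = cells_touched(STEP_BOUNDS, datetime(1994, 2, 1),
                         datetime(1994, 2, 28, 23, 59, 59))
```

With these boundaries, April 1993 sits inside one cell while both Februaries straddle a boundary and need two.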
Histograms & Steps
                          Default no HTF  Defaults        40 steps        100 steps       500 steps
Default number of steps   20              20              20              20              20
Histogram tuning factor   1               20              20              20              20
Requested steps           20              20              40              100             500
Actual steps              20              195             509             1550            7580
(Index statistics for combined city,state)
Range cell density        0.00328457      0.00121356      0.00022722      0.00010744      0.00003560
Total density             0.00328457      0.00328457      0.00328457      0.00328457      0.00328457
Unique range values       0.00011547      0.00008212      0.00006416      0.00004897      0.00002615
Unique total values       0.00011547      0.00011547      0.00011547      0.00011547      0.00011547
Impact on estimates for Washington DC & San Francisco CA
DC Cell                   <= Washington   <= Washington   = Washington    = Washington    = Washington
DC Selectivity            0.05184000      0.02155000      0.02063000      0.02063000      0.02063000
DC Row Estimates          5184            2155            2063            2063            2063
SF Cell                   <= Somerset     <= San Jacint   = San Franci    = San Franci    = San Franci
SF Selectivity            0.04875000      0.00678000      0.00634000      0.00634000      0.00634000
SF Row Estimates          4875            678             634             634             634
Statistics from an index on {city,state} for a 100,000 row table with ~6,200 distinct city names
Column Densities
Single column densities
q Range cell density/unique range values
ü Tells maximum uniqueness…
ü Min(weight)!=0 from range cells
q Total density
ü Relative skewness of the data
ü Total density approaching 1.0 is extremely skewed
ü Sum(weights^2)
q Unique total values
ü The number of distinct values in column
ü 1.0/select count(distinct column)
Multiple column densities
q Automatically created on index
Statistics for column: "type"
Last update of column statistics: Feb 15 2015 9:18:32:850PM
Range cell density: 0.0053191489361702
Total density: 0.4216274332277049
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0053191489361702
Unique total values: 0.2000000000000000
Average column width: default used (2.00)
Rows scanned: 188.0000000000000000
Statistics version: 4
Statistics for column group: "sample_date", "air_temp", "weather"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051075008894
Total density: 0.0000051075008894
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016297687032
Unique total values: 0.0000016297687032
Average column width: 8.5268955638740458
Rows scanned: 168066824.0000000000000000
Statistics version: 4
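The density formulas on this slide can be checked against the "type" histogram shown earlier. This is a sketch that takes the slide's formulas at face value (the optimizer's real computation is not documented here); the helper names are hypothetical, and the distinct count of 5 is an assumption read off the histogram (four frequency cells plus one value in the range cell).

```python
# (operator, weight) pairs copied from the "type" histogram.
TYPE_HISTOGRAM = [
    ("<=", 0.00000000), ("<", 0.00531915), ("=", 0.10638298),
    ("<", 0.00000000), ("=", 0.30319148), ("<", 0.00000000),
    ("=", 0.56382978), ("<", 0.00000000), ("=", 0.02127660),
]


def total_density(hist):
    """Sum(weights^2): approaches 1.0 as one value dominates the column."""
    return sum(weight * weight for _, weight in hist)


def range_cell_density(hist):
    """Smallest non-zero weight among the range cells (steps not marked '=')."""
    return min(weight for op, weight in hist if op != "=" and weight > 0)


def unique_total_values(distinct_count):
    """1.0 / count(distinct column)."""
    return 1.0 / distinct_count


td = total_density(TYPE_HISTOGRAM)
rcd = range_cell_density(TYPE_HISTOGRAM)
utv = unique_total_values(5)  # assumed distinct count for "type"
```

The results land on the optdiag figures for "type": total density of about 0.4216, range cell density of about 0.0053, and unique total values of 0.2.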
Using Column Densities
If the column value is known and…
q …value falls in a histogram cell….estimate will be that cell’s weight
ü Whether range or frequency cell
If the column value is not known
q Optimized with a literal placeholder (0, ‘’, Jan 1 1900, etc.)
q Selectivity is total density
Column Selectivity vs. Density (1)
Statistics for column: "id"
Last update of column statistics: Feb 16 2015 4:47:23:956PM
Range cell density: 0.0092592412744228
Total density: 0.0113194187537711
Unique range values: 0.0041383133267069
Unique total values: 0.0055248618784530
Step Weight Value
1 0.00000000 < 1
2 0.01093356 = 1
3 0.01387721 <= 2
4 0.01261564 <= 3
5 0.00714886 <= 4
6 0.00294365 <= 5
7 0.00462574 <= 6
8 0.00210261 <= 8
9 0.00336417 <= 9
10 0.00336417 <= 11
11 0.00378469 <= 12
12 0.00925147 <= 13
13 0.00210261 <= 15
14 0.01808242 <= 16
15 0.00252313 <= 17
16 0.00252313 <= 18
17 0.00168209 <= 19
18 0.00000000 < 21
19 0.00630782 = 21
20 0.00252313 <= 22
21 0.01429773 <= 23
22 0.03868797 <= 24
23 0.00378469 <= 25
1> declare @id int
2> select @id=8
3> select * from syscolumns where id=@id
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 0
Estimated selectivity for id,
selectivity = 0.01131942,
scan selectivity 0.01131942, filter selectivity 0.01131942
26.91758 rows, 1 pages
Value unknown (range cell unknown): selectivity = total density
1> select * from syscolumns where id=8
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 8
Estimated selectivity for id,
selectivity = 0.002102607,
scan selectivity 0.002102607, filter selectivity 0.002102607
5 rows, 1 pages
Weight < range cell density: selectivity = weight
Column Selectivity vs. Density (2)
Statistics for column: "id"
Last update of column statistics: Feb 16 2015 4:47:23:956PM
Range cell density: 0.0092592412744228
Total density: 0.0113194187537711
Unique range values: 0.0041383133267069
Unique total values: 0.0055248618784530
Step Weight Value
1 0.00000000 < 1
2 0.01093356 = 1
3 0.01387721 <= 2
4 0.01261564 <= 3
5 0.00714886 <= 4
6 0.00294365 <= 5
7 0.00462574 <= 6
8 0.00210261 <= 8
9 0.00336417 <= 9
10 0.00336417 <= 11
11 0.00378469 <= 12
12 0.00925147 <= 13
13 0.00210261 <= 15
14 0.01808242 <= 16
15 0.00252313 <= 17
16 0.00252313 <= 18
17 0.00168209 <= 19
18 0.00000000 < 21
19 0.00630782 = 21
20 0.00252313 <= 22
21 0.01429773 <= 23
22 0.03868797 <= 24
23 0.00378469 <= 25
1> select * from syscolumns where id=21
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 21
Estimated selectivity for id,
selectivity = 0.006307822,
scan selectivity 0.006307822, filter selectivity 0.006307822
15 rows, 1 pages
Frequency cell: selectivity = weight
1> select * from syscolumns where id=24
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id = 24
Estimated selectivity for id,
selectivity = 0.03868797,
scan selectivity 0.03868797, filter selectivity 0.03868797
92 rows, 1 pages
Weight > range cell density: selectivity = weight
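The equality cases on these two slides follow one rule: a known value takes the weight of the cell it lands in, and an unknown value falls back to total density. The sketch below reproduces the four estimates from the "id" histogram. It is a simplified model, not ASE's costing code: `equality_selectivity` is a hypothetical helper, and the table row count of 2378 is an assumption inferred from 26.91758 rows / 0.01131942.

```python
# (operator, weight, boundary) for the 23 steps of the "id" histogram.
ID_HISTOGRAM = [
    ("<", 0.00000000, 1), ("=", 0.01093356, 1), ("<=", 0.01387721, 2),
    ("<=", 0.01261564, 3), ("<=", 0.00714886, 4), ("<=", 0.00294365, 5),
    ("<=", 0.00462574, 6), ("<=", 0.00210261, 8), ("<=", 0.00336417, 9),
    ("<=", 0.00336417, 11), ("<=", 0.00378469, 12), ("<=", 0.00925147, 13),
    ("<=", 0.00210261, 15), ("<=", 0.01808242, 16), ("<=", 0.00252313, 17),
    ("<=", 0.00252313, 18), ("<=", 0.00168209, 19), ("<", 0.00000000, 21),
    ("=", 0.00630782, 21), ("<=", 0.00252313, 22), ("<=", 0.01429773, 23),
    ("<=", 0.03868797, 24), ("<=", 0.00378469, 25),
]
TOTAL_DENSITY = 0.0113194187537711
TABLE_ROWS = 2378  # assumption: inferred from the 26.91758-row estimate


def equality_selectivity(hist, value=None):
    """Weight of the first cell that covers value; total density if unknown."""
    if value is None:  # e.g. a local @variable at optimization time
        return TOTAL_DENSITY
    for op, weight, bound in hist:
        if ((op == "=" and value == bound)
                or (op == "<=" and value <= bound)
                or (op == "<" and value < bound)):
            return weight
    return 0.0  # beyond the last step; out-of-range handling not modelled


rows_unknown = equality_selectivity(ID_HISTOGRAM) * TABLE_ROWS
rows_id_8 = equality_selectivity(ID_HISTOGRAM, 8) * TABLE_ROWS
rows_id_21 = equality_selectivity(ID_HISTOGRAM, 21) * TABLE_ROWS
rows_id_24 = equality_selectivity(ID_HISTOGRAM, 24) * TABLE_ROWS
```

The gap between about 27 rows (unknown value) and 5, 15 or 92 rows (known values) is the cost of hiding a value from the optimizer.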
Column Selectivity vs. Density (3)
Statistics for column: "id"
Last update of column statistics: Feb 16 2015 4:47:23:956PM
Range cell density: 0.0092592412744228
Total density: 0.0113194187537711
Unique range values: 0.0041383133267069
Unique total values: 0.0055248618784530
Step Weight Value
1 0.00000000 < 1
2 0.01093356 = 1
3 0.01387721 <= 2
4 0.01261564 <= 3
5 0.00714886 <= 4
6 0.00294365 <= 5
7 0.00462574 <= 6
8 0.00210261 <= 8
9 0.00336417 <= 9
10 0.00336417 <= 11
11 0.00378469 <= 12
12 0.00925147 <= 13
13 0.00210261 <= 15
14 0.01808242 <= 16
15 0.00252313 <= 17
16 0.00252313 <= 18
17 0.00168209 <= 19
18 0.00000000 < 21
19 0.00630782 = 21
20 0.00252313 <= 22
21 0.01429773 <= 23
22 0.03868797 <= 24
23 0.00378469 <= 25
1> select * from syscolumns where id between 5 and 10
Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
id >= 5
id <= 10
Estimated selectivity for id,
selectivity = 0.01471826,
scan selectivity 0.01471826, filter selectivity 0.01471826
35.00002 rows, 1 pages
Range query
Note that the sum of steps 6–10 is 0.01640034. However, since
we are only using a portion of step 10 and the distribution is 2 values
per step, we use the formula:
Sum(step6..step9) + step10/2.0 = 0.01471826
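The range arithmetic can be replayed with the step weights from the histogram. This is a small sketch of the formula on the slide; `between_selectivity` is a hypothetical helper, and the row count of 2378 is an assumption inferred from the earlier 26.91758-row estimate.

```python
# Weights of steps 6..10 from the "id" histogram.
STEP_WEIGHTS = {
    6: 0.00294365,
    7: 0.00462574,
    8: 0.00210261,
    9: 0.00336417,
    10: 0.00336417,
}
TABLE_ROWS = 2378  # assumption: inferred from 26.91758 rows / 0.01131942


def between_selectivity(weights, full_steps, partial_step, fraction):
    """Whole weight of fully covered steps plus a fraction of the last step."""
    covered = sum(weights[step] for step in full_steps)
    return covered + weights[partial_step] * fraction


all_steps = sum(STEP_WEIGHTS.values())
# id between 5 and 10: steps 6..9 in full, half of step 10 (the cell for 10..11).
sel = between_selectivity(STEP_WEIGHTS, range(6, 10), 10, 0.5)
est_rows = sel * TABLE_ROWS
```

The result is about 0.01471826, or roughly 35 rows, matching the optimizer output above.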
Debugging Selectivity
You’ve probably noticed….
q You need to have ‘set option show’ and optdiag output
Find the index you thought it should have used
q Look at the selectivity for each predicate
q Check out the optdiag to see if it was a really skewed value
But sometimes you just have to look at the query
q …your expectation may be due to knowledge you infer
ü But optimizer doesn’t know
ü ….such as the relationship between two columns
q …and sometimes the indexing doesn’t support the query
Unbounded Date Range
create table jobs (
job_number numeric(30,0),
…
job_category varchar(20), -- 10 distinct values
job_priority tinyint, -- 100 distinct values
job_begindate datetime,
job_enddate datetime,
job_status char(1), -- 6 distinct values
…,
primary key (job_number)
)
Consider the above table for each of the scenarios on the following slides. Note the key
columns of job dates and those that have some distinct values listed.
Scenario #1
Consider the index:
 create index job_begin_idx on jobs (job_begindate)
…and the typical query
 Select * from jobs
 Where job_begindate >= $begin_date
 and job_enddate <= $end_date
Why is LIO sometimes high and sometimes low?
Scenario #1: The Problems
Because the index only has begin date
q On very recent dates, it can go near the end of the index and scan to the end…
q But on dates in the past – even a few months ago
ü It positions to the $begin_date
ü Scans to end of index
ü For each leaf row, it does a LIO to the data page to compare $end_date
ü Some quick math….assume 50 rows per index leaf page
 100 leaf pages = 5000 data page LIO’s ≈ 1 sec CPU (@5LIO/ms)
 1000 leaf pages = 50000 data page LIO’s ≈ 10 sec CPU
 10000 leaf pages = 500000 data page LIO’s ≈ 100 sec CPU
 100000 leaf pages = 5000000 data page LIO’s ≈ 1000 sec CPU (16m40s)
Soooo….
q For dates not very recent, we get an index leaf scan to end of index
q Plus a datapage lookup for every leaf row
[Figure: index on job_begindate spanning 2010–2014, with scan start points marked for > 01Mar2011, > 01Nov2012, and > 01Jan2014]
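The quick math above is easy to rerun for other leaf-page counts. The sketch uses the slide's own assumptions (50 rows per index leaf page, 5 LIO per millisecond); the function name is hypothetical.

```python
def data_page_lio_cost(leaf_pages, rows_per_leaf=50, lio_per_ms=5):
    """Return (data page LIOs, CPU seconds) for an unbounded leaf scan.

    Every leaf row costs one data page LIO because job_enddate is not
    in the index and has to be checked on the data page.
    """
    lios = leaf_pages * rows_per_leaf
    cpu_seconds = lios / lio_per_ms / 1000.0
    return lios, cpu_seconds


small = data_page_lio_cost(100)
large = data_page_lio_cost(100000)
minutes, seconds = divmod(int(large[1]), 60)
```

For 100,000 leaf pages this gives 5,000,000 LIOs and 1000 seconds, the 16m40s shown on the slide.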
Scenario #1: The Solutions
Solution #1: Add job_enddate to index
 create index job_date_idx
 on jobs (job_begindate, job_enddate)
Solution #2: Add implied boundary to date query
 Select * from jobs
 Where job_begindate between $begin_date and $end_date
 and job_enddate between $begin_date and $end_date
Why both???
q Wouldn’t fixing the index be enough – why bother the coders and try to teach them better coding style???
Scenario #2
Consider the index:
 create index job_begin_idx
 on jobs (job_category, job_begindate)
…and the typical query
 Select * from jobs
 Where job_begindate >= $begin_date
 and job_enddate <= $end_date
Why does it sometimes use the index and other times not?
Scenario #2: The Problem
The problem is we are missing a predicate on leading index columns
q A similar situation occurs when we have intermediate index keys for which we have no valid SARGs
To handle this, ASE does a bit of a trick
q It looks at cardinality of unknown keys
ü If low, it considers an ORScan for each value
ü If high, it considers an index leaf scan
q Then it considers the selectivity of the known predicates
Sooo…as a result
q If we pick a date that is fairly recent (index is more selective), then we will likely do an ORScan and then an index leaf scan from the begin date until the next job_category
q If we pick a date that isn’t very selective, then the ORScan becomes too expensive due to a leaf scan per ORScan value and we compare the multiple index leaf scans vs. a single table scan
Scenario #2: The Solution
Solution: Add implied boundary to date query
 Select * from jobs
 Where job_begindate between $begin_date and $end_date
 and job_enddate between $begin_date and $end_date
…and this is why we fix both the index and the query
q In the above case, considering the index in scenario #2, as long as the range is fairly selective, we likely will do the ORScan
OrScan in Lava Tree
==================== Lava Operator Tree ====================
Emit
(VA = 4)
r:5 er:1
cpu: 0
/
NestLoopJoin
Inner Join
(VA = 3)
r:5 er:1
l:0 el:8
p:0 ep:8
/                   \
OrScan              Restrict
Max Rows: 2         (0)(0)(0)(4)(0)
(VA = 0)            (VA = 2)
r:2 er:-1           r:5 er:1
l:0 el:-1
p:0 ep:-1
/
IndexScan
TBTCO~7
(VA = 1)
r:9 er:1
l:28 el:8
p:0 ep:8
============================================================
OrScan in Show Plan
|ROOT:EMIT Operator (VA = 6)
|
| |NESTED LOOP JOIN Operator (VA = 5) (Join Type: Inner Join)
| |
| | |NESTED LOOP JOIN Operator (VA = 3) (Join Type: Inner Join)
| | |
| | | |SCAN Operator (VA = 0)
| | | | FROM OR List
| | | | OR List has up to 12 rows of OR/IN values.
| | |
| | | |RESTRICT Operator (VA = 2)(0)(0)(0)(13)(0)
| | | |
| | | | |SCAN Operator (VA = 1)
| | | | | FROM TABLE
| | | | | SAPSR3.MSEG
| | | | | T_01
| | | | | Index : MSEG~1
| | | | | Forward Scan.
| | | | | Positioning by key.
| | | | | Keys are:
| | | | | MANDT ASC
| | | | | MATNR ASC
| | | | | Using I/O Size 128 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | | Using I/O Size 128 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| |
Scenario #3
Consider the following index
 create index job_begin_idx
 on jobs (job_category, job_status, job_begindate, job_enddate)
…and the typical query
 Select * from jobs
 Where job_category = 'night batch'
 and job_status in ('U', 'A', 'E')
 and job_begindate >= $begin_date
 and job_enddate <= $end_date
Why might we only position by job_category, job_status?
Scenario #3: The Problem
The problem is we don’t have multi-density stats
q And creating them might be a bit of a nightmare
As a result, ASE does the following
q It weighs each selectivity individually:
ü ‘night batch’ + ‘U’ + $begin_date
ü ‘night batch’ + ‘A’ + $begin_date
ü ‘night batch’ + ‘E’ + $begin_date
q Then aggregates
Here’s the problem….assume we only have 20 steps
q Let’s pick a begin date 3 or more steps from the end
ü …and assume end_date is in the same step
ü …but remember, we have an unbounded range on both ….so
 …effectively it will think it will be 3 steps for each $begin_date….not 1
 …and it will think $end_date is atrocious as it is 17 steps’ worth (from beginning)
q If we aggregate, then we will have 3x….so 9 steps….40% of table is 8 steps….we might table scan or look for a different index
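The step arithmetic above can be written out as a back-of-the-envelope check. This only encodes the slide's own numbers (20 steps, 3 steps per IN() value, a 40% point at which a table scan starts to look attractive); the function name is hypothetical and it is not a model of how ASE actually aggregates selectivities.

```python
def aggregated_fraction(total_steps, steps_per_value, in_list_values):
    """Fraction of the table the estimate covers after aggregating
    one open-ended range estimate per IN() value."""
    return steps_per_value * in_list_values / total_steps


TABLESCAN_POINT = 0.40  # the slide's "40% of table is 8 steps" with 20 steps

# Unbounded range: 3 steps to end of histogram, repeated for 'U', 'A', 'E'.
unbounded = aggregated_fraction(20, 3, 3)
# Bounded range that stays inside a single step, same three IN() values.
bounded = aggregated_fraction(20, 1, 3)

scan_risk_unbounded = unbounded > TABLESCAN_POINT
scan_risk_bounded = bounded > TABLESCAN_POINT
```

The unbounded form comes out at 45% of the table, past the 40% point, while the bounded form comes out at 15%.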
Scenario #3: The Solution
Update column stats for distinctive columns
q Use 100 steps or similar large value
ü update statistics jobs (job_begindate) using 100 values
q Result is that each step has a much lower selectivity value
Add the bounded range into the query
q This means we aggregate only across the exact range of dates we want…which reduces the impact of the IN() clause
ASE’s OR Strategy
If the query contains an OR clause on different columns
q ASE will (and can) use two different indexes
ü One index for predicates on one side of OR
ü …and a different index for predicates on other side of OR
ü This would be similar to splitting the query in two with union
q However, if one side of OR drives a tablescan – ASE will tablescan
ü Remember, we saw this with the id=8 OR 1=2 example
Common issues
q One side of OR not indexed well….drives tablescan
q Developer attempted to use 1 index to cover both
An Example of Indexing vs. OR
Consider the following query:
 SELECT "VBELV" ,"POSNV" ,"VBELN" ,"POSNN" ,"VBTYP_N" ,"RFMNG" ,"MEINS" ,"VBTYP_V"
 ,"ERDAT" ,"ERZET" ,"AEDAT" ,"STUFE" ,"VRKME"
 FROM "VBFA"
 WHERE "MANDT" = ? AND ( "ERDAT" = ? OR "AEDAT" = ? )
 /* R3:SAPLZFEDWS1:767 T:VBFA M:430 */
Now, consider the indexes:
 index_name   index_keys
 -----------  --------------------------------------------
 VBFA~0       MANDT, VBELV, POSNV, VBELN, POSNN, VBTYP_N
 VBFA~Z01     MANDT, VBELN
 VBFA~Z02     ERDAT, BWART
 VBFA~Z04     MANDT, ERDAT, AEDAT
 VBFA~Z99     MANDT, LOGSYS
Issue is that the query seems to drive a tablescan….
q …it seems obvious that VBFA~Z04 should be used…..
q ….or is it???
Let’s look a little closer
Looking at systabstats
 ColumnName  ColumnID  Row_Count   RequestedSteps  ActualSteps  ApproxDistincts  DistinctsPerStep
 ----------  --------  ----------  --------------  -----------  ---------------  ----------------
 AEDAT       22        1255008198  50              50           1625             33.0
 BWART       17        1255008198  50              29           64               2.0
 ERDAT       14        1255008198  50              245          4674             19.0
 LOGSYS      38        1255008198  50              2            1                1.0
 MANDT       1         1255008198  50              2            1                1.0
 POSNN       5         1255008198  50              573          93300            163.0
 POSNV       3         1255008198  50              231          12649            55.0
 VBELN       4         1255008198  50              38           85330918         2245550.0
 VBELV       2         1255008198  50              38           31223216         821664.0
 VBTYP_N     6         1255008198  50              31           25               1.0
Hmmmm….not very good query criteria
q MANDT is useless as always
q AEDAT and ERDAT are not very distinct….1625 and 4674 values respectively
ü Which means each distinct value will return ~250K to ~1M rows
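The "~250K to ~1M" figure is just the row count divided by the approximate distinct count. A minimal sketch with the numbers from the table above (the helper name is hypothetical; the average assumes values are spread evenly, which real data rarely is):

```python
ROW_COUNT = 1255008198
APPROX_DISTINCTS = {"AEDAT": 1625, "ERDAT": 4674}


def avg_rows_per_value(row_count, distinct_count):
    """Average rows returned per distinct value, assuming no skew."""
    return row_count // distinct_count


rows_per_value = {
    column: avg_rows_per_value(ROW_COUNT, distincts)
    for column, distincts in APPROX_DISTINCTS.items()
}
```

That works out to about 772K rows per AEDAT value and about 269K per ERDAT value, so neither predicate is selective on its own.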
AEDAT Stats….from optdiag
Statistics for column: AEDAT
Last update of column statistics: Jan 10 2014 7:21:35:026PM
Range cell density: 0.0000017268359901
Total density: 0.9986527756879466
…
Unique range values: 0.0000004149259654
Unique total values: 0.0006153846153846
…
Histogram for column: AEDAT
Column datatype: varchar(24)
…
Statistics step count sticky
Statistics hashing sticky
Statistics hashing low domain used
Step Weight Value (only 255 bytes used)
1 0.00000000 < '00000000'
2 0.99932617 = '00000000'
3 0.00001720 <= '20080724'
4 0.00001430 <= '20080826'
5 0.00001409 <= '20081030'
6 0.00001545 <= '20081113'
7 0.00001415 <= '20081216'
8 0.00001419 <= '20090310'
9 0.00001468 <= '20090331'
10 0.00002772 <= '20090615'
…
OUCH!!!!!
ERDAT Stats….from optdiag
Statistics for column: ERDAT
Last update of column statistics: Jan 10 2014 7:21:35:026PM
Range cell density: 0.0005738551548958
Total density: 0.0006834762135235
…
Unique range values: 0.0001879716956084
Unique total values: 0.0002139495079161
…
Requested step count: 50
Actual step count: 245
…
Statistics step count sticky
Statistics hashing sticky
Statistics hashing low domain used
Step Weight Value (only 255 bytes used)
1 0.00000000 < '00000000'
2 0.00004201 = '00000000'
3 0.01879592 <= '20030624'
4 0.01879998 <= '20040316'
5 0.01888011 <= '20041015'
6 0.01887963 <= '20050502'
7 0.01878721 <= '20051031'
8 0.01888958 <= '20060420'
9 0.01879898 <= '20061014'
10 0.01882141 <= '20070417'
BETTER!!!!
To understand, let’s simplify things
Assume we have a table of customer transactions…
q with 1 billion rows
q PKEY is transaction_id (not that it matters…..)
q Has an index (IDX~1) on {purchase_date, ship_date}
ü Both purchase_date and ship_date are not very distinct
ü think about it …only 365 in a year….~3600 in 10 years…not very distinctive out of 1 billion row table
Now consider the query:
 Select * from cust_transactions
 where purchase_date='Jan 1 2014' OR ship_date='Jan 1 2014'
See the problem?.... Think about it….
The Problem
The problem query:

 select * from cust_transactions
 where purchase_date = 'Jan 1 2014' or ship_date = 'Jan 1 2014'

The problems…
q We can use the index IDX~1 for the purchase_date case… depending, of course, on the selectivity of the value provided
q …but the OR clause means that we also need to look for the ship date
ü individually and not in combination with purchase date – remember, a composite index works on COMBINING columns
q …using IDX~1 for that is sort of useless, as we can’t use the leading purchase_date column because the OR clause is disjunctive… the query really could be expressed as:

 select * from cust_transactions where purchase_date = 'Jan 1 2014'
 union
 select * from cust_transactions where ship_date = 'Jan 1 2014'
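The equivalence of the OR form and the UNION form can be checked on a toy table. The sketch below is only an illustration: it uses Python's built-in sqlite3 module rather than ASE, the four rows are invented, and the dates are stored as ISO text. It relies on transaction_id being unique, so UNION's duplicate removal only collapses rows that match both predicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table cust_transactions ("
    " transaction_id integer primary key,"
    " purchase_date text,"
    " ship_date text)"
)
# Invented sample rows; row 3 matches both predicates.
conn.executemany(
    "insert into cust_transactions values (?, ?, ?)",
    [
        (1, "2014-01-01", "2014-01-03"),
        (2, "2013-12-30", "2014-01-01"),
        (3, "2014-01-01", "2014-01-01"),
        (4, "2014-02-10", "2014-02-12"),
    ],
)

day = "2014-01-01"
or_rows = conn.execute(
    "select * from cust_transactions"
    " where purchase_date = ? or ship_date = ?"
    " order by 1",
    (day, day),
).fetchall()

union_rows = conn.execute(
    "select * from cust_transactions where purchase_date = ?"
    " union"
    " select * from cust_transactions where ship_date = ?"
    " order by 1",
    (day, day),
).fetchall()

print("OR form   :", [r[0] for r in or_rows])
print("UNION form:", [r[0] for r in union_rows])
```

Both forms return the same set of rows; what differs is how an optimizer can satisfy each branch, which is the point of the slide.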
Remember special OR strategy???
When an OR condition exists:
q ASE can use multiple indexes – a different index for each side of the OR
q This ‘special OR strategy’ is also known as ‘index union’
When looking at the query & index
q ASE says the index is probably okay for purchase_date…
q …but says it will need to tablescan for ship_date
q Why the tablescan?
ü Remember, this is a DOL table and the index keys are sorted by purchase_date, then ship_date
ü …so we would have to scan ALL the leaf pages to find that ship_date
ü …only to find out that 1/4000th of the table qualifies
ü …and the qualifying rows are scattered around due to purchase date, so… LIO exceeds the cost of a tablescan, so we do a tablescan
ü …especially if we have an OR value of ‘00000000’… which is 99% of the table.
What about IN()???
If you were watching closely… you already know the answer
If you think about it…
q …an IN() is like an OR list…
q …in fact, ASE flattens it into one
So, all we do is:
q Cost each one individually
q Aggregate them into a final cost
A Simple IN() example
1> select * from sysobjects where id in (2,4,6,8,10,12,14,16)
The Lop tree:
( project
( scan sysobjects
)
)
OptBlock0
The Lop tree:
( scan sysobjects
)
Generic Tables: ( Gtt0( sysobjects ) Gti1( csysobjects ) )
Generic Columns: …
Predicates: ( ( { sysobjects.id } = 16 tc:{25} OR{ sysobjects.id } = 14 tc:{25}
OR { sysobjects.id } = 12 tc:{25} OR{ sysobjects.id } = 10 tc:{25}
OR { sysobjects.id } = 8 tc:{25} OR{ sysobjects.id } = 6 tc:{25}
OR { sysobjects.id } = 4 tc:{25} OR{ sysobjects.id } = 2 tc:{25} ) tc:{25} )
Transitive Closures: …)
The IN() clause is expanded to ORs… note that all have the same transitive closure id (tc:{25})
Individual OR term selectivity
BEGIN GENERAL OR ANALYSIS OF all types of indices FOR sysobjects
ANALYZING OR TERM 1
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 16
Estimated selectivity for id,
selectivity = 0.1,
scan selectivity 0.02272727, filter selectivity 0.02272727
restricted selectivity 0.1
unique index with all keys, one row scans
1 rows, 1 pages
…
ANALYZING OR TERM 2
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 14
…
ANALYZING OR TERM 3
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 12
…
ANALYZING OR TERM 4
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
id = 10
…
==================== Lava Operator Tree ====================
                      Emit
                      (VA = 3)
                      r:8 er:5
                      cpu: 0
                 /
              NestLoopJoin
              Inner Join
              (VA = 2)
              r:8 er:5
              l:0 el:5
              p:0 ep:4
      /                      \
  OrScan                  IndexScan
  Max Rows: 8             csysobjects
  (VA = 0)                (VA = 1)
  r:8 er:-1               r:8 er:5
  l:0 el:-1               l:12 el:5
  p:0 ep:-1               p:0 ep:4
============================================================
Aggregating Selectivity for OR
END GENERAL OR ANALYSIS FOR all types of indices - INDICES FOUND FOR ALL OR TERMS
Scan on table sysobjects skipped because table scan less than concurrency threshold
Estimating selectivity of index 'sysobjects.csysobjects', indid 3
Estimated selectivity for id,
selectivity = 0.8,
scan selectivity 0.8, filter selectivity 0.8
restricted selectivity 1
special or terms 8
35.2 rows, 1 pages
Data Row Cluster Ratio 0.99999
Index Page Cluster Ratio 1
Data Page Cluster Ratio 1
using no index prefetch (size 4K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'csysobjects' on table 'sysobjects' = 1.600336
Whoa! The prediction is 80% of the table (which had 44 rows)… thankfully, in *this* case, it was still only 1 page
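The numbers in this trace can be reproduced with simple arithmetic. The sketch below assumes the aggregation is a plain sum of the per-term selectivities, capped at 1.0; that assumption matches the traced values (0.8 and 35.2 rows) but is not a statement of ASE's actual formula. The variable names are illustrative.

```python
per_term_selectivity = 0.1   # "selectivity = 0.1" reported for each OR term
or_terms = 8                 # "special or terms 8"
table_rows = 44              # row count of sysobjects quoted on the slide

# Assumption: naive sum of the per-term selectivities, capped at 1.0.
aggregated = min(1.0, per_term_selectivity * or_terms)
estimated_rows = aggregated * table_rows

# The index is unique on id, so 8 values can match at most 8 rows.
best_case_selectivity = or_terms / table_rows

print(f"aggregated selectivity : {aggregated:.2f}")
print(f"estimated rows         : {estimated_rows:.1f}")
print(f"at most 8 rows means   : {best_case_selectivity:.3f}")
```

The gap between 0.8 and roughly 0.18 is the over-inflation that comes from adding per-term estimates without checking whether they fall in the same range cell.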
Aggregating IN()
Aggregation is unintelligent
q It doesn’t check how many values come from the same range cell
The result is that the aggregated value is often over-inflated

TIP: Make sure you have more histogram steps than values in your largest IN() list
q For SAP systems, this will be 100
Out of range histograms
Originally added to ASE 15.0 for monotonic sequences
q For example, sequential numbers, datetime (e.g. current datetime)
q Oftentimes, if stats were only updated every week, a large portion of the new data values were higher than the histogram range
ü As a result, the optimizer would estimate 0 rows and select an index based on that reduced cost estimate, whereas in reality there could be millions of rows
q With out of range histograms, several factors are used to estimate how many data values exist beyond the last histogram cell, and the cost is adjusted higher
Usually in such cases, out of range histograms are a sign of stale stats
q …but for high insert/append use cases, you may be updating or re-reading a row that was just inserted – e.g. reporting on today’s sales
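To see why a value beyond the last histogram step yields an estimate of zero when no adjustment is applied, consider the simplified lookup below. Everything in it is hypothetical: the boundaries, weights, distinct-values-per-step figure and row count are invented, and real ASE histograms (range and frequency cells, plus the undisclosed out-of-range factors) are more involved.

```python
from bisect import bisect_left

# Hypothetical histogram: upper boundary of each step and its weight.
boundaries = [100, 200, 300, 400]
weights = [0.25, 0.25, 0.25, 0.25]

def equality_selectivity(value, distinct_per_step=100):
    """Crude estimate for `col = value` with NO out-of-range adjustment."""
    step = bisect_left(boundaries, value)
    if step == len(boundaries):
        # The value lies beyond the last step: no weight was recorded there.
        return 0.0
    return weights[step] / distinct_per_step

table_rows = 10_000_000  # hypothetical table size
rows_in_range = equality_selectivity(250) * table_rows
rows_beyond = equality_selectivity(450) * table_rows

print(f"value inside the histogram : {rows_in_range:,.0f} rows")
print(f"value beyond the last step : {rows_beyond:,.0f} rows")
```

Any rows inserted since the last update statistics with values above 400 are invisible to this lookup, which is the situation the out-of-range adjustment was designed to correct.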
Low Cardinality Examples
Histogram tuning may be a bad thing for short duration “STATUS” columns
q Most of the values in the histogram will be “C” for complete
q Unless there is a “permanent” status higher than “U” for unprocessed, it is unlikely that update stats will catch a “U” value
ü During migration, the system is likely quiesced with nothing incomplete
ü Post-migration, if stats are run during a quiet period, likely no incomplete values exist
q The out of range histogram throws off the optimizer… 0 would have been a better estimate
ü Running update stats on weekends or nights when the system is quiet simply causes the same problem… as jobs are likely all complete
q Spotted with ‘set option show on’
May also happen with very low cardinality “TYPE” columns
q Or any very low cardinality column, in reality: when the value in the predicate occurs extremely rarely in a very low cardinality column and is higher than the more common value(s)
Example Histogram
Histogram for column: "ENTRY_TYPE"
…
Out of range Histogram Adjustment is DEFAULT.
Sticky step count.
Sticky partial_hashing.
Step Weight Value
1 0.00000000 < "C"
2 1.00000000 = "C"
Histogram for column: "STATUS"
…
Out of range Histogram Adjustment is DEFAULT.
Low Domain Hashing.
Sticky step count.
Sticky partial_hashing.
Step Weight Value
1 0.00000000 < "C"
2 0.98791176 = "C"
3 0.00084806 < "T"
4 0.01124019 = "T"
Example: ‘set option show’ output
Estimating selectivity of index 'SAPSR3.ESH_EX_CPOINTER.ESH_EX_CPOINTER~ST', indid 3
STATUS = 'U'
ENTRY_TYPE = 'P'
Estimated selectivity for ENTRY_TYPE,
Out of range histogram adjustment,
selectivity = 0.3333333,
Estimated selectivity for STATUS,
Out of range histogram adjustment,
selectivity = 0.2,
scan selectivity 0.2, filter selectivity 0.2
60412.2 rows, 34.2 pages
Data Row Cluster Ratio 0.9924527
Index Page Cluster Ratio 0.218543
Data Page Cluster Ratio 0.02202437
using index prefetch (size 128K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
in index cache 'default data cache' (cacheid 0) with LRU replacement
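A quick calculation shows how far the adjusted estimate is from what the histogram itself recorded. The sketch assumes that estimated rows equal scan selectivity times the table row count; the weights are copied from the STATUS histogram shown earlier, and the variable names are illustrative.

```python
# Values copied from the trace above.
scan_selectivity = 0.2
estimated_rows = 60412.2

# Assumption: estimated rows = scan selectivity * table row count.
implied_table_rows = estimated_rows / scan_selectivity

# Frequency-cell weights from the STATUS histogram: only "C" and "T" were
# present when statistics were gathered, so "U" has no recorded weight.
status_weights = {"C": 0.98791176, "T": 0.01124019}
weight_for_u = status_weights.get("U", 0.0)
rows_from_histogram_alone = weight_for_u * implied_table_rows

print(f"implied table rows          : {implied_table_rows:,.0f}")
print(f"rows with the adjustment    : {estimated_rows:,.1f}")
print(f"rows from the histogram only: {rows_from_histogram_alone:,.0f}")
```

For a status that is almost always empty, the estimate of roughly 60,000 rows is far from the near-zero reality, which is why turning the adjustment off for such columns can help.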
To prevent out of range histograms
Turn off for update statistics
q Turn off for columns – not a whole table or specific index
q Syntax
 update statistics table_name
 [[partition data_partition_name]
 [ (column1, column2, …) | (column1), (column2), …] |
 index_name [partition index_partition_name]]
 [using step values | [out_of_range [on | off| default]]]
 [with consumers = consumers][, sampling=N percent]
 [, no_hashing | partial_hashing | hashing]
 [, max_resource_granularity = N [percent]]
 [, histogram_tuning_factor = int ]
 [, print_progress = int]
q Example
 Update statistics SAPSR3.ESH_EX_CPOINTER (ENTRY_TYPE) out_of_range off
 Update statistics SAPSR3.ESH_EX_CPOINTER (STATUS) out_of_range off
Out of range histogram is “sticky”
q Just like the number of steps, setting this once causes it to be used as the default for all future update statistics commands that do not specify a value.
OPTIMIZATION COSTING (PART 2)
Multi-Column Densities & Joins…
Multi-Column Densities
An underused secret weapon
q Useful any time multiple predicates exist
q Think of it this way:
ü Two sample predicates
 Col_A = ‘5’
 Col_B = ‘GREEN’
ü Assume both have a selectivity of 0.1
 Combination could still be 0.1 if all Col_A=5 and Col_B=‘GREEN’ are same rows
 Combination could be 0.01 (or less) if only a single row had the combination
When does it matter?
q Joins, distinct, subquery (caching), sort estimations, …
q Anyplace where the estimated number of rows returned could change the query plan (and tip costs towards an alternative ‘bad’ plan)
q Especially since we don’t have composite column histograms
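The effect of correlation on a combined selectivity can be shown with two tiny tables. Both tables below are invented for illustration: in each, Col_A = 5 and Col_B = 'GREEN' individually match 10% of the rows, but the combined selectivity differs by a factor of ten.

```python
from itertools import product

def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if predicate(r)) / len(rows)

# Correlated: the same 10 rows carry both col_a = 5 and col_b = 'GREEN'.
correlated = [(5, "GREEN")] * 10 + [(0, "RED")] * 90

# Independent: 10 values of col_a crossed with 10 values of col_b.
colours = ["GREEN", "RED", "BLUE", "BLACK", "WHITE",
           "PINK", "GREY", "GOLD", "CYAN", "TEAL"]
independent = list(product(range(1, 11), colours))

def report(rows):
    sel_a = selectivity(rows, lambda r: r[0] == 5)
    sel_b = selectivity(rows, lambda r: r[1] == "GREEN")
    sel_ab = selectivity(rows, lambda r: r[0] == 5 and r[1] == "GREEN")
    return sel_a, sel_b, sel_ab

for name, rows in (("correlated", correlated), ("independent", independent)):
    sel_a, sel_b, sel_ab = report(rows)
    print(f"{name:12s} A={sel_a:.2f} B={sel_b:.2f} A and B={sel_ab:.2f}")
```

Single-column histograms cannot tell these two tables apart; a multi-column density is what lets the optimizer distinguish them.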
Multi-Column Density (Index)
Statistics for index: "aqi_weather_date_idx" (nonclustered)
Index column list: "sample_date", "air_temp", "weather"
Leaf count: 254345
Data page CR count: 167946797.0000000000000000
Index page CR count: 32018.0000000000000000
Data row CR count: 168066295.0000000000000000
Leaf row size: 6.1150672008890936
Index height: 3
Statistics for column group: "sample_date", "air_temp"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051768562637
Total density: 0.0000051768562637
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016563476210
Unique total values: 0.0000016563476210
Average column width: default used (2.00)
Rows scanned: 168066824.0000000000000000
Statistics version: 4
Statistics for column group: "sample_date", "air_temp", "weather"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051075008894
Total density: 0.0000051075008894
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016297687032
Unique total values: 0.0000016297687032
Average column width: 8.5268955638740458
Rows scanned: 168066824.0000000000000000
Statistics version: 4
This is the cost of a covered query (less any portion of the index not needed)
The ‘weather’ column must not be very distinct, as it doesn’t alter the total density or range density by very much
If the IO cost of the index is ~page count and the IO cost for the table is near the leaf count, it is doing an index scan and then following each leaf… often not a good strategy unless only a few rows qualify
Any NL join using this index would need to traverse the index tree this many times per outer row
(Note: Index cluster ratios removed due to space)
Using a Multi-Column Density
Remember, we don’t have composite histograms
First we consider the selectivity of each of the columns individually
q This gives us an idea of how many rows there could be
q For example, col_A has 2 rows & col_B has 5 rows…
ü Total range is between 2 & 10 rows
ü Probability is likely closer to 2… but depends on reality…
Then we look at the multi-column density
q This is our flavor of reality to temper probability
q We use the above with a proprietary formula to compute the selectivity
ü The more selective each column, the closer to the multi-column density
Example: Multi-Column Density
Statistics for column group: "sample_date", "air_temp", "weather"
Last update of column statistics: May 27 2014 11:45:45:016AM
Range cell density: 0.0000051075008894
Total density: 0.0000051075008894
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)
Unique range values: 0.0000016297687032
Unique total values: 0.0000016297687032
Average column width: 8.5268955638740458
Rows scanned: 168066824.0000000000000000
Statistics version: 4
1> select l.city, l.county, s.sample_date, s.air_temp
2> from aqi_locations l, aqi_samples s
3> where l.location_id=s.location_id
4> and s.sample_date = 'July 1 2000 12:00:00:000PM'
5> and l.state='PA'
6> and s.weather='Overcast'
7> and s.air_temp = 90
Estimating selectivity of index 'aqi_samples.aqi_weather_date_idx', indid 3
sample_date= Jul 1 2000 12:00:00:000PM
weather = 'Overcast'
air_temp = 90
Estimated selectivity for sample_date,
selectivity = 0.0002490077,
Estimated selectivity for air_temp,
selectivity = 0.01104084,
Estimated selectivity for weather,
selectivity = 0.002359544,
scan selectivity 5.11258e-006, filter selectivity 5.11258e-006
859.2551 rows, 1.300359 pages
Data Row Cluster Ratio 3.186365e-006
Index Page Cluster Ratio 0.9989935
Data Page Cluster Ratio 0.0007121012
using no index prefetch (size 4K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'aqi_weather_date_idx' on table 'aqi_samples' = 859.2551
Selectivity based on a single histogram cell for sample_date
Selectivity based on a single histogram cell for air_temp
Selectivity based on a single histogram cell for weather
Selectivity estimate based on the number of values for the above, combined with the multi-column density. Since there are only a few values for each, the selectivity is close to the multi-column density
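The figures in this trace can be cross-checked against the optdiag output. The sketch below only multiplies numbers copied from the slides; the comparison with a naive product of the three single-column selectivities is added for illustration and is not something the trace reports.

```python
rows_scanned = 168_066_824                  # "Rows scanned" in optdiag
multi_column_density = 0.0000051075008894   # {sample_date, air_temp, weather}
scan_selectivity = 5.11258e-06              # from the selectivity trace

sel_sample_date = 0.0002490077
sel_air_temp = 0.01104084
sel_weather = 0.002359544

rows_from_trace = scan_selectivity * rows_scanned
rows_from_density = multi_column_density * rows_scanned

# What treating the three predicates as independent would predict instead.
independent_selectivity = sel_sample_date * sel_air_temp * sel_weather
rows_if_independent = independent_selectivity * rows_scanned

print(f"rows from traced scan selectivity : {rows_from_trace:.1f}")
print(f"rows from multi-column density    : {rows_from_density:.1f}")
print(f"rows if columns were independent  : {rows_if_independent:.2f}")
```

The traced estimate (about 859 rows) sits almost on top of the multi-column density figure and is hundreds of times larger than the independence product, which would suggest about one row.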
Problem – Large Estimates
In some cases, we can’t use multi-column densities
q For example, columns involved may have ranges of values
q The total estimate of rows could then be astronomical
ü Perhaps even higher than the real rowcount
In such cases, we compute a ‘smart’ density
q We know the best case is the most selective column
q We then simply apply a formula to derive a selectivity
ü Some cite sum(cell weight**2)
ü Others use W1*W2 + W1*W2*W3 …
Example: Multi-Column Estimate
1> select l.city, l.county, s.sample_date, s.air_temp
2> from aqi_locations l, aqi_samples s
3> where l.location_id=s.location_id
4> and s.sample_date between 'July 1 2000 00:00:01' and 'July 31 2000 23:59:59'
5> and l.state='PA'
6> and s.weather='Overcast'
7> and s.air_temp < 85
Estimating selectivity of index 'aqi_samples.aqi_weather_date_idx', indid 3
sample_date>= Jul 1 2000 12:00:01:000AM
sample_date <= Jul 31 2000 11:59:59:000PM
weather = 'Overcast'
air_temp < 85
Estimated selectivity for sample_date,
selectivity = 0.007751161,
Estimated selectivity for air_temp,
selectivity = 0.7523476,
Estimated selectivity for weather,
selectivity = 0.002359544,
Intelligent Scan selectivity reduction from 0.007751161 to 0.005852389
scan selectivity 0.005852389, filter selectivity 1.375984e-005
restricted selectivity 0.007751161
983592.5 rows, 1488.526 pages
Data Row Cluster Ratio 3.186365e-006
Index Page Cluster Ratio 0.9989935
Data Page Cluster Ratio 0.0007121012
using index prefetch (size 32K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
in index cache 'default data cache' (cacheid 0) with LRU replacement
using no table prefetch (size 4K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for 'aqi_weather_date_idx' on table 'aqi_samples' = 2312.572
Selectivity based on aggregating all the dates in the range
Selectivity based on all temps in the unbounded range
Selectivity based on single cell density for weather
The worst case projection is the most selective of the above
A better estimate: we use a formula to derive a new value we think is more accurate for the scan selectivity (estimate of index rows & leaf pages)… loosely it is sum(W1*W2…) – e.g. W1*W2+W1*W2*W3
The filter selectivity (estimate of data pages) is the product of the weights (e.g. W1*W2*W3 or 0.007751161 * 0.7523476 * 0.002359544 = 0.0000137598)
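The arithmetic behind these annotations can be replayed directly from the traced values. In the sketch below, the filter selectivity is the exact product of the three weights, while the "loose" scan-selectivity formula only approximates the traced figure; the real formula is not disclosed. The row count is taken from the optdiag output shown earlier.

```python
rows_scanned = 168_066_824   # "Rows scanned" from the optdiag output

w1 = 0.007751161   # sample_date range
w2 = 0.7523476     # air_temp < 85
w3 = 0.002359544   # weather = 'Overcast'

# Filter selectivity: product of the weights.
filter_selectivity = w1 * w2 * w3

# Scan selectivity per the loose formula quoted on the slide.
loose_scan_selectivity = w1 * w2 + w1 * w2 * w3

# Scan selectivity actually reported by the trace.
traced_scan_selectivity = 0.005852389
index_rows = traced_scan_selectivity * rows_scanned

print(f"filter selectivity      : {filter_selectivity:.6e}")
print(f"loose scan selectivity  : {loose_scan_selectivity:.9f}")
print(f"traced scan selectivity : {traced_scan_selectivity:.9f}")
print(f"index rows              : {index_rows:,.1f}")
```

The product matches the traced filter selectivity (1.375984e-005), and the traced scan selectivity times the row count gives the 983,592 rows reported; the loose formula lands within about 0.1% of the traced scan selectivity.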
When to create (multi-)column stats
Okay – we know they are automatically created for index keys
q …and used for joins
When do/ought we create our own?
q On the 2nd through nth index keys (or a subset)
ü ASE creates stats on {A}, {A,B}, {A,B,C}, {A,B,C,D}
ü Might be useful to have {B,C,D} or {B,C}
 Help trip ORScans if leading column frequently not a predicate
 Help with joins when leading column is specified as literal/lateral join (ala SAP)
q On low cardinality columns we don’t want to index
ü …but frequently used as predicates (such as gender)
ü Especially if often used in queries with joins (help inner/outer table decision)
Not automatically maintained with ‘update index stats’
q You need to manually run update stats on each column density you create
Joins
Traditional Logic @ Driving Table
q Put the table that seems to ‘drive’ the join as the outer table
q Typically, this will be the ‘smaller’ table (or smaller rowset)
q The developer may know the driving table (e.g. #temp)
q …but optimizer has to figure it out
ü Estimate rowsets from each table using index selectivity
ü Estimate joined rows from joining with each table in list
 Reducing joined rows by applying index selectivity as filter
 But remember, this is a guess at optimization time
Alternative Logic: Pin smaller in cache
q Put the larger rowset table as outer and scan once
q Inner (smaller) table can be pinned in cache
ü Avoid higher PIO
In both cases, the multi-column stats on join columns are key to rowset estimates
Join Strategies
Remember, we have 3 types of joins
q Nested Loop Joins
q Merge Joins (including Sort Merge Joins)
q Hash Joins
Optimizer needs to figure out which one is best
q For indexed joins, typically an NLJ will be best…
ü …but this assumes the M:N ratio is reasonably small (e.g. 1:10)
q A merge join is great for high cardinality joins
ü M:N is high, e.g. 1:1000+
ü Especially if the inner table is sorted in join key sequence
q A hash join works best when join keys are not predicates but predicates eliminate a lot of rows on both sides of the join
ü Outer table is filtered by predicates and join keys hashed into build table
ü Inner table is filtered by predicates, join key hashed and probed for in build table
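The build and probe phases described for the hash join can be sketched in a few lines. This is a generic textbook hash join, not ASE's implementation; the function name, the row layout and the sample rows (loosely modelled on the aqi_locations and aqi_samples tables) are all hypothetical.

```python
from collections import defaultdict

def hash_join(outer_rows, inner_rows, outer_pred, inner_pred, key):
    """Filter both inputs, hash the outer side, then probe with the inner side."""
    build = defaultdict(list)
    for row in outer_rows:              # build phase
        if outer_pred(row):
            build[row[key]].append(row)
    joined = []
    for row in inner_rows:              # probe phase
        if inner_pred(row):
            for match in build.get(row[key], []):
                joined.append((match, row))
    return joined

# Hypothetical sample rows.
locations = [
    {"location_id": 1, "state": "PA"},
    {"location_id": 2, "state": "NY"},
    {"location_id": 3, "state": "PA"},
]
samples = [
    {"location_id": 1, "weather": "Overcast"},
    {"location_id": 1, "weather": "Sunny"},
    {"location_id": 2, "weather": "Overcast"},
    {"location_id": 3, "weather": "Overcast"},
]

result = hash_join(
    locations, samples,
    outer_pred=lambda r: r["state"] == "PA",
    inner_pred=lambda r: r["weather"] == "Overcast",
    key="location_id",
)
for loc, sample in result:
    print(loc["location_id"], loc["state"], sample["weather"])
```

Each input is read once and the join key is never used as a search argument, which is why this strategy suits cases where the predicates, rather than the join keys, do most of the filtering.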
This is why stats are sooo… critical
We use them to estimate
q Cardinality of the join
q Rows that qualify from predicates (unjoined)
If the estimates are off by a lot
q We likely predict it is a high cardinality join
ü Remember, with 4 join keys, if we don’t have stats on the other 3 columns, we use magic values of 0.1
q With very high row counts projected from the inner table…
ü If we consider 3 levels of indexing and 10M rows, that’s 40M LIO
ü Sorting 10M rows may only take 20M LIOs…
ü …so we degrade into a Sort Merge Join (SMJ)
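One way to read the 40M and 20M figures is sketched below. Both per-row costs are assumptions chosen to match the slide: each inner-table lookup is taken to cost one LIO per index level plus one for the data page, and the sort is taken to cost about two LIOs per row.

```python
projected_rows = 10_000_000
index_levels = 3
data_page_reads_per_row = 1   # assumption: one data page read per lookup
sort_lio_per_row = 2          # assumption chosen to match the ~20M figure

# Nested loop join: every projected row pays for a full index traversal.
nlj_lio = projected_rows * (index_levels + data_page_reads_per_row)

# Sort merge join: pay for the sort instead of repeated traversals.
sort_lio = projected_rows * sort_lio_per_row

cheaper = "sort merge join" if sort_lio < nlj_lio else "nested loop join"
print(f"NLJ LIO  : {nlj_lio:,}")
print(f"sort LIO : {sort_lio:,}")
print(f"cheaper  : {cheaper}")
```

If the 10M-row projection is itself inflated by missing statistics, the comparison is made on the wrong inputs, and the sort merge join wins on paper while losing at run time.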
Join Keys: The Query
SELECT TOP 1 T_00."PRRBA"
FROM SAPSR3."/PXY/ACTUAL_DEP" T_00
INNER JOIN SAPSR3."/PXY/SCD" T_01
ON T_01."MANDT" = ?
AND T_01."RBARE" = T_00."PRRBA"
AND T_01."SCNA" = T_00."PRSCNA"
AND T_01."EXECNO" = T_00."PREXEC"
AND T_01."STEP" = T_00."PRST"
WHERE T_00."MANDT" = ?
AND T_00."SCNA" = ?
AND T_00."EXECNO" = ?
AND T_00."STEP" = ?
AND T_00."RBARE" = ?
AND T_01."STATUS" <> ?
AND T_01."STATUS" <> ?
/* R3:/PXY/SAPLRB:72334 T:/PXY/ACTUAL_DEP M:430 */
create unique nonclustered index "/PXY/ACTUAL_DEP~0"
on SAPSR3."/PXY/ACTUAL_DEP"(MANDT, SCNA, EXECNO, STEP, RBARE, PRSCNA, PREXEC, PRST, PRRBA)
create nonclustered index "/PXY/ACTUAL_DEP~00"
on SAPSR3."/PXY/ACTUAL_DEP"(MANDT, PRSCNA, PREXEC, PRST, PRRBA, SCNA, EXECNO, STEP, RBARE)
create unique nonclustered index "/PXY/SCD~0"
on SAPSR3."/PXY/SCD"(MANDT, RBARE, SCNA, EXECNO, STEP)
create nonclustered index "/PXY/SCD~ID1"
on SAPSR3."/PXY/SCD"(MANDT, SCNA, EXECNO, RBARE)
Notice the lateral join on MANDT = <value>. Knowing that ASE has issues with literals at the beginning of the join, we will see if adding multi-column stats on {RBARE, SCNA, EXECNO, STEP} helps NLJoin costing
Join Keys – Bad Index Usage
| |TOP Operator (VA = 4)
| | Top Limit: 1
| | |MERGE JOIN Operator (Join Type: Inner Join) (VA = 3)
| | | Using Worktable2 for internal storage.
| | | Key Count: 4
| | | Key Ordering: ASC ASC ASC ASC
| | | |SORT Operator (VA = 1)
| | | | Using Worktable1 for internal storage.
| | | | |SCAN Operator (VA = 0)
| | | | | FROM TABLE
| | | | | SAPSR3./PXY/ACTUAL_DEP
| | | | | T_00
| | | | | Index : /PXY/ACTUAL_DEP~0
| | | | | Forward Scan.
| | | | | Positioning by key.
| | | | | Index contains all needed columns. Base table will not be read.
| | | | | Keys are:
| | | | | MANDT ASC
| | | | | SCNA ASC
| | | | | EXECNO ASC
| | | | | STEP ASC
| | | | | RBARE ASC
| | | | | Using I/O Size 16 Kbytes for index leaf pages.
| | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | |SCAN Operator (VA = 2)
| | | | FROM TABLE
| | | | SAPSR3./PXY/SCD
| | | | T_01
| | | | Index : /PXY/SCD~0
| | | | Forward Scan.
| | | | Positioning by key.
| | | | Keys are:
| | | | MANDT ASC
| | | | Using I/O Size 16 Kbytes for index leaf pages.
| | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | Using I/O Size 16 Kbytes for data pages.
| | | | With LRU Buffer Replacement Strategy for data pages.
Join Permutation Costing (1)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
BEGIN: Complete join order evaluation (perm #1)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Permutation Order: Gt0( SAPSR3./PXY/ACTUAL_DEP T_00 ) |X| Gt1( SAPSR3./PXY/SCD T_01 )
joining using ( PopNlJoin () () ) cost:0 tempdb:0 order: none
outer Pops:
( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 )
cost:81.29999 T(L3,P3,C2.999999) O(L3,P3,C2.999999) tempdb:0 order: <3,2,1,9>
( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
cost:114.148 T(L9.765611,P3.76561,C4.765611) O(L6,P0,C1) tempdb:0.001237151 order: {1,2,3,9} Has BmoSort
inner Pops:
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) )
cost:1989.483 T(L73.16116,P73.16116,C141.3204) O(L70.16116,P70.16116,C140.3204) tempdb:0.0006185754 order: <9,3,2,1>
joining using ( PopMergeJoin () () ) cost:0 tempdb:0 order: none
outer Pops:
( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 )
cost:81.29999 T(L3,P3,C2.999999) O(L3,P3,C2.999999) tempdb:0 order: <3,2,1,9>
( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
cost:114.148 T(L9.765611,P3.76561,C4.765611) O(L6,P0,C1) tempdb:0.001237151 order: {1,2,3,9} Has BmoSort
inner Pops:
( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) )
cost:1162186 T(L183590.3,P5562.217,C6559500) O(L182634.3,P4606.217,C4055874) tempdb:0 order: <3,2,9>
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) )
cost:614.7092 T(L20.83115,P20.78714,C533.6843) O(L17.83115,P17.78714,C355.7895) tempdb:0 order: <9,3,2,1>
( PopSort ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) )
cost:4406059 T(L44736.09,P46577.09,C3.15216e+07) O(L1851,P3692,C3.147871e+07) tempdb:3077.973 order: {1,2,3,9} Has BmoSort
Join Permutation Costing (2)
Eagerly enforcing...
the cheapest Pop:
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
... Pop enforcers:
... PopLet enforcers:
... done eager enforcement.
All Pops/PopLets before EqcN selection:
-> initial Pops:
( PopMergeJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 )
( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) )
cost:1288721 T(L191677,P7108.215,C7276614) O(L8083.682,P1542.997,C717110.6) tempdb:0 order: none
( PopMergeJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 )
( PopSort ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) ) )
cost:4406148 T(L44739.09,P46580.09,C3.152167e+07) O(L0,P0,C70.16021) tempdb:1538.986 order: none
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) )
cost:1162645 T(L183600,P5565.983,C6562956) O(L0,P0,C3451.033) tempdb:0.0006185754 order: none
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopSort ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) ) )
cost:4406180 T(L44745.86,P46580.86,C3.152167e+07) O(L0,P0,C70.16021) tempdb:1538.987 order: none Has BmoSort
( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
( PopNlJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2103.631 T(L82.92677,P76.92677,C146.086) tempdb:0.0006185754 order: none Has BmoSort
Join Permutation Costing (3)
Eqc competition ...
initial old Pops:
( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
initial new Pops:
...
pruned new against total 0
pruned new against old 5
pruned old against new 1
kept old Pops:
( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
kept new Pops:
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
... done Eqc competition.
... done join visit.
Join plans selected for this permutation:
OptBlock0 Eqc{0,1} -> Pops added for the join Eqc{0} - Eqc{1}:
( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) )
( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
move greedy pops to new list
( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) )
cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
... done move greedy pops to new list.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
DONE: Complete join order evaluation (perm #1)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
“old Pops” = 12.5 style optimization – note that the cost is >2000
Join Permutation Costing (4)
** Costing set up for RowLimit optimization **
TopLogProps0( SAPSR3./PXY/ACTUAL_DEP T_00 ) - TopPred: [Tc{} Pe{0,1,2,3,4}] TopSubst: {1,2,3,4,5,6,7,8,9,17}
TopLogProps0( SAPSR3./PXY/SCD T_01 ) - TopPred: [Tc{} Pe{5,6,7}] TopSubst: {11,12,13,14,15,16}
Statistics for rows returned to client...
Estimated rows :14073.64 Estimated row width :7.002473
Estimated client cost is :78.59161
Estimating selectivity of index 'SAPSR3./PXY/SCD./PXY/SCD~0', indid 2
MANDT = '430'
Estimated selectivity for MANDT,
selectivity = 1,
scan selectivity 1, filter selectivity 1
Cost adjusted for RowLimit optimization, Adjustment ratio 7.105484e-05
2503626 rows, 6283 pages
Adjustment ratio 7.105484e-05 applied gives 177.8947 rows, 1 pages
Data Row Cluster Ratio 0.9107559
Index Page Cluster Ratio 0.9874477
Data Page Cluster Ratio 0.242736
using index prefetch (size 128K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
Adjustment using index prefetch (size 128K I/O)
in index cache 'default data cache' (cacheid 0) with LRU replacement
using table prefetch (size 128K I/O)
Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
Adjustment using table prefetch (size 128K I/O)
in data cache 'default data cache' (cacheid 0) with LRU replacement
Data Page LIO for '/PXY/SCD~0' on table 'SAPSR3./PXY/SCD' = 17.83115
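The RowLimit adjustment in this trace is consistent with a simple ratio. The sketch below assumes the adjustment ratio is the TOP limit divided by the estimated rows returned to the client; that assumption reproduces the traced ratio and adjusted row count, but it is inferred from the numbers rather than documented behaviour.

```python
top_limit = 1                # SELECT TOP 1 in the query
estimated_rows = 14073.64    # "Estimated rows" returned to the client
rows_before = 2503626        # rows estimated for the /PXY/SCD~0 scan

# Assumption: adjustment ratio = row limit / estimated rows.
adjustment_ratio = top_limit / estimated_rows
rows_after = rows_before * adjustment_ratio

print(f"adjustment ratio : {adjustment_ratio:.6e}")
print(f"rows after       : {rows_after:.4f}")
```

With MANDT as the only search argument (selectivity 1), the scan would otherwise cover all 2.5M rows; the row limit scales that down to about 178 rows, which is what makes this index look cheap in the costing.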
Contenu connexe

Tendances

Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentialsqureshihamid
 
Changing the game with cloud dw
Changing the game with cloud dwChanging the game with cloud dw
Changing the game with cloud dwelephantscale
 
Enterprise Data Lake
Enterprise Data LakeEnterprise Data Lake
Enterprise Data Lakesambiswal
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containerskbajda
 
Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...
Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...
Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...Edureka!
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
Intro to Talend Open Studio for Data Integration
Intro to Talend Open Studio for Data IntegrationIntro to Talend Open Studio for Data Integration
Intro to Talend Open Studio for Data IntegrationPhilip Yurchuk
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQLDon Demcsak
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversScyllaDB
 
Intro to Azure Data Factory v1
Intro to Azure Data Factory v1Intro to Azure Data Factory v1
Intro to Azure Data Factory v1Eric Bragas
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Databricks
 
Tips and Tricks for SAP Sybase IQ
Tips and Tricks for SAP  Sybase IQTips and Tricks for SAP  Sybase IQ
Tips and Tricks for SAP Sybase IQDon Brizendine
 
Cloudera Impala Source Code Explanation and Analysis
Cloudera Impala Source Code Explanation and AnalysisCloudera Impala Source Code Explanation and Analysis
Cloudera Impala Source Code Explanation and AnalysisYue Chen
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala InternalsDavid Groozman
 

Tendances (20)

Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentials
 
Changing the game with cloud dw
Changing the game with cloud dwChanging the game with cloud dw
Changing the game with cloud dw
 
Enterprise Data Lake
Enterprise Data LakeEnterprise Data Lake
Enterprise Data Lake
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
 
Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...
Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...
Talend Open Studio for Big Data | Talend Open Studio Tutorial | Talend Online...
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Intro to Talend Open Studio for Data Integration
Intro to Talend Open Studio for Data IntegrationIntro to Talend Open Studio for Data Integration
Intro to Talend Open Studio for Data Integration
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQL
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Intro to Azure Data Factory v1
Intro to Azure Data Factory v1Intro to Azure Data Factory v1
Intro to Azure Data Factory v1
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
Tips and Tricks for SAP Sybase IQ
Tips and Tricks for SAP  Sybase IQTips and Tricks for SAP  Sybase IQ
Tips and Tricks for SAP Sybase IQ
 
Cloudera Impala Source Code Explanation and Analysis
Cloudera Impala Source Code Explanation and AnalysisCloudera Impala Source Code Explanation and Analysis
Cloudera Impala Source Code Explanation and Analysis
 
Azure SQL Database
Azure SQL DatabaseAzure SQL Database
Azure SQL Database
 
SQL
SQLSQL
SQL
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala Internals
 
MS SQL Server
MS SQL ServerMS SQL Server
MS SQL Server
 
Azure Synapse Analytics
Azure Synapse AnalyticsAzure Synapse Analytics
Azure Synapse Analytics
 
Data Vault Overview
Data Vault OverviewData Vault Overview
Data Vault Overview
 

En vedette

ASE Semantic Partitions- A Case Study
ASE Semantic Partitions- A Case Study ASE Semantic Partitions- A Case Study
ASE Semantic Partitions- A Case Study SAP Technology
 
Advanced ASE Performance Tuning Tips
Advanced ASE Performance Tuning Tips Advanced ASE Performance Tuning Tips
Advanced ASE Performance Tuning Tips SAP Technology
 
Leveraging SAP ASE Workload Analyzer to optimize your database environment
Leveraging SAP ASE Workload Analyzer to optimize your database environmentLeveraging SAP ASE Workload Analyzer to optimize your database environment
Leveraging SAP ASE Workload Analyzer to optimize your database environmentSAP Technology
 
Git migration - Lessons learned
Git migration - Lessons learnedGit migration - Lessons learned
Git migration - Lessons learnedTomasz Zarna
 
Configuring and using SIDB for ASE CE SP130
Configuring and using SIDB for ASE CE SP130Configuring and using SIDB for ASE CE SP130
Configuring and using SIDB for ASE CE SP130SAP Technology
 
Sybase ASE 15.7- Two Case Studies of Successful Migration
Sybase ASE 15.7- Two Case Studies of Successful Migration Sybase ASE 15.7- Two Case Studies of Successful Migration
Sybase ASE 15.7- Two Case Studies of Successful Migration SAP Technology
 
What's New in SAP Replication Server 15.7.1 SP100
What's New in SAP Replication Server 15.7.1 SP100What's New in SAP Replication Server 15.7.1 SP100
What's New in SAP Replication Server 15.7.1 SP100Dobler Consulting
 
ASE Tempdb Performance and Tuning
ASE Tempdb Performance and Tuning ASE Tempdb Performance and Tuning
ASE Tempdb Performance and Tuning SAP Technology
 
Tips Tricks and Little known features in SAP ASE
Tips Tricks and Little known features in SAP ASETips Tricks and Little known features in SAP ASE
Tips Tricks and Little known features in SAP ASESAP Technology
 
SAP ASE 16 SP02 Performance Features
SAP ASE 16 SP02 Performance FeaturesSAP ASE 16 SP02 Performance Features
SAP ASE 16 SP02 Performance FeaturesSAP Technology
 
Storage Optimization and Operational Simplicity in SAP Adaptive Server Enter...
Storage Optimization and Operational Simplicity in SAP  Adaptive Server Enter...Storage Optimization and Operational Simplicity in SAP  Adaptive Server Enter...
Storage Optimization and Operational Simplicity in SAP Adaptive Server Enter...SAP Technology
 
ASE Performance and Tuning Parameters Beyond the cfg File
ASE Performance and Tuning Parameters Beyond the cfg FileASE Performance and Tuning Parameters Beyond the cfg File
ASE Performance and Tuning Parameters Beyond the cfg FileSAP Technology
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsLynn Langit
 

En vedette (15)

ASE Semantic Partitions- A Case Study
ASE Semantic Partitions- A Case Study ASE Semantic Partitions- A Case Study
ASE Semantic Partitions- A Case Study
 
Advanced ASE Performance Tuning Tips
Advanced ASE Performance Tuning Tips Advanced ASE Performance Tuning Tips
Advanced ASE Performance Tuning Tips
 
Leveraging SAP ASE Workload Analyzer to optimize your database environment
Leveraging SAP ASE Workload Analyzer to optimize your database environmentLeveraging SAP ASE Workload Analyzer to optimize your database environment
Leveraging SAP ASE Workload Analyzer to optimize your database environment
 
Git migration - Lessons learned
Git migration - Lessons learnedGit migration - Lessons learned
Git migration - Lessons learned
 
How To Be a Great DBA
How To Be a Great DBAHow To Be a Great DBA
How To Be a Great DBA
 
Configuring and using SIDB for ASE CE SP130
Configuring and using SIDB for ASE CE SP130Configuring and using SIDB for ASE CE SP130
Configuring and using SIDB for ASE CE SP130
 
Sybase ASE 15.7- Two Case Studies of Successful Migration
Sybase ASE 15.7- Two Case Studies of Successful Migration Sybase ASE 15.7- Two Case Studies of Successful Migration
Sybase ASE 15.7- Two Case Studies of Successful Migration
 
Sap replication server
Sap replication serverSap replication server
Sap replication server
 
What's New in SAP Replication Server 15.7.1 SP100
What's New in SAP Replication Server 15.7.1 SP100What's New in SAP Replication Server 15.7.1 SP100
What's New in SAP Replication Server 15.7.1 SP100
 
ASE Tempdb Performance and Tuning
ASE Tempdb Performance and Tuning ASE Tempdb Performance and Tuning
ASE Tempdb Performance and Tuning
 
Tips Tricks and Little known features in SAP ASE
Tips Tricks and Little known features in SAP ASETips Tricks and Little known features in SAP ASE
Tips Tricks and Little known features in SAP ASE
 
SAP ASE 16 SP02 Performance Features
SAP ASE 16 SP02 Performance FeaturesSAP ASE 16 SP02 Performance Features
SAP ASE 16 SP02 Performance Features
 
Storage Optimization and Operational Simplicity in SAP Adaptive Server Enter...
Storage Optimization and Operational Simplicity in SAP  Adaptive Server Enter...Storage Optimization and Operational Simplicity in SAP  Adaptive Server Enter...
Storage Optimization and Operational Simplicity in SAP Adaptive Server Enter...
 
ASE Performance and Tuning Parameters Beyond the cfg File
ASE Performance and Tuning Parameters Beyond the cfg FileASE Performance and Tuning Parameters Beyond the cfg File
ASE Performance and Tuning Parameters Beyond the cfg File
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
 

Similaire à The Science of DBMS: Query Optimization

Maximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereMaximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereSAP Technology
 
The Science of DBMS: Data Storage & Organization
The Science of DBMS: Data Storage & Organization The Science of DBMS: Data Storage & Organization
The Science of DBMS: Data Storage & Organization SAP Technology
 
An In-Depth Look at SAP SQL Anywhere Performance Features
An In-Depth Look at SAP SQL Anywhere Performance FeaturesAn In-Depth Look at SAP SQL Anywhere Performance Features
An In-Depth Look at SAP SQL Anywhere Performance FeaturesSAP Technology
 
FEASIBLE-Benchmark-Framework-ISWC2015
FEASIBLE-Benchmark-Framework-ISWC2015FEASIBLE-Benchmark-Framework-ISWC2015
FEASIBLE-Benchmark-Framework-ISWC2015Muhammad Saleem
 
SQL 2016 Query Store: Et si mes queries m'étaient contées...
SQL 2016 Query Store: Et si mes queries m'étaient contées...SQL 2016 Query Store: Et si mes queries m'étaient contées...
SQL 2016 Query Store: Et si mes queries m'étaient contées...Isabelle Van Campenhoudt
 
Columnstore improvements in SQL Server 2016
Columnstore improvements in SQL Server 2016Columnstore improvements in SQL Server 2016
Columnstore improvements in SQL Server 2016Niko Neugebauer
 
Auto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine LearningAuto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine LearningDatabricks
 
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...DataKitchen
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cRonald Francisco Vargas Quesada
 
Big Data, Bigger Analytics
Big Data, Bigger AnalyticsBig Data, Bigger Analytics
Big Data, Bigger AnalyticsItzhak Kameli
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...PAPIs.io
 
Query generation across multiple data stores [SBTB 2016]
Query generation across multiple data stores [SBTB 2016]Query generation across multiple data stores [SBTB 2016]
Query generation across multiple data stores [SBTB 2016]Hiral Patel
 
Choosing Indexes For Performance
Choosing Indexes For PerformanceChoosing Indexes For Performance
Choosing Indexes For PerformanceSAP Technology
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabszekeLabs Technologies
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteChris Baynes
 
SQL Anywhere Tips and Tricks
SQL Anywhere Tips and TricksSQL Anywhere Tips and Tricks
SQL Anywhere Tips and TricksSAP Technology
 
3 query tuning techniques every sql server programmer should know
3 query tuning techniques every sql server programmer should know3 query tuning techniques every sql server programmer should know
3 query tuning techniques every sql server programmer should knowRodrigo Crespi
 
In-memory ColumnStore Index
In-memory ColumnStore IndexIn-memory ColumnStore Index
In-memory ColumnStore IndexSolidQ
 

Similaire à The Science of DBMS: Query Optimization (20)

Maximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereMaximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL Anywhere
 
The Science of DBMS: Data Storage & Organization
The Science of DBMS: Data Storage & Organization The Science of DBMS: Data Storage & Organization
The Science of DBMS: Data Storage & Organization
 
An In-Depth Look at SAP SQL Anywhere Performance Features
An In-Depth Look at SAP SQL Anywhere Performance FeaturesAn In-Depth Look at SAP SQL Anywhere Performance Features
An In-Depth Look at SAP SQL Anywhere Performance Features
 
FEASIBLE-Benchmark-Framework-ISWC2015
FEASIBLE-Benchmark-Framework-ISWC2015FEASIBLE-Benchmark-Framework-ISWC2015
FEASIBLE-Benchmark-Framework-ISWC2015
 
SQL 2016 Query Store: Et si mes queries m'étaient contées...
SQL 2016 Query Store: Et si mes queries m'étaient contées...SQL 2016 Query Store: Et si mes queries m'étaient contées...
SQL 2016 Query Store: Et si mes queries m'étaient contées...
 
Columnstore improvements in SQL Server 2016
Columnstore improvements in SQL Server 2016Columnstore improvements in SQL Server 2016
Columnstore improvements in SQL Server 2016
 
OOW13 Exadata and ODI with Parallel
OOW13 Exadata and ODI with ParallelOOW13 Exadata and ODI with Parallel
OOW13 Exadata and ODI with Parallel
 
Auto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine LearningAuto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine Learning
 
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
 
Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns
 
Big Data, Bigger Analytics
Big Data, Bigger AnalyticsBig Data, Bigger Analytics
Big Data, Bigger Analytics
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
 
Query generation across multiple data stores [SBTB 2016]
Query generation across multiple data stores [SBTB 2016]Query generation across multiple data stores [SBTB 2016]
Query generation across multiple data stores [SBTB 2016]
 
Choosing Indexes For Performance
Choosing Indexes For PerformanceChoosing Indexes For Performance
Choosing Indexes For Performance
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache Calcite
 
SQL Anywhere Tips and Tricks
SQL Anywhere Tips and TricksSQL Anywhere Tips and Tricks
SQL Anywhere Tips and Tricks
 
3 query tuning techniques every sql server programmer should know
3 query tuning techniques every sql server programmer should know3 query tuning techniques every sql server programmer should know
3 query tuning techniques every sql server programmer should know
 
In-memory ColumnStore Index
In-memory ColumnStore IndexIn-memory ColumnStore Index
In-memory ColumnStore Index
 

Plus de SAP Technology

SAP Integration Suite L1
SAP Integration Suite L1SAP Integration Suite L1
SAP Integration Suite L1SAP Technology
 
Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...
Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...
Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...SAP Technology
 
7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...
7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...
7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...SAP Technology
 
Extend SAP S/4HANA to deliver real-time intelligent processes
Extend SAP S/4HANA to deliver real-time intelligent processesExtend SAP S/4HANA to deliver real-time intelligent processes
Extend SAP S/4HANA to deliver real-time intelligent processesSAP Technology
 
Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...
Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...
Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...SAP Technology
 
Accelerate your journey to SAP S/4HANA with SAP’s Business Technology Platform
Accelerate your journey to SAP S/4HANA with SAP’s Business Technology PlatformAccelerate your journey to SAP S/4HANA with SAP’s Business Technology Platform
Accelerate your journey to SAP S/4HANA with SAP’s Business Technology PlatformSAP Technology
 
Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...
Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...
Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...SAP Technology
 
Transform your business with intelligent insights and SAP S/4HANA
Transform your business with intelligent insights and SAP S/4HANATransform your business with intelligent insights and SAP S/4HANA
Transform your business with intelligent insights and SAP S/4HANASAP Technology
 
SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...
SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...
SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...SAP Technology
 
Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...
Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...
Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...SAP Technology
 
The IoT Imperative for Consumer Products
The IoT Imperative for Consumer ProductsThe IoT Imperative for Consumer Products
The IoT Imperative for Consumer ProductsSAP Technology
 
The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...
The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...
The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...SAP Technology
 
IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...
IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...
IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...SAP Technology
 
The IoT Imperative in Government and Healthcare
The IoT Imperative in Government and HealthcareThe IoT Imperative in Government and Healthcare
The IoT Imperative in Government and HealthcareSAP Technology
 
SAP S/4HANA Finance and the Digital Core
SAP S/4HANA Finance and the Digital CoreSAP S/4HANA Finance and the Digital Core
SAP S/4HANA Finance and the Digital CoreSAP Technology
 
Five Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANA
Five Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANAFive Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANA
Five Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANASAP Technology
 
SAP Helps Reduce Silos Between Business and Spatial Data
SAP Helps Reduce Silos Between Business and Spatial DataSAP Helps Reduce Silos Between Business and Spatial Data
SAP Helps Reduce Silos Between Business and Spatial DataSAP Technology
 
Spotlight on Financial Services with Calypso and SAP ASE
Spotlight on Financial Services with Calypso and SAP ASESpotlight on Financial Services with Calypso and SAP ASE
Spotlight on Financial Services with Calypso and SAP ASESAP Technology
 
Spark Usage in Enterprise Business Operations
Spark Usage in Enterprise Business OperationsSpark Usage in Enterprise Business Operations
Spark Usage in Enterprise Business OperationsSAP Technology
 

Plus de SAP Technology (20)

SAP Integration Suite L1
SAP Integration Suite L1SAP Integration Suite L1
SAP Integration Suite L1
 
Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...
Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...
Future-Proof Your Business Processes by Automating SAP S/4HANA processes with...
 
7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...
7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...
7 Top Reasons to Automate Processes with SAP Intelligent Robotic Processes Au...
 
Extend SAP S/4HANA to deliver real-time intelligent processes
Extend SAP S/4HANA to deliver real-time intelligent processesExtend SAP S/4HANA to deliver real-time intelligent processes
Extend SAP S/4HANA to deliver real-time intelligent processes
 
Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...
Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...
Process optimization and automation for SAP S/4HANA with SAP’s Business Techn...
 
Accelerate your journey to SAP S/4HANA with SAP’s Business Technology Platform
Accelerate your journey to SAP S/4HANA with SAP’s Business Technology PlatformAccelerate your journey to SAP S/4HANA with SAP’s Business Technology Platform
Accelerate your journey to SAP S/4HANA with SAP’s Business Technology Platform
 
Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...
Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...
Accelerate Your Move to an Intelligent Enterprise with SAP Cloud Platform and...
 
Transform your business with intelligent insights and SAP S/4HANA
Transform your business with intelligent insights and SAP S/4HANATransform your business with intelligent insights and SAP S/4HANA
Transform your business with intelligent insights and SAP S/4HANA
 
SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...
SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...
SAP Cloud Platform for SAP S/4HANA: Accelerate your move to an Intelligent En...
 
Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...
Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...
Innovate collaborative applications with SAP Jam Collaboration & SAP Cloud Pl...
 
The IoT Imperative for Consumer Products
The IoT Imperative for Consumer ProductsThe IoT Imperative for Consumer Products
The IoT Imperative for Consumer Products
 
The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...
The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...
The IoT Imperative for Discrete Manufacturers - Automotive, Aerospace & Defen...
 
IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...
IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...
IoT is Enabling a New Era of Shareholder Value in Energy and Natural Resource...
 
The IoT Imperative in Government and Healthcare
The IoT Imperative in Government and HealthcareThe IoT Imperative in Government and Healthcare
The IoT Imperative in Government and Healthcare
 
SAP S/4HANA Finance and the Digital Core
SAP S/4HANA Finance and the Digital CoreSAP S/4HANA Finance and the Digital Core
SAP S/4HANA Finance and the Digital Core
 
Five Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANA
Five Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANAFive Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANA
Five Reasons To Skip SAP Suite on HANA and Go Directly to SAP S/4HANA
 
SAP Helps Reduce Silos Between Business and Spatial Data
SAP Helps Reduce Silos Between Business and Spatial DataSAP Helps Reduce Silos Between Business and Spatial Data
SAP Helps Reduce Silos Between Business and Spatial Data
 
Why SAP HANA?
Why SAP HANA?Why SAP HANA?
Why SAP HANA?
 
Spotlight on Financial Services with Calypso and SAP ASE
Spotlight on Financial Services with Calypso and SAP ASESpotlight on Financial Services with Calypso and SAP ASE
Spotlight on Financial Services with Calypso and SAP ASE
 
Spark Usage in Enterprise Business Operations
Spark Usage in Enterprise Business OperationsSpark Usage in Enterprise Business Operations
Spark Usage in Enterprise Business Operations
 

Dernier

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

The Science of DBMS: Query Optimization

  • 1. (c) 2015 Independent SAP Technical User Group, Annual Conference, 2015. ISUG TECH 2015 Conference: The Science of DBMS Query Optimization. Jeff Tallman, SAP ASE Product Management
  • 2. Agenda
      Intro & Optimization Basics
        - Basic optimization cost factors
        - Procedure Cache (ASE)
      Query Processing & Optimization
        - Internals of QP
        - Impact of LOP tree
        - Understanding optimization vs. execution
      Optimization Costing
        - Histograms & column densities
        - IN() & OR clauses
        - Out of range histograms
        - Joins & multi-column densities
      Controlling optimization
        - sp_chgattribute ‘opt concurrency threshold’
        - sp_modifystats
        - Resource Granularity
  • 3. Some Caveats
      Query optimization is very vendor proprietary/confidential
        - You can buy books on generic optimization techniques...
        - ...but DBMS vendors hire PhDs to develop implementations
            - Query performance often depends on how good the optimization is
            - This is a key difference between open source and COTS DBMS packages
                - The strength of the query optimizer is largely due to the $$$ invested in the skills of highly educated staff
      As a result, this session will NOT explain the secrets of ASE’s optimizer
        - However, it will explain how it works, what influences it, what resources it uses, etc.
        - Additionally, most modern optimizers use the same lava tree model
            - Query optimization is based on an upside-down tree with data spewing out the top
  • 4. Goal of This Session
      The goal of this session
        - Help you understand the intricacies of query optimization
        - Use that knowledge to write queries that can be optimized better
        - Understand how/when additional index statistics might be necessary
        - Understand how to influence optimization
            - Other than the usual index forcing, AQP plan clauses, etc.
        - Differentiate when the optimizer is messing up... or your SQL did
      Assumptions for this session
  • 5. Rules Based Optimization
      Rules based optimization
        - Index selection and join order processing are based on specific rules
        - For example:
            - Index selection picks the index whose leading columns are most covered by query predicates
            - Join order follows the left-to-right ordering of tables in the FROM clause, which designates the driving tables
      The good, bad & ugly
        - Very good for extremely volatile data in which histogram statistics are often stale/impossible
        - Good for insert-intensive monotonic sequences in which new values are out of range of histograms
        - Not so good... in fact sometimes ugly... on data that has any sort of skew with highly repetitive values
        - The really ugly part is if the SQL coders don’t know the “rules”
  • 6. Cost Based Optimization
      Used by all mainstream DBMSs
        - Oracle, IBM DB2 UDB, MS SQL, ASE
      Attempts to find the cheapest method to perform the query
        - Uses some factoring of IO, CPU and memory
        - Formula for cost varies among DBMSs
      The key to costing is index/column histograms
        - In a sense, histograms attempt to report the relative skew of the data being queried
        - The optimizer’s goal is to find the cheapest access path considering the data skew
        - If it weren’t for the histogram reporting the skew... rules based optimization would be the only choice
  • 7. Simple Cost Factors (1)
      Physical IO
        - This is pretty obvious: disks are slow.
        - But we also need to predict how many writes (and then re-reads) we may need to do for intermediate results
      Logical IO
        - This is where PhDs are made
        - Remember, at query optimization time, we don’t know what pages we are after...
        - However, we need to determine how many LIOs we expect based on
            - How much of a table is already in cache
            - How often we may revisit the same pages for multiple rows
  • 8. Simple Cost Factors (2)
      Memory
        - Besides LIO, memory can be used to cache query intermediate results such as subquery results, hash tables for HJ, etc.
        - In addition, memory can be used to avoid writes, e.g. in-memory sorts for order by, sort merge joins, etc.
      CPU
        - Again, fairly basic, but every LIO requires CPU
            - We need to do the data comparison for non-index key predicates
            - Again, though, we really don’t know how fast the CPU is that we are on... and how awful the data comparisons will be
                - We might apply some fuzzy logic on LIKE ‘%pattern%’ on large varchars or something... but...
        - Also basic: sorts require CPU as well
            - Distinct processing, order by processing, etc.
  • 9. Procedure Cache & Optimization
      Optimization: one of the consumers of proc cache
        - Index statistics are loaded into proc cache for each query optimization
            - Visible with set option show long
        - Temporary work plans are created in proc cache
        - Reported via set statistics resource on
        - Total consumption not a lot (rule of thumb = #engines * 2MB for OLTP)
      Two big problems
        - There is no ‘sharing’ of index statistics in proc cache
        - Index statistics don’t stay in cache
            - As soon as query optimization for that query is finished, the proc buffers are deallocated.
            - This means a TON of logical IOs on sysstatistics
                - Unless you use a lot of fully prepared statements or stored procedures
            - Hence you really want to ensure you have a dedicated systables cache
        - This is largely due to historical aspects
            - Remember, in 1984, 1MB of memory was a lot
            - Today, the sum of the index statistics is likely 256MB or less
  • 10. Loading Stats & Proc Cache Usage
      Script:
        use demo_db
        go
        set statement_cache off
        set switch on 3604
        set option show long
        set statistics time, io, resource, plancost on
        set showplan on
        go
        select l.city, l.county, s.sample_date, s.air_temp
        from aqi_locations l, aqi_samples s
        where l.location_id=s.location_id
          and s.sample_date = 'July 1 2000 12:00:00:000PM'
          and l.state='PA'
          and s.weather='Overcast'
          and s.air_temp = 90
        go
        set switch off 3604
        set option show off
        set statistics time, io, resource, plancost off
        set showplan off
        go
      Output (loading stats):
        Creating Initial Statistics for table aqi_locations l
        .....Done creating Initial Statistics for table aqi_locations l
        Creating Initial Statistics for table aqi_samples s
        .....Done creating Initial Statistics for table aqi_samples s
        Creating Initial Statistics for index aqi_locations_PK
        .....Done creating Initial Statistics for index aqi_locations_PK
        …
        Phase 2b initialization of OptBlock0 ...
        ... phase 2b done.
        Start merging statistics for table aqi_locations l
        ..... Done merging statistics for table aqi_locations l
        Start merging statistics for table aqi_samples s
        ..... Done merging statistics for table aqi_samples s
        …
        Total estimated I/O cost for statement 1 (at line 1): 33926.
        Parse and Compile Time 0.
        Adaptive Server cpu time: 0 ms.
        Statement: 1 Compile time resource usage: (est worker processes=0 proccache=126), Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14 proccache=23 proccache hwm=28 tempdb hwm=2)
        Private buffer count: 48,Private HWM buffer count: 48
      Compile time proc cache usage for stats & work plans:
        126 proc pages * 2K memory page = 252KB
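The page arithmetic in the callout can be checked with a short Python sketch. The 2 KB page size is the figure the slide uses; the helper name `proccache_kb` is hypothetical and not part of ASE.

```python
# Hypothetical helper: convert a proc cache page count, as printed by
# "set statistics resource on", into kilobytes.
# Assumption: 2 KB per proc cache page, the figure used on the slide.
PROC_PAGE_KB = 2


def proccache_kb(pages: int, page_kb: int = PROC_PAGE_KB) -> int:
    """Return the memory footprint in KB for a number of proc cache pages."""
    return pages * page_kb


compile_kb = proccache_kb(126)   # compile-time proccache from the trace
exec_hwm_kb = proccache_kb(28)   # execution-time proccache hwm from the trace
print(f"compile: {compile_kb} KB, execution hwm: {exec_hwm_kb} KB")
```

Applied to the trace, the compile-time figure is the larger of the two, which fits the slide's point that loading statistics and building work plans is where optimization spends its proc cache.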
  • 11. QUERY PROCESSING & OPTIMIZATION: Internals, LOP Trees & Execution
  • 12. QP Phases
      Receive buffer
      SQL Parsing
      Query Normalization
        - Resolves object ids
        - Replaces system functions/functions with literals with literal values
        - Rearranges AND/OR according to precedence
      Pre-Processing
        - Transforms subqueries
        - Rearranges aggregates
        - Creates Logical Operators (LOP)
      Query Optimization
      Query Execution
      Example:
        TDSLANG:
          select * from table where due_dt =getdate() and recv_date is null
        After parsing:
          SELECT {column list}
          FROM table
          COND1 due_dt <=getdate()
          COND2 (AND) recv_date is null
        After normalization:
          SELECT {column ids & datatypes}
          FROM objid=123456
          COND1 col_id=3 (dt) >= (dt) ‘Jan 1 2015’
          COND2 (AND) col_id=4 (dt) IS NULL
      Diagram: Receive Buffer -> SQL Parsing -> Normalization -> Pre-Processing -> Query Optimization -> Query Execution (with a “Focus” callout)
  • 13. Some Notes on WaitEvents
      Believe it or not...
        - Until the execution phase, all the rest counts as ‘awaiting command’ in sp_who or WaitEvent ID=250 in monProcessWaits
        - It kinda makes sense... until the query is executing... it isn’t executing...
        - ...but parsing, compiling & optimization all can use considerable CPU time
            - Sooo... that is why set statistics time on reports compile time separately
      Sooo... if ‘awaiting command’ a lot...
        - See if packets received are increasing
  • 14. Optimization Starts with LOP Tree
      During the pre-processing phase, a LOP tree is created
        - A high level tree of the logical operations that represent the relations between the entities
        - Often, the LOP tree is the first place where optimization starts to go wrong... due to bad query formation by developers
      Use ‘set option show on’ to see the LOP tree
        - It will be near the very top of the output
        - You will need trace 3604 enabled
      During execution, a physical operator (Pop) is used
        - Lop Join
        - Pop NLJoin
  • 15. Example Query
        use demo_db
        go
        set option show on
        set switch on 3604
        set statistics plancost, time, resource, io on
        set showplan on
        set statement_cache off -- avoid rerunning goofy plans from previous run
        set nodata on -- don’t return results (avoids network time/scrolling of large results)
        go
        select l.county, avg(s.air_temp)
        from aqi_locations l, aqi_samples s
        where l.location_id=s.location_id
          and s.sample_date between 'July 1 2000 00:01am' and 'July 31 2000 23:59:59'
          and state='PA'
        group by l.county
        go
        set option show off
        set switch off 3604
        set statistics plancost, time, resource, io off
        set showplan off
        --set statement_cache off
        go
  • 16. Example LOP Tree
        select l.county, avg(s.air_temp)
        from aqi_locations l,
             aqi_samples s
        where l.location_id=s.location_id
          and s.sample_date between 'July 1 2000 00:01am' and 'July 31 2000 23:59:59'
          and state='PA'
        group by l.county

        The Lop tree:
        ( project
          ( group
            ( join
              ( scan aqi_locations )
              ( scan aqi_samples )
            )
          )
        )
  • 17. LOP Tree & OptBlocks
      Each LOP tree level becomes an OptBlock
        - Outermost block (0) is one below (project)
        - Each block will generally have a relational operator
            - Join, group, scalar, etc.
            - Scan is only considered an operator if the query only has one entity and no other operators
      Optimizer will determine an optimal plan for that block
        - ASE set option show will print optimization for each optblock
        - The optblock list is also printed at
      The Lop tree:
        ( project
          ( group                       <- OptBlock0
            ( join                      <- OptBlock1
              ( scan aqi_locations )
              ( scan aqi_samples )
            )
          )
        )
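To make the nesting of the example LOP tree easier to inspect, the parenthesised text printed by `set option show on` can be parsed into a small tree. The following is a minimal Python sketch; `parse_lop` and `operators_by_depth` are hypothetical helper names, and the sketch only handles the simple `( operator [name] children... )` shape seen on these slides. It does not attempt to reproduce how ASE assigns OptBlock numbers.

```python
def parse_lop(text):
    """Parse a parenthesised LOP tree into nested (operator, arg, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "each node must start with '('"
        pos += 1
        op = tokens[pos]              # e.g. project, group, join, scan
        pos += 1
        arg = None
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                arg = tokens[pos]     # e.g. the table name after "scan"
                pos += 1
        pos += 1                      # consume the closing ")"
        return (op, arg, children)

    return node()


def operators_by_depth(tree, depth=0):
    """Yield (depth, operator, arg) for every node, parents before children."""
    op, arg, children = tree
    yield depth, op, arg
    for child in children:
        yield from operators_by_depth(child, depth + 1)


lop = "( project ( group ( join ( scan aqi_locations ) ( scan aqi_samples ) ) ) )"
tree = parse_lop(lop)
for depth, op, arg in operators_by_depth(tree):
    print("  " * depth + op + (f" {arg}" if arg else ""))
```

Printing the operators by depth gives a quick way to compare two query variants: a different operator at a given level (for example `scalar` where another variant has `group`) shows up immediately.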
  • 18. Example OptBlock
      The Lop tree: …
      OptBlock1
        The Lop tree:
          ( join ( scan aqi_locations ) ( scan aqi_samples ) )
        Generic Tables: ( Gtt1( aqi_locations l ) Gtt2( aqi_samples s ) Gti3( aqi_locations_PK ) …
        Generic Columns: ( Gc0(aqi_locations l ,Rid) Gc1(aqi_locations l ,state) Gc2(aqi_locations l ,location_id) …
        Predicates: ( { aqi_samples s.sample_date} >= "Jul 1 2000 12:01AM" tc:{5} …
        Transitive Closures: ( Tc0 = { Gc0(aqi_locations l ,Rid)} …
      OptBlock0
        The Lop tree:
          ( pseudoscan )
        Generic Tables: ( Gtg0 )
        Generic Columns: ( Gc8(Gtg0 ,_gcelement_8) Gc9(Gtg0 ,_gcelement_9) Gc10(Gtg0 ,_gcelement_10) …
        Predicates: ( )
        Transitive Closures: ( Tc7 = { Gc8(Gtg0 ,_gcelement_8) Gc12(Gtg0 ,_virtualagg) …
  • 19. If you have any doubts
      If your index is being considered...
        - It will be listed in Generic Tables as a Gti entry
            - Format is <tablelist>, <indexlist>
        - Example:
            Generic Tables: ( Gtt1( aqi_locations l ) Gtt2( aqi_samples s ) Gti3( aqi_locations_PK ) Gti4( city_state_idx ) Gti5( county_state_idx ) Gti6( aqi_samples_PK ) Gti7( aqi_weather_date_idx ) )
      If your where clause is being considered...
        - It will be listed in Predicates
        - Example:
            Predicates: (
              { aqi_samples s.sample_date} >= "Jul 1 2000 12:01AM" tc:{5}
              { aqi_samples s.sample_date} <= "Jul 31 2000 11:59PM" tc:{5}
              { aqi_locations l.state} = 'PA' tc:{1}
            )
  • 20. To find optimization details
      Look for optblock begin/end section markers in the output
        - Begin
            ******************************************************************************
            BEGIN: Search Space Traversal for OptBlock1
            ******************************************************************************
        - End
            ******************************************************************************
            DONE: Search Space Traversal for OptBlock1
            ******************************************************************************
      Any section could be fairly lengthy
        - The key is to find the optblock where you think the problem is...
  • 21. The LOP role... a tale of two queries
      The setup:
        select * into tempdb..my_objects from sybsystemprocs..sysobjects
        create index type_date_idx on tempdb..my_objects (type, crdate)
      “Good” Query:
        declare @type char(2)
        select @type='P'
        select @type, max(crdate)
        from tempdb..my_objects
        where type=@type
      “Bad” Query:
        declare @type char(2)
        select @type='P'
        select type, max(crdate)
        from tempdb..my_objects
        where type=@type
        group by type
  • 22. The showplans... and final IO costs
      “Good” Query Plan & Cost:
        QUERY PLAN FOR STATEMENT 2 (at line 9).
        Optimized using Serial Mode
        STEP 1
        The type of query is SELECT.
        2 operator(s) under root
        |ROOT:EMIT Operator (VA = 2)
        |
        |   |SCALAR AGGREGATE Operator (VA = 1)
        |   |  Evaluate Ungrouped MAXIMUM AGGREGATE.
        |   |  Scanning only up to the first qualifying row.
        |   |
        |   |   |SCAN Operator (VA = 0)
        |   |   |  FROM TABLE
        |   |   |  my_objects
        |   |   |  Index : type_date_idx
        |   |   |  Backward scan.
        |   |   |  Positioning by key.
        |   |   |  Index contains all needed columns. Base table will not be read.
        |   |   |  Keys are:
        |   |   |    type ASC
        |   |   |  Using I/O Size 4 Kbytes for index leaf pages.
        |   |   |  With LRU Buffer Replacement Strategy for index leaf pages.
        Total estimated I/O cost for statement 2 (at line 9): 54.
        …
        Table: my_objects scan count 1, logical reads: (regular=2 apf=0 total=2), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
        Total actual I/O cost for this command: 4.
      “Bad” Query Plan & Cost:
        QUERY PLAN FOR STATEMENT 2 (at line 9).
        Optimized using Serial Mode
        STEP 1
        The type of query is SELECT.
        3 operator(s) under root
        |ROOT:EMIT Operator (VA = 3)
        |
        |   |RESTRICT Operator (VA = 2)(0)(0)(0)(4)(0)
        |   |
        |   |   |GROUP SORTED Operator (VA = 1)
        |   |   |  Evaluate Grouped MAXIMUM AGGREGATE.
        |   |   |
        |   |   |   |SCAN Operator (VA = 0)
        |   |   |   |  FROM TABLE
        |   |   |   |  my_objects
        |   |   |   |  Index : type_date_idx
        |   |   |   |  Forward Scan.
        |   |   |   |  Positioning by key.
        |   |   |   |  Index contains all needed columns. Base table will not be read.
        |   |   |   |  Keys are:
        |   |   |   |    type ASC
        |   |   |   |  Using I/O Size 4 Kbytes for index leaf pages.
        |   |   |   |  With LRU Buffer Replacement Strategy for index leaf pages.
        Total estimated I/O cost for statement 2 (at line 9): 360.
        …
        Table: my_objects scan count 1, logical reads: (regular=4 apf=0 total=4), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
        Total actual I/O cost for this command: 8.
  • 23. A first clue... the plancost
      “Good” Query LOP Plancost:
        ==================== Lava Operator Tree ====================
        Emit (VA = 2) r:1 er:1 cpu: 0
          /
        ScalarAgg Max (VA = 1) r:1 er:1 cpu: 0
          /
        IndexScan type_date_idx (VA = 0) r:1 er:1 l:2 el:2 p:0 ep:2
        ============================================================
      “Bad” Query LOP Plancost:
        ==================== Lava Operator Tree ====================
        Emit (VA = 3) r:1 er:6 cpu: 0
          /
        Restrict (0)(0)(0)(4)(0) (VA = 2) r:1 er:6
          /
        GroupSorted Grouping (VA = 1) r:1 er:6
          /
        IndexScan type_date_idx (VA = 0) r:647 er:598 l:4 el:4 p:0 ep:4
        ============================================================
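The counters on each plancost operator can be pulled apart mechanically when comparing plans. The Python sketch below is a hypothetical helper, not an ASE tool; it assumes the usual reading of the fields (`r`/`er` as actual/estimated rows, `l`/`el` as actual/estimated logical I/O, `p`/`ep` as actual/estimated physical I/O), which the slides do not spell out.

```python
import re

# Matches the "name:number" counters printed on a plancost operator line.
# "cpu: 0" is deliberately not matched (it has a space after the colon).
COUNTER = re.compile(r"\b(r|er|l|el|p|ep):(\d+)")


def parse_counters(line):
    """Return the plancost counters on one operator line as a dict of ints."""
    return {name: int(value) for name, value in COUNTER.findall(line)}


good_scan = "IndexScan type_date_idx (VA = 0) r:1 er:1 l:2 el:2 p:0 ep:2"
bad_scan = "IndexScan type_date_idx (VA = 0) r:647 er:598 l:4 el:4 p:0 ep:4"

for label, line in (("good", good_scan), ("bad", bad_scan)):
    c = parse_counters(line)
    print(f"{label}: rows actual={c['r']} estimated={c['er']}, logical I/O={c['l']}")
```

Here the scalar variant reads a single index row while the group by variant reads 647, even though both use the same index, which is the "first clue" the slide title refers to.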
  • 24. The actual LOP trees
      “Good” Query LOP tree:
        The Lop tree:
          ( project ( scalar ( scan my_objects ) ) )
        OptBlock1
          The Lop tree: ( scan my_objects )
        OptBlock0
          The Lop tree: ( pseudoscan )
      “Bad” Query LOP tree:
        The Lop tree:
          ( project ( group ( scan my_objects ) ) )
        OptBlock1
          The Lop tree: ( scan my_objects )
        OptBlock0
          The Lop tree: ( pseudoscan )
  • 25. The Lesson
      The LOP can influence optimization and final costs
        - Try to use operators that are lighter weight (e.g. scalar vs. group by)
        - In this case, we knew the @type up front...
            - Re-selecting it in the ‘group by’ variant is redundant
            - Literals and @vars are scalars whereas group by is a vector
      Execution can play a role as well
        - We saw in this example, in the scalar variant, that the optimizer can limit the rows to be scanned
            |   |SCALAR AGGREGATE Operator (VA = 1)
            |   |  Evaluate Ungrouped MAXIMUM AGGREGATE.
            |   |  Scanning only up to the first qualifying row.
        - Execution can also short-circuit based in certain
  • 26. Optimization vs. Execution (1)
      Optimizer gets a lot of blame for things it is not involved in
      Example:
        - Customer on SCN whines about a table scan due to an optimizer ‘bug’ on the following example query
            select * from sysobjects
            where id=8 OR 1=2
        - Customer “thinks” the optimizer should simply use the index
      What do you think the real problem is and why???
  • 27. Let’s start simple (1)
      Let’s force a table scan just to see how many LIOs it takes
        select count(*) from sysobjects plan '(t_scan sysobjects)'

        QUERY PLAN FOR STATEMENT 1 (at line 1).
        Optimized using Serial Mode
        Optimized using the Abstract Plan in the PLAN clause.
        STEP 1
        The type of query is SELECT.
        2 operator(s) under root
        |ROOT:EMIT Operator (VA = 2)
        |
        |   |SCALAR AGGREGATE Operator (VA = 1)
        |   |  Evaluate Ungrouped COUNT AGGREGATE.
        |   |
        |   |   |SCAN Operator (VA = 0)
        |   |   |  FROM TABLE
        |   |   |  sysobjects
        |   |   |  Table Scan.
        |   |   |  Forward Scan.
        |   |   |  Positioning at start of table.
        |   |   |  Using I/O Size 32 Kbytes for data pages.
        |   |   |  With LRU Buffer Replacement Strategy for data pages.
        Total estimated I/O cost for statement 1 (at line 1): 414.
        Parse and Compile Time 0.
        Adaptive Server cpu time: 0 ms.

        -----------
                702
  • 28. Let’s start simple (2)
      The answer is 26... remember that
        Statement: 1 Compile time resource usage: (est worker processes=0 proccache=57), Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=6 proccache=7 proccache hwm=7 tempdb hwm=0)
        ==================== Lava Operator Tree ====================
        Emit (VA = 2) r:1 er:1 cpu: 0
          /
        ScalarAgg Count (VA = 1) r:1 er:1 cpu: 0
          /
        TableScan sysobjects (VA = 0) r:702 er:702 l:26 el:26 p:0 ep:4
        ============================================================
        Table: sysobjects scan count 1, logical reads: (regular=26 apf=0 total=26), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
        Total actual I/O cost for this command: 52.
        Total writes for this command: 0
        Execution Time 0.
        Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
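The "Total actual I/O cost" figures in these outputs are consistent with a simple weighted sum of the reads reported by `set statistics io`. The Python sketch below encodes that observation. The logical weight of 2 matches every output shown so far (2 reads give 4, 4 give 8, 26 give 52); the physical weight of 25 is an assumption here, since all physical reads in these outputs are 0 and so cannot confirm it. The function name is hypothetical.

```python
def actual_io_cost(logical_reads, physical_reads, lio_weight=2, pio_weight=25):
    """Weighted I/O cost: lio_weight per logical read plus pio_weight per physical read.

    lio_weight=2 reproduces the outputs on the slides.
    pio_weight=25 is an assumption that these outputs cannot verify.
    """
    return lio_weight * logical_reads + pio_weight * physical_reads


# (logical reads, physical reads, cost printed by ASE) taken from the slide outputs
observed = [
    (2, 0, 4),     # "good" scalar query
    (4, 0, 8),     # "bad" group by query
    (26, 0, 52),   # forced table scan of sysobjects
]
for logical, physical, printed in observed:
    print(logical, physical, actual_io_cost(logical, physical), printed)
```

Because everything was served from cache, the cost here is just twice the logical reads, so the 26 LIOs of the forced table scan are the baseline to remember for the examples that follow.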
  • 29. A simple false expression (1)
      We are still going to do a table scan...
        select * from sysobjects where 1=2

        QUERY PLAN FOR STATEMENT 1 (at line 1).
        Optimized using Serial Mode
        STEP 1
        The type of query is SELECT.
        2 operator(s) under root
        |ROOT:EMIT Operator (VA = 2)
        |
        |   |RESTRICT Operator (VA = 1)(4)(0)(0)(0)(0)
        |   |
        |   |   |SCAN Operator (VA = 0)
        |   |   |  FROM TABLE
        |   |   |  sysobjects
        |   |   |  Table Scan.
        |   |   |  Forward Scan.
        |   |   |  Positioning at start of table.
        |   |   |  Using I/O Size 4 Kbytes for data pages.
        |   |   |  With LRU Buffer Replacement Strategy for data pages.
        Total estimated I/O cost for statement 1 (at line 1): 237.
        Parse and Compile Time 0.
        Adaptive Server cpu time: 0 ms.
  • 30. A simple false expression (2)
      What happened to our 26 IOs???
        Statement: 1 Compile time resource usage: (est worker processes=0 proccache=69), Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14 proccache=15 proccache hwm=15 tempdb hwm=0)
        ==================== Lava Operator Tree ====================
        Emit (VA = 2) r:0 er:702 cpu: 0
          /
        Restrict (4)(0)(0)(0)(0) (VA = 1) r:0 er:702
          /
        TableScan sysobjects (VA = 0) r:0 er:702 l:0 el:1 p:0 ep:1
        ============================================================
        Table: sysobjects scan count 0, logical reads: (regular=0 apf=0 total=0), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
        Total actual I/O cost for this command: 0.
        Total writes for this command: 0
        Execution Time 0.
        Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
        (0 rows affected)
  • 31. Digging a Bit Deeper (1)
        select * from sysobjects where 1=2

        The Lop tree:
          ( project ( scan sysobjects ) )
        OptBlock0
          The Lop tree: ( scan sysobjects )
          Generic Tables: ( Gtt0( sysobjects ) )
          Generic Columns: …
          Predicates: ( 1=2)
          Transitive Closures: …
      We do see the expression... but notice there is no index listed in Generic Tables...
      ...and notice that the predicate listed doesn’t have a condition number (tc:{#})...
  • 32. Digging a Bit Deeper (2)
      Hmmm... no indexes looked at...
        ******************************************************************************
        BEGIN: Search Space Traversal for OptBlock0
        ******************************************************************************
        Scan plans selected for this optblock:
        Statistics for rows returned to client...
        Estimated rows :702 Estimated row width :239.5
        Estimated client cost is :132.95
        Estimating selectivity for table 'sysobjects'
        Table scan cost is 702 rows, 21 pages,
        Cost adjusted for Fastfirstrow goal, Adjustment ratio 0.001424501
        Adjusted Table scan cost is 1 rows, 21 pages,
        The table (Datarows) has 702 rows, 21 pages,
        Data Page Cluster Ratio 0.9999900
        Search argument selectivity is 1.
        using table prefetch (size 32K I/O)
        Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
        in data cache 'default data cache' (cacheid 0) with LRU replacement
        OptBlock0 Eqc{0} -> Pops added:
        ( PopTabScan sysobjects ) cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) order: none
        The best plan found in OptBlock0 :
        ( PopTabScan cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) props: [{}]
          Gtt0( sysobjects )
        ) cost:237.6 T(L1,P0.9999995,C2106) O(L1,P0.9999995,C2106) order: none
  • 33. Let’s Try Something Close (1)
      Heyyy!!!! We used an index... even with a FALSE expression...
        select * from sysobjects where id=8 and 1=2

        QUERY PLAN FOR STATEMENT 1 (at line 1).
        Optimized using Serial Mode
        STEP 1
        The type of query is SELECT.
        2 operator(s) under root
        |ROOT:EMIT Operator (VA = 2)
        |
        |   |RESTRICT Operator (VA = 1)(4)(0)(0)(0)(0)
        |   |
        |   |   |SCAN Operator (VA = 0)
        |   |   |  FROM TABLE
        |   |   |  sysobjects
        |   |   |  Using Clustered Index.
        |   |   |  Index : csysobjects
        |   |   |  Forward Scan.
        |   |   |  Positioning by key.
        |   |   |  Keys are:
        |   |   |    id ASC
        |   |   |  Using I/O Size 4 Kbytes for index leaf pages.
        |   |   |  With LRU Buffer Replacement Strategy for index leaf pages.
        |   |   |  Using I/O Size 4 Kbytes for data pages.
        |   |   |  With LRU Buffer Replacement Strategy for data pages.
        Total estimated I/O cost for statement 1 (at line 1): 81.
        Parse and Compile Time 0.
        Adaptive Server cpu time: 0 ms.
  • 34. Let’s Try Something Close (2)
      ...but we *STILL* didn’t do any LIOs... how is that???
        Statement: 1 Compile time resource usage: (est worker processes=0 proccache=69), Execution time resource usage: (worker processes=0 auxsdesc=0 plansize=14 proccache=17 proccache hwm=17 tempdb hwm=0)
        ==================== Lava Operator Tree ====================
        Emit (VA = 2) r:0 er:71 cpu: 0
          /
        Restrict (4)(0)(0)(0)(0) (VA = 1) r:0 er:71
          /
        IndexScan csysobjects (VA = 0) r:0 er:71 l:0 el:3 p:0 ep:3
        ============================================================
        Table: sysobjects scan count 0, logical reads: (regular=0 apf=0 total=0), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
        Total actual I/O cost for this command: 0.
        Total writes for this command: 0
        Execution Time 0.
        Adaptive Server cpu time: 0 ms. Adaptive Server elapsed time: 0 ms.
        (0 rows affected)
  • 35. 35Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Let’sTry SomethingClose(3)Let’sTry SomethingClose(3) 1> select * from sysobjects where id=8 and 1=2 2> 3> The Lop tree: ( project ( scan sysobjects ) ) OptBlock0 The Lop tree: ( scan sysobjects ) Generic Tables: ( Gtt0( sysobjects ) Gti1( csysobjects ) ) Generic Columns: … Predicates: ( { sysobjects.id } = 8 tc:{25} 1=2) Transitive Closures: … …We now have an index to look at as well as a predicate with a tc{#}….it applies to the condition before the label.
  • 36. 36Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Let’sTry SomethingClose(4)Let’sTry SomethingClose(4) ****************************************************************************** BEGIN: Search Space Traversal for OptBlock0 ****************************************************************************** Scan plans selected for this optblock: Statistics for rows returned to client... Estimated rows :70.2 Estimated row width :239.5 Estimated client cost is :14.7343 Scan on table sysobjects skipped because table scan less than concurrency threshold Scan on table sysobjects skipped because table scan less than concurrency threshold Beginning selection of qualifying indexes for table 'sysobjects', Estimating selectivity of index 'sysobjects.csysobjects', indid 3 id = 8 Estimated selectivity for id, selectivity = 0.1, scan selectivity 0.001424501, filter selectivity 0.001424501 restricted selectivity 0.1 Cost adjusted for Fastfirstrow goal, Adjustment ratio 0.01424501 unique index with all keys, one row scans 1 rows, 1 pages Adjustment ratio 0.01424501 applied gives 0.01424501 rows, 1 pages Data Row Cluster Ratio 0.06314244 Index Page Cluster Ratio 0.99999 Data Page Cluster Ratio 0.2469512 using no index prefetch (size 4K I/O) in index cache 'default data cache' (cacheid 0) with LRU replacement Yep, we evaluated the index
  • 37. 37Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Let’sTry SomethingClose(5)Let’sTry SomethingClose(5) ****************************************************************************** BEGIN: Search Space Traversal for OptBlock0 ****************************************************************************** … using no table prefetch (size 4K I/O) in data cache 'default data cache' (cacheid 0) with LRU replacement Data Page LIO for 'csysobjects' on table 'sysobjects' = 1 OptBlock0 Eqc{0} -> Pops added: ( PopRidJoin ( PopIndScan csysobjects sysobjects ) ) cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) order: none The best plan found in OptBlock0 : ( PopRidJoin cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) props: [{}] ( PopIndScan cost:54.09999 T(L2,P2,C1) O(L2,P2,C1) props: [{}] Gti1( csysobjects ) Gtt0( sysobjects ) ) cost:54.09999 T(L2,P2,C1) O(L2,P2,C1) order: none ) cost:81.39999 T(L3,P3,C4) O(L1,P1,C3) order: none ****************************************************************************** DONE: Search Space Traversal for OptBlock0 ****************************************************************************** …and that was about it….so we go with the index
  • 38. Understanding what happened
    The query optimizer optimizes…it does not execute
      - Expression evaluation happens at execution time
      - So 1=2 is not even looked at by the optimizer
          - Both operands are literals, and the optimizer skips it as a literal expression that cannot be optimized
    Query execution can ‘short circuit’
      - Obviously false expressions
      - N-ary Nested Loop Joins
      - …
  • 39. So…what about our query?
    Our example:
        Select * from sysobjects
        Where id=8 OR 1=2
    What happens
      - The optimizer evaluates the index on id=8
      - The optimizer sees the OR clause
          - …the opposite side of the OR is an unoptimizable expression which could be *anything* (e.g. an unindexed predicate like type=‘U’)
          - Since it could be anything, the OR clause means a table scan
      - Since we have to table scan for the OR’d condition…
          - There is no sense in using the index for id=8…the table scan for the OR clause will hit those rows on the way by
  • 40. Why did I bring that up???
    Have you ever done this in a stored proc???
        Select….
        from tableA, …
        where …
        and (((@var1=1) and (colA=‘value’))
        or ((@var1=2) and (colB=‘value’))
        )
    Or worse yet…
        Select….
        from tableA, …
        where …
        and (((@var1=1) and (colA=‘value’))
        or ((@var1=2) and (colB=‘value’))
        )
    I have….ooops….
  • 41. A more complicated example
        INSERT INTO #temp (...)
        SELECT DISTINCT ...
        FROM MYDBNAME..TABLE_A A , MYDBNAME..TABLE_B B , MYDBNAME..TABLE_C C , MYDBNAME..TABLE_D D ,
             MYDBNAME..TABLE_E E , MYDBNAME..TABLE_F F , MYDBNAME..TABLE_G G , MYDBNAME..TABLE_H H
        WHERE A.COLUMN_1 = @VARIABLE_1
        AND A.COLUMN_2 = @VARIABLE_2
        AND A.COLUMN_3 = IsNull(@VARIABLE_3,A.COLUMN_3)
        AND A.COLUMN_4 = IsNull(@VARIABLE_4,A.COLUMN_4)
        AND A.COLUMN_5 = IsNull(@VARIABLE_5,A.COLUMN_5)
        ...
        AND A.COLUMN_6 BETWEEN @VARIABLE_6 AND @VARIABLE_7
        ...
        ORDER BY ...
    The customer is trying to avoid writing IF/ELSE logic for the different combinations of variables being passed in. If @VARIABLE_3 through @VARIABLE_5 are set, the intent is that they are used as SARGs…but if they are not set, the predicate is a no-op because the column is compared to itself.
  • 42. 42Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Simplifying(1)Simplifying(1) use demo_db go set statement_cache off set switch on 3604 set option show on set statistics time, io, resource, plancost on set showplan on go declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59' --select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59' select count(*) from aqi_samples where sample_date between @bDate and @eDate and air_temp=isnull(@air_temp,air_temp) and weather=isnull(@weather,weather) go set switch off 3604 set option show off set statistics time, io, resource, plancost off set showplan off go Table has 168M rows with an index on {sample_date, air_temp, weather} …first run with nulls for second 2 index keys
  • 43. 43Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Simplifying(2)Simplifying(2) The Lop tree: ( project ( scalar ( scan aqi_samples ) ) ) OptBlock1 The Lop tree: ( scan aqi_samples ) Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) ) Generic Columns: … Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3} ) Transitive Closures: … OptBlock0 The Lop tree: ( pseudoscan ) Generic Tables: ( Gta0 ) Generic Columns: … Predicates: ( ) Transitive Closures: … The between clause is only one passed to optimizer… not much of a surprise as with the NULLs, we are expecting no-ops on air_temp and weather. Note that since we don’t know the value of @vars at compile time, we use default date here
  • 44. Simplifying (3)
    Total estimated I/O cost for statement 3 (at line 4): 17133977.
    ==================== Lava Operator Tree ====================
    Emit (VA = 3) r:1 er:1 cpu: 0
    / ScalarAgg Count (VA = 2) r:1 er:1 cpu: 400
    / Restrict (0)(0)(0)(11)(0) (VA = 1) r:1.303e+006 er:4.202e+007
    / IndexScan aqi_weather_date (VA = 0) r:1.303e+006 er:4.202e+007 l:1969 el:63590 p:251 ep:8005
    ============================================================
    Table: aqi_samples scan count 1, logical reads: (regular=1969 apf=0 total=1969), physical reads: (regular=8 apf=243 total=251), apf IOs used=243
    Total actual I/O cost for this command: 10213.
    Total writes for this command: 0
    Execution Time 4.
    Adaptive Server cpu time: 417 ms. Adaptive Server elapsed time: 417 ms.
    Our total I/O estimate is 17M+…and our estimated rows (from IndexScan) are off by about 30x…which is bad…
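Comparing estimated and actual counters by eye gets tedious across many operators. The sketch below is a hypothetical helper (not an ASE tool) that pulls the `r/er`, `l/el` and `p/ep` counters out of one operator line of plancost output and reports how far the estimate is from the actual. It assumes the counter layout shown on this slide; the function names are made up for illustration.

```python
import re

# Counter names as they appear in the plancost output on this slide:
# r/er = actual/estimated rows, l/el = actual/estimated logical I/O,
# p/ep = actual/estimated physical I/O.
COUNTER = re.compile(r"\b(er|el|ep|r|l|p):([-+0-9.eE]+)")

def parse_counters(operator_line):
    """Return the counters found on one operator line as a dict of floats."""
    return {name: float(value) for name, value in COUNTER.findall(operator_line)}

def estimate_ratio(counters, actual="r", estimated="er"):
    """Estimated divided by actual; None when the actual count is not positive."""
    if counters.get(actual, 0) <= 0:
        return None
    return counters[estimated] / counters[actual]

# Operator line copied from the slide.
index_scan = ("IndexScan aqi_weather_date (VA = 0) r:1.303e+006 er:4.202e+007 "
              "l:1969 el:63590 p:251 ep:8005")
c = parse_counters(index_scan)
print(f"rows estimate is {estimate_ratio(c):.1f}x the actual")
print(f"LIO estimate is {estimate_ratio(c, 'l', 'el'):.1f}x the actual")
```

Run against the IndexScan line, both ratios come out a little over 30, which is the "off by 30x" noted on the slide.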
  • 45. 45Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Simplifying– Rerun (1)Simplifying– Rerun (1) use demo_db go set statement_cache off set switch on 3604 set option show on set statistics time, io, resource, plancost on set showplan on go declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime --select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59' select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59' select count(*) from aqi_samples where sample_date between @bDate and @eDate and air_temp=isnull(@air_temp,air_temp) and weather=isnull(@weather,weather) go set switch off 3604 set option show off set statistics time, io, resource, plancost off set showplan off go Table has 168M rows with an index on {sample_date, air_temp, weather} …second run with values for second 2 index keys
  • 46. 46Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Simplifying- Rerun (2)Simplifying- Rerun (2) The Lop tree: ( project ( scalar ( scan aqi_samples ) ) ) OptBlock1 The Lop tree: ( scan aqi_samples ) Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) ) Generic Columns: … Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3} ) Transitive Closures: … OptBlock0 The Lop tree: ( pseudoscan ) Generic Tables: ( Gta0 ) Generic Columns: … Predicates: ( ) Transitive Closures: … The between clause is still the only one passed to optimizer… which means this fails as a coding style
  • 47. Simplifying - Rerun (3)
    Total estimated I/O cost for statement 3 (at line 4): 17133977.
    ==================== Lava Operator Tree ====================
    Emit (VA = 3) r:1 er:1 cpu: 0
    / ScalarAgg Count (VA = 2) r:1 er:1 cpu: 300
    / Restrict (0)(0)(0)(11)(0) (VA = 1) r:0 er:4.202e+007
    / IndexScan aqi_weather_date (VA = 0) r:1.303e+006 er:4.202e+007 l:1969 el:63590 p:0 ep:8005
    ============================================================
    Table: aqi_samples scan count 1, logical reads: (regular=1969 apf=0 total=1969), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
    Total actual I/O cost for this command: 3938.
    Total writes for this command: 0
    Execution Time 3.
    Adaptive Server cpu time: 309 ms. Adaptive Server elapsed time: 309 ms.
    We get the same estimates for total I/O (17M) and in the bottom node, but the Restrict filters out the non-qualifying rows, so we get 0 rows…and finish about 100 ms faster. The faster execution might make a developer think it worked. However, we do the same amount of work (1969 LIOs), so the faster execution is just the reduction in ScalarAgg work due to fewer rows to count.
  • 48. 48Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Simplifying– Correct (1)Simplifying– Correct (1) use demo_db go set statement_cache off set switch on 3604 set option show on set statistics time, io, resource, plancost on set showplan on go declare @air_temp smallint, @weather varchar(30), @bDate datetime, @eDate datetime --select @air_temp=null, @weather=null, @bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59' select @air_temp=80, @weather='sunny',@bDate='July 1 2000 00:00:01', @eDate='July 31 2000 23:59:59' select count(*) from aqi_samples where sample_date between @bDate and @eDate and air_temp=@air_temp and weather=@weather go set switch off 3604 set option show off set statistics time, io, resource, plancost off set showplan off go Table has 168M rows with an index on {sample_date, air_temp, weather} …third run with the way it should be…
  • 49. 49Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Simplifying- Correct (2)Simplifying- Correct (2) The Lop tree: ( project ( scalar ( scan aqi_samples ) ) ) OptBlock1 The Lop tree: ( scan aqi_samples ) Generic Tables: ( Gtt1( aqi_samples ) Gti2( aqi_samples_PK ) Gti3( aqi_weather_date_idx ) ) Generic Columns: … Predicates: ( { aqi_samples.sample_date} >= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.sample_date} <= "Jan 1 1900 12:00AM" tc:{3} { aqi_samples.air_temp} = 0 tc:{2} { aqi_samples.weather} = ' tc:{1} ) Transitive Closures: … OptBlock0 The Lop tree: ( pseudoscan ) Generic Tables: ( Gta0 ) Generic Columns: … Predicates: ( ) Transitive Closures: … We now have all 3 predicates…since we still have @vars with unknown values, we substitute a 0 for int/smallint and ‘ (empty string) for varchar/char
  • 50. Simplifying - Correct (3)
    Total estimated I/O cost for statement 3 (at line 4): 227844.
    ==================== Lava Operator Tree ====================
    Emit (VA = 2) r:1 er:1 cpu: 0
    / ScalarAgg Count (VA = 1) r:1 er:1 cpu: 0
    / IndexScan aqi_weather_date (VA = 0) r:0 er:450006 l:306 el:1307 p:0 ep:165
    ============================================================
    Table: aqi_samples scan count 1, logical reads: (regular=306 apf=0 total=306), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
    Total actual I/O cost for this command: 612.
    Total writes for this command: 0
    Execution Time 0.
    Adaptive Server cpu time: 1 ms. Adaptive Server elapsed time: 1 ms.
    Total estimated I/O is 228K (vs. 17M) and the estimated row count is TONS less…still off, but likely due to data skew and not knowing the values of the @vars…. And we only do about 300 LIOs vs. 1969…and we finish about 300x faster
  • 51. 51Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Index Keys: TheQueryIndex Keys: TheQuery SELECT SUM( T_00 ."MBGBTR" ) FROM "COEP" T_00 INNER JOIN "COBK" T_01 ON T_01 ."KOKRS" = ? AND T_01 ."BELNR" = T_00 ."BELNR" WHERE T_00 ."MANDT" = ? AND T_00 ."LEDNR" = ? AND T_00 ."OBJNR" = ? AND ( T_00 ."KSTAR" BETWEEN ? AND ? OR T_00 ."KSTAR" IN ( ? , ? , ? , ? ) ) AND T_01 ."AWTYP" = ? /* R3:ZVDESR121:558 T:COEP M:400 */ index_name index_keys index_description, COEP~0 MANDT, KOKRS, BELNR, BUZEI nonclustered, unique COEP~1 MANDT, LEDNR, OBJNR, GJAHR, WRTTP, VERSN, KSTAR, HRKFT, PERIO, VRGNG, PAROB, USPOB, VBUND, PARGB, BEKNZ, TWAER nonclustered COEP~Z02 MANDT, KOKRS, BUKRS, OBJNR nonclustered COEP_BDLS0 MANDT, LOGSYSO nonclustered COEP~4 MANDT, TIMESTMP, OBJNR nonclustered COEP~Z03 MANDT, LEDNR, OBJNR, KSTAR nonclustered COEP~Z05 MANDT, OBJNR, KSTAR, GJAHR, PERIO, PAROB1, WRTTP nonclustered COEP~Zt1 MANDT, LEDNR, OBJNR, KSTAR nonclustered
  • 52. 52Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Index Keys– Bad Index AccessIndex Keys– Bad Index Access |ROOT:EMIT Operator (VA = 5) | | |SCALAR AGGREGATE Operator (VA = 4) | | Evaluate Ungrouped SUM OR AVERAGE AGGREGATE. | | | | |NESTED LOOP JOIN Operator (VA = 3) (Join Type: Inner Join) | | | | | | |RESTRICT Operator (VA = 1)(0)(0)(0)(4)(0) | | | | | | | | |SCAN Operator (VA = 0) | | | | | FROM TABLE | | | | | COEP | | | | | T_00 | | | | | Index : COEP~4 | | | | | Forward Scan. | | | | | Positioning by key. | | | | | Keys are: | | | | | MANDT ASC | | | | | Using I/O Size 128 Kbytes for index leaf pages. | | | | | With LRU Buffer Replacement Strategy for index leaf pages. | | | | | Using I/O Size 128 Kbytes for data pages. | | | | | With LRU Buffer Replacement Strategy for data pages. | | | | | | |SCAN Operator (VA = 2) | | | | FROM TABLE | | | | COBK | | | | T_01 | | | | Index : COBK~Zt1 | | | | Forward Scan. | | | | Positioning at index start. | | | | Index contains all needed columns. Base table will not be read. | | | | Using I/O Size 16 Kbytes for index leaf pages. | | | | With LRU Buffer Replacement Strategy for index leaf pages.
  • 53. OPTIMIZATION COSTING (PART 1): Histograms, Column Densities, IN(), Out of Range Histograms…
  • 54. Histograms
    The key to cost-based optimization
      - Really is a distribution of data skew
          - If data were evenly distributed, we wouldn’t need histograms at all
      - Mostly used for range scans
      - Can be used for equi-SARGs if the data is highly skewed…as most is
    The basics
      - Frequency cells
      - Range cells
    Statistics for column: "type"
    Last update of column statistics: Feb 15 2015 9:18:32:850PM
      Range cell density:       0.0053191489361702
      Total density:            0.4216274332277049
      Range selectivity:        default used (0.33)
      In between selectivity:   default used (0.25)
      Unique range values:      0.0053191489361702
      Unique total values:      0.2000000000000000
      Average column width:     default used (2.00)
      Rows scanned:             188.0000000000000000
      Statistics version:       4
    Histogram for column: "type"
    Column datatype: char(2)
    Requested step count: 20
    Actual step count: 9
    Sampling Percent: 0
    Tuning Factor: 20
    Out of range Histogram Adjustment is DEFAULT.
    Low Domain Hashing.
      Step  Weight      Value
      1     0.00000000  <= "EJ"
      2     0.00531915  <  "P "
      3     0.10638298  =  "P "
      4     0.00000000  <  "S "
      5     0.30319148  =  "S "
      6     0.00000000  <  "U "
      7     0.56382978  =  "U "
      8     0.00000000  <  "V "
      9     0.02127660  =  "V "
    [Callouts on the histogram: Range Cells, Frequency Cells]
  • 55. How many steps do we need?
    Fewer = better for resource usage and time to find steps
    More = better for optimization accuracy
      - Ideally, you want most range scans to fall in a single cell
          - Multiple cells means aggregating stats…may be accurate, but takes longer
          - For example, for datetime columns, see if the cells cover the common query range (week, month, year, …)
              - Hard to near impossible to control to semantic boundaries
      - Increasing the number of steps may give better estimates for highly skewed data
  • 56. Example Date Histogram
    Histogram for column: "sample_date"
    Column datatype: datetime
    Requested step count: 100
    Actual step count: 103
    Sampling Percent: 0
    Tuning Factor: 20
    Out of range Histogram Adjustment is DEFAULT.
    Sticky step count.
    Sticky hashing.
      Step  Weight      Value
      1     0.00000000  <= "Jan 1 1993 11:59:59:996AM"
      2     0.01017933  <= "Feb 13 1993 12:00:00:000PM"
      3     0.00763450  <= "Mar 18 1993 12:00:00:000PM"
      4     0.01018039  <= "May 1 1993 12:00:00:000PM"
      5     0.00766925  <= "Jun 3 1993 12:00:00:000PM"
      6     0.00777507  <= "Jul 6 1993 12:00:00:000PM"
      7     0.00825124  <= "Aug 8 1993 12:00:00:000PM"
      8     0.00816318  <= "Sep 10 1993 12:00:00:000PM"
      9     0.00796063  <= "Oct 13 1993 12:00:00:000PM"
      10    0.00795876  <= "Nov 15 1993 12:00:00:000PM"
      11    0.00795651  <= "Dec 18 1993 12:00:00:000PM"
      12    0.00788510  <= "Jan 19 1994 12:00:00:000PM"
      13    0.01000150  <= "Feb 28 1994 12:00:00:000PM"
      14    0.01000150  <= "Apr 9 1994 12:00:00:000PM"
      …
    ~1.5 month spread per step…. The problem is that for some months the step boundary falls mid-month, so a range scan for that month would need 3 cells. If concerned, you likely need to double or triple the number of steps
  • 57. Histograms & Steps
                                Default no HTF  Defaults        40 steps        100 steps       500 steps
    Default number of steps     20              20              20              20              20
    Histogram tuning factor     1               20              20              20              20
    Requested steps             20              20              40              100             500
    Actual steps                20              195             509             1550            7580
    (Index statistics for combined city,state)
    Range cell density          0.00328457      0.00121356      0.00022722      0.00010744      0.00003560
    Total density               0.00328457      0.00328457      0.00328457      0.00328457      0.00328457
    Unique range values         0.00011547      0.00008212      0.00006416      0.00004897      0.00002615
    Unique total values         0.00011547      0.00011547      0.00011547      0.00011547      0.00011547
    Impact on estimates for Washington DC & San Francisco CA
    DC Cell                     <= Washington   <= Washington   = Washington    = Washington    = Washington
    DC Selectivity              0.05184000      0.02155000      0.02063000      0.02063000      0.02063000
    DC Row Estimates            5184            2155            2063            2063            2063
    SF Cell                     <= Somerset     <= San Jacint   = San Franci    = San Franci    = San Franci
    SF Selectivity              0.04875000      0.00678000      0.00634000      0.00634000      0.00634000
    SF Row Estimates            4875            678             634             634             634
    Statistics from an index on {city,state} for a 100,000 row table with ~6,200 distinct city names
  • 58. Column Densities
    Single column densities
      - Range cell density / unique range values
          - Tells maximum uniqueness…
          - Min(weight)!=0 from range cells
      - Total density
          - Relative skewness of the data
          - Total density approaching 1.0 is extremely skewed
          - Sum(weights^2)
      - Unique total values
          - The number of distinct values in the column
          - 1.0/select count(distinct column)
    Multiple column densities
      - Automatically created on index
    Statistics for column: "type"
    Last update of column statistics: Feb 15 2015 9:18:32:850PM
      Range cell density:       0.0053191489361702
      Total density:            0.4216274332277049
      Range selectivity:        default used (0.33)
      In between selectivity:   default used (0.25)
      Unique range values:      0.0053191489361702
      Unique total values:      0.2000000000000000
      Average column width:     default used (2.00)
      Rows scanned:             188.0000000000000000
      Statistics version:       4
    Statistics for column group: "sample_date", "air_temp", "weather"
    Last update of column statistics: May 27 2014 11:45:45:016AM
      Range cell density:       0.0000051075008894
      Total density:            0.0000051075008894
      Range selectivity:        default used (0.33)
      In between selectivity:   default used (0.25)
      Unique range values:      0.0000016297687032
      Unique total values:      0.0000016297687032
      Average column width:     8.5268955638740458
      Rows scanned:             168066824.0000000000000000
      Statistics version:       4
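The two formulas on this slide can be checked against the optdiag output for the "type" column. The sketch below plugs the nine step weights from the "type" histogram (slide 54) into Sum(weights^2) and compares the result with the reported total density. The count of 5 distinct values is an assumption inferred from the reported "Unique total values" of 0.2; it is not stated on the slides.

```python
# Step weights copied from the optdiag histogram for column "type" (slide 54).
weights = [0.00000000, 0.00531915, 0.10638298, 0.00000000, 0.30319148,
           0.00000000, 0.56382978, 0.00000000, 0.02127660]

# Slide formula: total density = Sum(weights^2).
total_density = sum(w * w for w in weights)

# Slide formula: unique total values = 1.0 / count(distinct column).
# Assumption: 5 distinct values, inferred from the reported 0.2.
distinct_values = 5
unique_total_values = 1.0 / distinct_values

print(f"total density       = {total_density:.7f}")  # optdiag reports 0.4216274332277049
print(f"unique total values = {unique_total_values:.7f}")  # optdiag reports 0.2000000000000000
```

The computed sum agrees with the reported total density to about seven decimal places; the small difference comes from the histogram weights being printed with only eight digits.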
  • 59. Using Column Densities
    If the column value is known and…
      - …the value falls in a histogram cell…the estimate will be that cell’s weight
          - Whether range or frequency cell
    If the column value is not known
      - Optimized with a literal placeholder (0, ‘’, Jan 1 1900, etc.)
      - Selectivity is the total density
  • 60. 60Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Column Selectivity vs. Density (1)Column Selectivity vs. Density (1) Statistics for column: "id" Last update of column statistics: Feb 16 2015 4:47:23:956PM Range cell density: 0.0092592412744228 Total density: 0.0113194187537711 Unique range values: 0.0041383133267069 Unique total values: 0.0055248618784530 Step Weight Value 1 0.00000000 < 1 2 0.01093356 = 1 3 0.01387721 <= 2 4 0.01261564 <= 3 5 0.00714886 <= 4 6 0.00294365 <= 5 7 0.00462574 <= 6 8 0.00210261 <= 8 9 0.00336417 <= 9 10 0.00336417 <= 11 11 0.00378469 <= 12 12 0.00925147 <= 13 13 0.00210261 <= 15 14 0.01808242 <= 16 15 0.00252313 <= 17 16 0.00252313 <= 18 17 0.00168209 <= 19 18 0.00000000 < 21 19 0.00630782 = 21 20 0.00252313 <= 22 21 0.01429773 <= 23 22 0.03868797 <= 24 23 0.00378469 <= 25 1> declare @id int 2> select @id=8 3> select * from syscolumns where id=@id Estimating selectivity of index 'syscolumns.csyscolumns', indid 2 id = 0 Estimated selectivity for id, selectivity = 0.01131942, scan selectivity 0.01131942, filter selectivity 0.01131942 26.91758 rows, 1 pages range cell unknown 1> select * from syscolumns where id=8 Estimating selectivity of index 'syscolumns.csyscolumns', indid 2 id = 8 Estimated selectivity for id, selectivity = 0.002102607, scan selectivity 0.002102607, filter selectivity 0.002102607 5 rows, 1 pages Weight < range cell density selectivity = weight
  • 61. 61Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Column Selectivity vs. Density (2)Column Selectivity vs. Density (2) Statistics for column: "id" Last update of column statistics: Feb 16 2015 4:47:23:956PM Range cell density: 0.0092592412744228 Total density: 0.0113194187537711 Unique range values: 0.0041383133267069 Unique total values: 0.0055248618784530 Step Weight Value 1 0.00000000 < 1 2 0.01093356 = 1 3 0.01387721 <= 2 4 0.01261564 <= 3 5 0.00714886 <= 4 6 0.00294365 <= 5 7 0.00462574 <= 6 8 0.00210261 <= 8 9 0.00336417 <= 9 10 0.00336417 <= 11 11 0.00378469 <= 12 12 0.00925147 <= 13 13 0.00210261 <= 15 14 0.01808242 <= 16 15 0.00252313 <= 17 16 0.00252313 <= 18 17 0.00168209 <= 19 18 0.00000000 < 21 19 0.00630782 = 21 20 0.00252313 <= 22 21 0.01429773 <= 23 22 0.03868797 <= 24 23 0.00378469 <= 25 1> select * from syscolumns where id=21 Estimating selectivity of index 'syscolumns.csyscolumns', indid 2 id = 21 Estimated selectivity for id, selectivity = 0.006307822, scan selectivity 0.006307822, filter selectivity 0.006307822 15 rows, 1 pages Frequency cell selectivity = weight 1> select * from syscolumns where id=24 Estimating selectivity of index 'syscolumns.csyscolumns', indid 2 id = 24 Estimated selectivity for id, selectivity = 0.03868797, scan selectivity 0.03868797, filter selectivity 0.03868797 92 rows, 1 pages Weight > range cell density selectivity = weight
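The four equality estimates on slides 60 and 61 follow one simple rule: a known value takes the weight of the histogram step it falls in, and an unknown value (an @variable) takes the total density. The sketch below reproduces those four numbers from the optdiag output for "id". It mirrors only what these slides show and is not a description of ASE's actual costing code; `TABLE_ROWS` is an assumption back-computed from the slide (26.91758 rows / 0.01131942), and the function name is hypothetical.

```python
# (operator, upper bound, weight) per step, copied from the optdiag output for "id".
STEPS = [
    ("<", 1, 0.00000000), ("=", 1, 0.01093356), ("<=", 2, 0.01387721),
    ("<=", 3, 0.01261564), ("<=", 4, 0.00714886), ("<=", 5, 0.00294365),
    ("<=", 6, 0.00462574), ("<=", 8, 0.00210261), ("<=", 9, 0.00336417),
    ("<=", 11, 0.00336417), ("<=", 12, 0.00378469), ("<=", 13, 0.00925147),
    ("<=", 15, 0.00210261), ("<=", 16, 0.01808242), ("<=", 17, 0.00252313),
    ("<=", 18, 0.00252313), ("<=", 19, 0.00168209), ("<", 21, 0.00000000),
    ("=", 21, 0.00630782), ("<=", 22, 0.00252313), ("<=", 23, 0.01429773),
    ("<=", 24, 0.03868797), ("<=", 25, 0.00378469),
]
TOTAL_DENSITY = 0.0113194187537711
TABLE_ROWS = 2378  # assumption: back-computed from the row estimates on the slide

def equality_selectivity(value=None):
    """Selectivity for `id = value`; value=None models an unknown @variable."""
    if value is None:
        return TOTAL_DENSITY
    for op, bound, weight in STEPS:
        # "<" steps exclude their bound; "<=" and "=" steps include it.
        if (value < bound) if op == "<" else (value <= bound):
            return weight
    return 0.0  # past the last step shown; out-of-range handling is not modeled

for label, value in (("@id (unknown)", None), ("id=8", 8), ("id=21", 21), ("id=24", 24)):
    sel = equality_selectivity(value)
    print(f"{label:14} selectivity={sel:.8f} rows~{sel * TABLE_ROWS:.2f}")
```

The unknown-value case shows why a local variable can cost more than a literal: id=8 is estimated at 5 rows as a literal but at about 27 rows as @id.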
  • 62. Column Selectivity vs. Density (3)
    Statistics for column: "id"
    Last update of column statistics: Feb 16 2015 4:47:23:956PM
      Range cell density:   0.0092592412744228
      Total density:        0.0113194187537711
      Unique range values:  0.0041383133267069
      Unique total values:  0.0055248618784530
      Step  Weight      Value
      1     0.00000000  <  1
      2     0.01093356  =  1
      3     0.01387721  <= 2
      4     0.01261564  <= 3
      5     0.00714886  <= 4
      6     0.00294365  <= 5
      7     0.00462574  <= 6
      8     0.00210261  <= 8
      9     0.00336417  <= 9
      10    0.00336417  <= 11
      11    0.00378469  <= 12
      12    0.00925147  <= 13
      13    0.00210261  <= 15
      14    0.01808242  <= 16
      15    0.00252313  <= 17
      16    0.00252313  <= 18
      17    0.00168209  <= 19
      18    0.00000000  <  21
      19    0.00630782  =  21
      20    0.00252313  <= 22
      21    0.01429773  <= 23
      22    0.03868797  <= 24
      23    0.00378469  <= 25
    Range query:
        1> select * from syscolumns where id between 5 and 10
        Estimating selectivity of index 'syscolumns.csyscolumns', indid 2
        id >= 5 id <= 10
        Estimated selectivity for id, selectivity = 0.01471826, scan selectivity 0.01471826, filter selectivity 0.01471826
        35.00002 rows, 1 pages
    Note that the sum of steps 6 through 10 is 0.01640034. However, since we are only using a portion of step 10 and the distribution is 2 values per step, we use the formula: Sum(step6..step9) + step10/2.0 = 0.01471826
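The range arithmetic on this slide can be worked through directly. The sketch below sums the fully covered steps and adds half of the partially covered step, using the weights from the histogram above. The table row count of 2378 is an assumption back-computed from the slide's output (35.00002 rows / 0.01471826); the halving of step 10 follows the slide's stated "2 values per step" reasoning rather than any documented ASE formula.

```python
# Weights of the steps touched by "id between 5 and 10", copied from the histogram.
weights = {6: 0.00294365, 7: 0.00462574, 8: 0.00210261, 9: 0.00336417, 10: 0.00336417}

# Steps 6..9 (upper bounds 5, 6, 8, 9) lie entirely inside the range.
fully_covered = sum(weights[step] for step in (6, 7, 8, 9))

# Step 10 covers the values 10 and 11; only 10 qualifies, so the slide takes half of it.
partially_covered = weights[10] / 2.0

selectivity = fully_covered + partially_covered
naive_sum = sum(weights.values())  # what we would get by counting all of step 10

TABLE_ROWS = 2378  # assumption: back-computed from the slide's row estimate
print(f"sum of steps 6..10  = {naive_sum:.8f}")    # slide: 0.01640034
print(f"range selectivity   = {selectivity:.8f}")  # slide: 0.01471826
print(f"estimated rows      = {selectivity * TABLE_ROWS:.2f}")
```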
  • 63. Debugging Selectivity
    You’ve probably noticed…
      - You need both the ‘set option show’ output and the optdiag output
    Find the index you thought it should have used
      - Look at the selectivity for each predicate
      - Check the optdiag output to see if it was a really skewed value
    But sometimes you just have to look at the query
      - …your expectation may be due to knowledge you infer
          - But the optimizer doesn’t know it
          - …such as the relationship between two columns
      - …and sometimes the indexing doesn’t support the query
  • 64. Unbounded Date Range
        create table jobs (
            job_number numeric(30,0),
            …
            job_category varchar(20), -- 10 distinct values
            job_priority tinyint, -- 100 distinct values
            job_begindate datetime,
            job_enddate datetime,
            job_status char(1), -- 6 distinct values
            …,
            primary key (job_number)
        )
    Consider the above table for each of the scenarios on the following slides. Note the key columns of job dates and those that have some distinct values listed.
  • 65. Scenario #1
    Consider the index:
        create index job_begin_idx on jobs (job_begindate)
    …and the typical query
        Select * from jobs
        Where job_begindate >= $begin_date
        and job_enddate <= $end_date
    Why is LIO sometimes high and sometimes low?
  • 66. Scenario #1: The Problems
    Because the index only has the begin date
      - On very recent dates, it can position near the end of the index and scan to the end…
      - But on dates in the past, even a few months ago
          - It positions to the $begin_date
          - Scans to the end of the index
          - For each leaf row, it does an LIO to the data page to compare $end_date
          - Some quick math…assume 50 rows per index leaf page
              - 100 leaf pages = 5000 data page LIOs ≈ 1 sec CPU (@5 LIO/ms)
              - 1000 leaf pages = 50000 data page LIOs ≈ 10 sec CPU
              - 10000 leaf pages = 500000 data page LIOs ≈ 100 sec CPU
              - 100000 leaf pages = 5000000 data page LIOs ≈ 1000 sec CPU (16m40s)
    So…
      - For dates that are not very recent, we get an index leaf scan to the end of the index
      - Plus a data page lookup for every leaf row
    [Figure: index timeline 2010 to 2014 with scan start points > 01Mar2011, > 01Nov2012, > 01Jan2014]
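The quick math on this slide is easy to reproduce for other index sizes. The sketch below uses the slide's own assumptions (50 rows per index leaf page, one data page LIO per leaf row, and roughly 5 LIOs per millisecond of CPU); the function name is hypothetical and the figures are rough planning numbers, not measurements.

```python
ROWS_PER_LEAF_PAGE = 50   # slide assumption
LIO_PER_MS = 5            # slide assumption: ~5 LIOs per millisecond of CPU

def data_page_lio_cost(leaf_pages):
    """Return (data page LIOs, CPU seconds) for scanning `leaf_pages` leaf pages
    when every leaf row needs one data page lookup."""
    lios = leaf_pages * ROWS_PER_LEAF_PAGE
    seconds = lios / LIO_PER_MS / 1000
    return lios, seconds

for leaf_pages in (100, 1_000, 10_000, 100_000):
    lios, seconds = data_page_lio_cost(leaf_pages)
    minutes, rest = divmod(int(seconds), 60)
    print(f"{leaf_pages:>7} leaf pages -> {lios:>8} LIOs ~ {seconds:g} s ({minutes}m{rest:02d}s)")
```

The cost grows linearly with how far back $begin_date reaches, which is why the same query is fast for recent dates and slow for older ones.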
  • 67. Scenario #1: The Solutions
    Solution #1: Add job_enddate to the index
        create index job_date_idx
        on jobs (job_begindate, job_enddate)
    Solution #2: Add the implied boundary to the date query
        Select * from jobs
        Where job_begindate between $begin_date and $end_date
        and job_enddate between $begin_date and $end_date
    Why both???
      - Wouldn’t fixing the index be enough? Why bother the coders and try to teach them better coding style???
  • 68. Scenario #2
    Consider the index:
        create index job_begin_idx
        on jobs (job_category, job_begindate)
    …and the typical query
        Select * from jobs
        Where job_begindate >= $begin_date
        and job_enddate <= $end_date
    Why does it sometimes use the index and other times not?
  • 69. Scenario #2: The Problem
    The problem is we are missing a predicate on the leading index columns
      - A similar situation occurs when we have intermediate index keys for which we have no valid SARGs
    To handle this, ASE does a bit of a trick
      - It looks at the cardinality of the unknown keys
          - If low, it considers an ORScan for each value
          - If high, it considers an index leaf scan
      - Then it considers the selectivity of the known predicates
    So…as a result
      - If we pick a date that is fairly recent (the index is more selective), then we will likely do an ORScan and then an index leaf scan from the begin date until the next job_category
      - If we pick a date that isn’t very selective, then the ORScan becomes too expensive due to the leaf scan per ORScan value, and we compare the multiple index leaf scans vs. a single table scan
  • 70. Scenario #2: The Solution
    Solution: Add the implied boundary to the date query
        Select * from jobs
        Where job_begindate between $begin_date and $end_date
        and job_enddate between $begin_date and $end_date
    …and this is why we fix both the index and the query
      - In the above case, considering the index in scenario #2, as long as the range is fairly selective, we likely will do the ORScan
  • 71. 71Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group OrScan in Lava TreeOrScan in Lava Tree ==================== Lava Operator Tree ==================== Emit (VA = 4) r:5 er:1 cpu: 0 / NestLoopJoin Inner Join (VA = 3) r:5 er:1 l:0 el:8 p:0 ep:8 / OrScan Restrict Max Rows: 2 (0)(0)(0)(4)(0) (VA = 0) (VA = 2) r:2 er:-1 r:5 er:1 l:0 el:-1 p:0 ep:-1 / IndexScan TBTCO~7 (VA = 1) r:9 er:1 l:28 el:8 p:0 ep:8 ============================================================
  • 72. 72Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group OrScan in Show PlanOrScan in Show Plan |ROOT:EMIT Operator (VA = 6) | | |NESTED LOOP JOIN Operator (VA = 5) (Join Type: Inner Join) | | | | |NESTED LOOP JOIN Operator (VA = 3) (Join Type: Inner Join) | | | | | | |SCAN Operator (VA = 0) | | | | FROM OR List | | | | OR List has up to 12 rows of OR/IN values. | | | | | | |RESTRICT Operator (VA = 2)(0)(0)(0)(13)(0) | | | | | | | | |SCAN Operator (VA = 1) | | | | | FROM TABLE | | | | | SAPSR3.MSEG | | | | | T_01 | | | | | Index : MSEG~1 | | | | | Forward Scan. | | | | | Positioning by key. | | | | | Keys are: | | | | | MANDT ASC | | | | | MATNR ASC | | | | | Using I/O Size 128 Kbytes for index leaf pages. | | | | | With LRU Buffer Replacement Strategy for index leaf pages. | | | | | Using I/O Size 128 Kbytes for data pages. | | | | | With LRU Buffer Replacement Strategy for data pages. | |
  • 73. Scenario #3
    Consider the following index:
        create index job_begin_idx
        on jobs (job_category, job_status, job_begindate, job_enddate)
    ...and the typical query:
        select * from jobs
        where job_category = 'night batch'
          and job_status in ('U', 'A', 'E')
          and job_begindate >= $begin_date
          and job_enddate <= $end_date
    Why might we only position by job_category, job_status?
  • 74. Scenario #3: The Problem
    The problem is that we don't have multi-density stats.
      - And creating them might be a bit of a nightmare.
    As a result, ASE does the following:
      - It weighs each selectivity individually:
          - 'night batch' + 'U' + $begin_date
          - 'night batch' + 'A' + $begin_date
          - 'night batch' + 'E' + $begin_date
      - Then it aggregates.
    Here's the problem. Assume we only have 20 steps.
      - Let's pick a begin date 3 or more steps from the end...
          - ...and assume end_date is in the same step
          - ...but remember, we have an unbounded range on both, so:
              - effectively it will think it will be 3 steps for each $begin_date, not 1
              - and it will think $end_date is atrocious, as it is 17 steps' worth (from the beginning)
      - If we aggregate, we have 3x that, so 9 steps. 40% of the table is 8 steps, so we might table scan or look for a different index.
  • 75. Scenario #3: The Solution
    Update column stats for distinctive columns.
      - Use 100 steps or a similarly large value:
            update statistics jobs (job_begindate) using 100 values
      - The result is that each step has a much lower selectivity value.
    Add the bounded range into the query.
      - This means we aggregate only across the exact range of dates we want, which reduces the impact of the IN() clause.
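The step arithmetic in scenario #3 can be sketched as follows. This is only an illustration under loud assumptions: all histogram steps carry equal weight, and a range that touches a step is charged the whole step. It is not ASE's actual costing formula, and the function names are made up for the example.

```python
# Worked arithmetic for the 20-step example. Assumes equal-weight histogram
# steps and that a range touching a step is charged the whole step; both are
# simplifications for illustration, not ASE's real costing logic.

def table_fraction(steps_covered: int, total_steps: int) -> float:
    """Fraction of the table represented by a number of equal-weight steps."""
    return steps_covered / total_steps


def aggregated_steps(steps_per_term: int, in_list_size: int) -> int:
    """Each IN() value is costed on its own, then the step counts are summed."""
    return steps_per_term * in_list_size


# Unbounded begin date with 20 steps: 3 steps per IN() value, 3 IN() values.
unbounded_20 = table_fraction(aggregated_steps(3, 3), 20)  # 9 of 20 steps
scan_threshold = table_fraction(8, 20)                     # the slide's 40%

# Bounded date range that fits inside one step, with 20 and with 100 steps.
bounded_20 = table_fraction(aggregated_steps(1, 3), 20)
bounded_100 = table_fraction(aggregated_steps(1, 3), 100)

if __name__ == "__main__":
    print(f"unbounded, 20 steps: {unbounded_20:.0%} vs threshold {scan_threshold:.0%}")
    print(f"bounded, 20 steps:   {bounded_20:.0%}")
    print(f"bounded, 100 steps:  {bounded_100:.0%}")
```

Under these assumptions the unbounded estimate lands above the 40% mark, while bounding the range and raising the step count both pull the aggregated estimate down.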
  • 76. ASE's OR Strategy
    If the query contains an OR clause on different columns:
      - ASE can (and will) use two different indexes:
          - One index for predicates on one side of the OR
          - ...and a different index for predicates on the other side of the OR
          - This is similar to splitting the query in two with a union
      - However, if one side of the OR drives a tablescan, ASE will tablescan.
          - Remember, we saw this with the id=8 OR 1=2 example
    Common issues:
      - One side of the OR is not indexed well, which drives a tablescan
      - The developer attempted to use one index to cover both sides
  • 77. An Example of Indexing vs. OR
    Consider the following query:
        SELECT "VBELV", "POSNV", "VBELN", "POSNN", "VBTYP_N", "RFMNG", "MEINS", "VBTYP_V",
               "ERDAT", "ERZET", "AEDAT", "STUFE", "VRKME"
        FROM "VBFA"
        WHERE "MANDT" = ? AND ( "ERDAT" = ? OR "AEDAT" = ? )
        /* R3:SAPLZFEDWS1:767 T:VBFA M:430 */
    Now, consider the indexes:
        index_name   index_keys
        ----------   ------------------------------------------
        VBFA~0       MANDT, VBELV, POSNV, VBELN, POSNN, VBTYP_N
        VBFA~Z01     MANDT, VBELN
        VBFA~Z02     ERDAT, BWART
        VBFA~Z04     MANDT, ERDAT, AEDAT
        VBFA~Z99     MANDT, LOGSYS
    The issue is that the query seems to drive a tablescan.
      - It seems obvious that VBFA~Z04 should be used...
      - ...or is it?
  • 78. Let's look a little closer
    Looking at systabstats:
        ColumnName  ColumnID  Row_Count   RequestedSteps  ActualSteps  ApproxDistincts  DistinctsPerStep
        ----------  --------  ----------  --------------  -----------  ---------------  ----------------
        AEDAT             22  1255008198              50           50             1625              33.0
        BWART             17  1255008198              50           29               64               2.0
        ERDAT             14  1255008198              50          245             4674              19.0
        LOGSYS            38  1255008198              50            2                1               1.0
        MANDT              1  1255008198              50            2                1               1.0
        POSNN              5  1255008198              50          573            93300             163.0
        POSNV              3  1255008198              50          231            12649              55.0
        VBELN              4  1255008198              50           38         85330918         2245550.0
        VBELV              2  1255008198              50           38         31223216          821664.0
        VBTYP_N            6  1255008198              50           31               25               1.0
    Hmmm... not very good query criteria.
      - MANDT is useless, as always.
      - AEDAT and ERDAT are not very distinct: 1625 and 4674 values respectively.
          - Which means each distinct value will return ~250K to ~1M rows.
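The "~250K to ~1M" figure follows from dividing the row count by the approximate number of distinct values. A quick check, assuming values are evenly distributed (which real data rarely is, so treat it as an average only):

```python
# Average rows per distinct value, using the Row_Count and ApproxDistincts
# figures from the table above. Even distribution is an assumption made
# purely for this back-of-the-envelope estimate.

ROW_COUNT = 1_255_008_198
APPROX_DISTINCTS = {"AEDAT": 1625, "ERDAT": 4674}


def avg_rows_per_value(row_count: int, distinct_values: int) -> float:
    """Average number of rows sharing one value of the column."""
    return row_count / distinct_values


rows_per_value = {
    column: avg_rows_per_value(ROW_COUNT, distincts)
    for column, distincts in APPROX_DISTINCTS.items()
}

if __name__ == "__main__":
    for column, rows in rows_per_value.items():
        print(f"{column}: ~{rows:,.0f} rows per distinct value")
```

Neither column comes close to identifying a handful of rows on its own, which is why an equality predicate on either date is a weak search argument on a table this size.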
  • 79. AEDAT Stats... from optdiag
        Statistics for column: AEDAT
        Last update of column statistics: Jan 10 2014 7:21:35:026PM
        Range cell density: 0.0000017268359901
        Total density: 0.9986527756879466
        …
        Unique range values: 0.0000004149259654
        Unique total values: 0.0006153846153846
        …
        Histogram for column: AEDAT
        Column datatype: varchar(24)
        …
        Statistics step count sticky
        Statistics hashing sticky
        Statistics hashing low domain used

        Step  Weight      Value (only 255 bytes used)
           1  0.00000000  <  '00000000'
           2  0.99932617  =  '00000000'
           3  0.00001720  <= '20080724'
           4  0.00001430  <= '20080826'
           5  0.00001409  <= '20081030'
           6  0.00001545  <= '20081113'
           7  0.00001415  <= '20081216'
           8  0.00001419  <= '20090310'
           9  0.00001468  <= '20090331'
          10  0.00002772  <= '20090615'
        …
    OUCH! (Step 2: the value '00000000' carries a weight of 0.99932617.)
  • 80. ERDAT Stats... from optdiag
        Statistics for column: ERDAT
        Last update of column statistics: Jan 10 2014 7:21:35:026PM
        Range cell density: 0.0005738551548958
        Total density: 0.0006834762135235
        …
        Unique range values: 0.0001879716956084
        Unique total values: 0.0002139495079161
        …
        Requested step count: 50
        Actual step count: 245
        …
        Statistics step count sticky
        Statistics hashing sticky
        Statistics hashing low domain used

        Step  Weight      Value (only 255 bytes used)
           1  0.00000000  <  '00000000'
           2  0.00004201  =  '00000000'
           3  0.01879592  <= '20030624'
           4  0.01879998  <= '20040316'
           5  0.01888011  <= '20041015'
           6  0.01887963  <= '20050502'
           7  0.01878721  <= '20051031'
           8  0.01888958  <= '20060420'
           9  0.01879898  <= '20061014'
          10  0.01882141  <= '20070417'
    BETTER!
  • 81. To understand, let's simplify things
    Assume we have a table of customer transactions...
      - with 1 billion rows
      - PKEY is transaction_id (not that it matters)
      - It has an index (IDX~1) on {purchase_date, ship_date}
          - Neither purchase_date nor ship_date is very distinct
          - Think about it: only 365 values in a year, ~3600 in 10 years. That is not very distinctive in a 1-billion-row table.
    Now consider the query:
        select * from cust_transactions
        where purchase_date = 'Jan 1 2014' or ship_date = 'Jan 1 2014'
    See the problem? Think about it...
  • 82. The Problem
    The problem query:
        select * from cust_transactions
        where purchase_date = 'Jan 1 2014' or ship_date = 'Jan 1 2014'
    The problems:
      - We can use the index IDX~1 for the purchase_date case, depending of course on the selectivity of the value provided.
      - ...but the OR clause means that we also need to look for the ship date
          - individually, and not in combination with the purchase date. Remember, a composite index works on COMBINING columns.
      - Using IDX~1 for that is sort of useless: we can't use the leading purchase_date column because the OR clause is disjunctive. The query really could be expressed as:
            select * from cust_transactions where purchase_date = 'Jan 1 2014'
            union
            select * from cust_transactions where ship_date = 'Jan 1 2014'
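The logical equivalence between the OR form and the UNION form can be checked with a small, hedged sketch. It uses Python's built-in sqlite3 module rather than ASE, so it only demonstrates that the two queries return the same rows; it says nothing about the plan ASE would choose. The sample rows and ISO date strings are invented for the example.

```python
import sqlite3

# Demonstrates result equivalence only. SQLite stands in for ASE here, and
# the four sample rows are hypothetical.

OR_QUERY = """
    select * from cust_transactions
    where purchase_date = ? or ship_date = ?
"""

UNION_QUERY = """
    select * from cust_transactions where purchase_date = ?
    union
    select * from cust_transactions where ship_date = ?
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a few illustrative transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table cust_transactions ("
        " transaction_id integer primary key,"
        " purchase_date text, ship_date text)"
    )
    conn.executemany(
        "insert into cust_transactions values (?, ?, ?)",
        [
            (1, "2014-01-01", "2014-01-03"),  # matches purchase_date only
            (2, "2013-12-30", "2014-01-01"),  # matches ship_date only
            (3, "2014-01-01", "2014-01-01"),  # matches both sides of the OR
            (4, "2014-01-02", "2014-01-05"),  # matches neither
        ],
    )
    return conn


def run(conn: sqlite3.Connection, sql: str, day: str) -> list:
    """Run a two-parameter query with the same date on both sides."""
    return sorted(conn.execute(sql, (day, day)).fetchall())


if __name__ == "__main__":
    db = build_demo_db()
    print(run(db, OR_QUERY, "2014-01-01"))
    print(run(db, UNION_QUERY, "2014-01-01"))
```

Row 3 qualifies on both sides but appears once in the UNION result because UNION removes duplicates, which is what keeps the two forms equivalent.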
  • 83. Remember the special OR strategy?
    When an OR condition exists:
      - ASE can use multiple indexes: a different index for each side of the OR.
      - This 'special OR strategy' is also known as 'index union'.
    When looking at the query and index:
      - ASE says the index is probably okay for purchase_date...
      - ...but says it will need to tablescan for ship_date.
      - Why the tablescan?
          - Remember, this is a DOL table and the index keys are sorted by purchase_date, then ship_date
          - ...so we would have to scan ALL the leaf pages to find that ship_date
          - ...only to find out that 1/4000th of the table qualifies
          - ...and the rows are scattered around due to purchase date, so LIO exceeds the cost of a tablescan and we do the tablescan
          - ...especially if we have an OR value of '00000000', which is 99% of the table
  • 84. What about IN()?
    If you were watching closely, you already know the answer.
    If you think about it:
      - ...an IN() is like an OR list
      - ...in fact, ASE flattens it into one
    So, all we do is:
      - Cost each one individually
      - Aggregate them into a final cost
  • 85. 85Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group A SimpleIN() exampleA SimpleIN() example 1> select * from sysobjects where id in (2,4,6,8,10,12,14,16) The Lop tree: ( project ( scan sysobjects ) ) OptBlock0 The Lop tree: ( scan sysobjects ) Generic Tables: ( Gtt0( sysobjects ) Gti1( csysobjects ) ) Generic Columns: … Predicates: ( ( { sysobjects.id } = 16 tc:{25} OR{ sysobjects.id } = 14 tc:{25} OR { sysobjects.id } = 12 tc:{25} OR{ sysobjects.id } = 10 tc:{25} OR { sysobjects.id } = 8 tc:{25} OR{ sysobjects.id } = 6 tc:{25} OR { sysobjects.id } = 4 tc:{25} OR{ sysobjects.id } = 2 tc:{25} ) tc:{25} ) Transitive Closures: …) IN() clause is expanded to OR’s….note that all have the same transitive closure id (tc:{25})
  • 86. 86Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Individual OR term selectivityIndividual OR term selectivity BEGIN GENERAL OR ANALYSIS OF all types of indices FOR sysobjects ANALYZING OR TERM 1 Estimating selectivity of index 'sysobjects.csysobjects', indid 3 id = 16 Estimated selectivity for id, selectivity = 0.1, scan selectivity 0.02272727, filter selectivity 0.02272727 restricted selectivity 0.1 unique index with all keys, one row scans 1 rows, 1 pages … ANALYZING OR TERM 2 Estimating selectivity of index 'sysobjects.csysobjects', indid 3 id = 14 … ANALYZING OR TERM 3 Estimating selectivity of index 'sysobjects.csysobjects', indid 3 id = 12 … ANALYZING OR TERM 4 Estimating selectivity of index 'sysobjects.csysobjects', indid 3 id = 10 … ==================== Lava Operator Tree ==================== Emit (VA = 3) r:8 er:5 cpu: 0 / NestLoopJoin Inner Join (VA = 2) r:8 er:5 l:0 el:5 p:0 ep:4 / OrScan IndexScan Max Rows: 8 csysobjects (VA = 0) (VA = 1) r:8 er:-1 r:8 er:5 l:0 el:-1 l:12 el:5 p:0 ep:-1 p:0 ep:4 ============================================================
  • 87. Aggregating Selectivity for OR
        END GENERAL OR ANALYSIS FOR all types of indices - INDICES FOUND FOR ALL OR TERMS
        Scan on table sysobjects skipped because table scan less than concurrency threshold
        Estimating selectivity of index 'sysobjects.csysobjects', indid 3
        Estimated selectivity for id, selectivity = 0.8,
        scan selectivity 0.8, filter selectivity 0.8
        restricted selectivity 1
        special or terms 8
        35.2 rows, 1 pages
        Data Row Cluster Ratio 0.99999
        Index Page Cluster Ratio 1
        Data Page Cluster Ratio 1
        using no index prefetch (size 4K I/O)
        in index cache 'default data cache' (cacheid 0) with LRU replacement
        using no table prefetch (size 4K I/O)
        in data cache 'default data cache' (cacheid 0) with LRU replacement
        Data Page LIO for 'csysobjects' on table 'sysobjects' = 1.600336
    Whoa! The prediction is 80% of the table, which had 44 rows. Thankfully, in *this* case, it still was only 1 page.
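The 0.8 selectivity and the 35.2-row estimate are consistent with simply adding the per-term selectivities. A minimal sketch, assuming each of the 8 OR terms gets the 0.1 selectivity shown for term 1 in the trace and that aggregation is a plain sum capped at 1.0 (an assumption that happens to match these numbers, not a statement of ASE's internal formula):

```python
# Reproduces the aggregated numbers from the trace under two assumptions:
# every OR term has selectivity 0.1, and terms are summed with a cap at 1.0.

def aggregate_or_selectivity(term_selectivities: list) -> float:
    """Sum individual OR-term selectivities, never exceeding the whole table."""
    return min(1.0, sum(term_selectivities))


TABLE_ROWS = 44                # sysobjects row count quoted on the slide
OR_TERMS = [0.1] * 8           # id in (2, 4, 6, 8, 10, 12, 14, 16)

aggregated = aggregate_or_selectivity(OR_TERMS)
estimated_rows = aggregated * TABLE_ROWS

if __name__ == "__main__":
    print(f"aggregated selectivity: {aggregated:.2f}")
    print(f"estimated rows:         {estimated_rows:.1f}")
```

The real answer is at most 8 rows, since id is a unique key, so the summed estimate overshoots by more than a factor of four.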
  • 88. Aggregating IN()
    Aggregation is unintelligent.
      - It doesn't check how many values are from the same range cell.
    The result is that the aggregated value is often over-inflated.
    TIP: Make sure you have more histogram steps than the largest IN() list.
      - For SAP systems, this will be 100.
  • 89. Out of range histograms
    Originally added to ASE 15.0 for monotonic sequences.
      - For example, sequential numbers or datetime values (e.g. the current datetime).
      - Often, if stats were only updated every week, a large portion of the new data values were higher than the histogram range.
          - As a result, the optimizer would estimate 0 values and select an index based on that reduced cost estimate, whereas in reality there could be millions of rows.
      - With out of range histograms, several factors are used to estimate how many data values exist beyond the last histogram cell, and the cost is adjusted higher.
    Usually in such cases, out of range histograms are a sign of stale stats.
      - ...but for high insert/append use cases, you may be updating or re-reading a row that was just inserted, e.g. reporting on today's sales.
  • 90. Low Cardinality Examples
    Histogram tuning may be a bad thing for short-duration "STATUS" columns.
      - Most of the values in the histogram will be "C" for complete.
      - Unless there is a "permanent" status higher than "U" for unprocessed, it is unlikely that update stats will catch a "U" value.
          - During migration, the system is likely quiesced with nothing incomplete.
          - Post-migration, if stats are run during a quiet period, likely no incomplete values exist.
      - The out of range histogram throws off the optimizer; 0 would have been a better estimate.
          - Running update stats on weekends or nights when the system is quiet simply causes the same problem, as jobs are likely all complete.
      - Spotted with 'set option show on'.
    May also happen with very low cardinality "TYPE" columns.
      - Or, in reality, with any very low cardinality column when the value in the predicate occurs extremely rarely and is higher than the more common value(s).
  • 91. Example Histogram
        Histogram for column: "ENTRY_TYPE"
        …
        Out of range Histogram Adjustment is DEFAULT.
        Sticky step count.
        Sticky partial_hashing.

        Step  Weight      Value
           1  0.00000000  <  "C"
           2  1.00000000  =  "C"

        Histogram for column: "STATUS"
        …
        Out of range Histogram Adjustment is DEFAULT.
        Low Domain Hashing.
        Sticky step count.
        Sticky partial_hashing.

        Step  Weight      Value
           1  0.00000000  <  "C"
           2  0.98791176  =  "C"
           3  0.00084806  <  "T"
           4  0.01124019  =  "T"
  • 92. 92Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Example‘set option show output’Example‘set option show output’ Estimating selectivity of index 'SAPSR3.ESH_EX_CPOINTER.ESH_EX_CPOINTER~ST', indid 3 STATUS = 'U' ENTRY_TYPE = 'P' Estimated selectivity for ENTRY_TYPE, Out of range histogram adjustment, selectivity = 0.3333333, Estimated selectivity for STATUS, Out of range histogram adjustment, selectivity = 0.2, scan selectivity 0.2, filter selectivity 0.2 60412.2 rows, 34.2 pages Data Row Cluster Ratio 0.9924527 Index Page Cluster Ratio 0.218543 Data Page Cluster Ratio 0.02202437 using index prefetch (size 128K I/O) Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages in index cache 'default data cache' (cacheid 0) with LRU replacement
  • 93. To prevent out of range histograms
    Turn them off with update statistics.
      - Turn off for columns, not for a whole table or a specific index.
      - Syntax:
            update statistics table_name
              [[partition data_partition_name]
               [ (column1, column2, …) | (column1), (column2), …] |
               index_name [partition index_partition_name]]
              [using step values | [out_of_range [on | off | default]]]
              [with consumers = consumers][, sampling = N percent]
              [, no_hashing | partial_hashing | hashing]
              [, max_resource_granularity = N [percent]]
              [, histogram_tuning_factor = int]
              [, print_progress = int]
      - Example:
            update statistics SAPSR3.ESH_EX_CPOINTER (ENTRY_TYPE) out_of_range off
            update statistics SAPSR3.ESH_EX_CPOINTER (STATUS) out_of_range off
    The out of range histogram setting is "sticky".
      - Just like the number of steps, setting this once causes it to be used as the default for all future update statistics runs that do not specify a value.
  • 94. OPTIMIZATION COSTING (PART 2): Multi-Column Densities & Joins...
  • 95. Multi-Column Densities
    An underused secret weapon.
      - Useful any time multiple predicates exist.
      - Think of it this way:
          - Two sample predicates:
              - Col_A = '5'
              - Col_B = 'GREEN'
          - Assume both have a selectivity of 0.1:
              - The combination could still be 0.1 if all Col_A = '5' and Col_B = 'GREEN' rows are the same rows.
              - The combination could be 0.01 (or less) if only a single row had the combination.
    When does it matter?
      - Joins, distinct, subquery (caching), sort estimations, ...
      - Anyplace where the estimated number of rows returned could change the query plan (and tip costs towards an alternative 'bad' plan).
      - Especially since we don't have composite column histograms.
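The two extremes for Col_A and Col_B can be made concrete with a tiny, hypothetical data set. Both tables below have 100 rows and give each predicate a selectivity of 0.1 on its own; only the overlap differs. The row layouts are invented for illustration.

```python
# Two hypothetical 100-row tables in which Col_A = '5' and Col_B = 'GREEN'
# each match 10% of the rows individually, but overlap very differently.

def column_selectivity(rows: list, column: int, value: str) -> float:
    """Fraction of rows where the given column equals the value."""
    return sum(1 for row in rows if row[column] == value) / len(rows)


def combined_selectivity(rows: list, a_value: str, b_value: str) -> float:
    """Fraction of rows matching both predicates at once."""
    hits = sum(1 for a, b in rows if a == a_value and b == b_value)
    return hits / len(rows)


# Fully correlated: the same 10 rows satisfy both predicates.
correlated = [("5", "GREEN")] * 10 + [("0", "RED")] * 90

# Barely overlapping: 10 rows per predicate, but only 1 row satisfies both.
scattered = (
    [("5", "GREEN")] * 1
    + [("5", "RED")] * 9
    + [("0", "GREEN")] * 9
    + [("0", "RED")] * 81
)

if __name__ == "__main__":
    for name, rows in (("correlated", correlated), ("scattered", scattered)):
        print(name, combined_selectivity(rows, "5", "GREEN"))
```

Single-column histograms look identical for the two tables, which is exactly the gap a multi-column density is meant to fill.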
  • 96. 96Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Multi-Column Density (Index)Multi-Column Density (Index) Statistics for index: "aqi_weather_date_idx" (nonclustered) Index column list: "sample_date", "air_temp", "weather" Leaf count: 254345 Data page CR count: 167946797.0000000000000000 Index page CR count: 32018.0000000000000000 Data row CR count: 168066295.0000000000000000 Leaf row size: 6.1150672008890936 Index height: 3 Statistics for column group: "sample_date", "air_temp" Last update of column statistics: May 27 2014 11:45:45:016AM Range cell density: 0.0000051768562637 Total density: 0.0000051768562637 Range selectivity: default used (0.33) In between selectivity: default used (0.25) Unique range values: 0.0000016563476210 Unique total values: 0.0000016563476210 Average column width: default used (2.00) Rows scanned: 168066824.0000000000000000 Statistics version: 4 Statistics for column group: "sample_date", "air_temp", "weather" Last update of column statistics: May 27 2014 11:45:45:016AM Range cell density: 0.0000051075008894 Total density: 0.0000051075008894 Range selectivity: default used (0.33) In between selectivity: default used (0.25) Unique range values: 0.0000016297687032 Unique total values: 0.0000016297687032 Average column width: 8.5268955638740458 Rows scanned: 168066824.0000000000000000 Statistics version: 4 This is the cost of a covered query (less any portion of index not needed) The ‘weather’ column must not be very distinct as it doesn’t alter the table total density or range density by very much If the IO cost of the index is ~page count and the IO cost for the table is near the leaf count – it is doing an index scan and then following each leaf…. Often not a good strategy unless only a few rows Any NL join using this index would need to traverse the index tree this many times per outer row (Note: Index cluster ratios removed due to space)
  • 97. Using a Multi-Column Density
    Remember, we don't have composite histograms.
    First we consider the selectivity of each of the columns individually.
      - This gives us an idea of how many rows there could be.
      - For example, col_A has 2 rows and col_B has 5 rows:
          - The total range is between 2 and 10 rows.
          - The probability is likely closer to 2, but that depends on reality.
    Then we look at the multi-column density.
      - This is our flavor of reality to temper probability.
      - We use the above with a proprietary formula to compute the selectivity.
          - The more selective each column, the closer to the multi-column density.
  • 98. Example: Multi-Column Density
        Statistics for column group: "sample_date", "air_temp", "weather"
        Last update of column statistics: May 27 2014 11:45:45:016AM
        Range cell density: 0.0000051075008894
        Total density: 0.0000051075008894
        Range selectivity: default used (0.33)
        In between selectivity: default used (0.25)
        Unique range values: 0.0000016297687032
        Unique total values: 0.0000016297687032
        Average column width: 8.5268955638740458
        Rows scanned: 168066824.0000000000000000
        Statistics version: 4

        1> select l.city, l.county, s.sample_date, s.air_temp
        2> from aqi_locations l, aqi_samples s
        3> where l.location_id=s.location_id
        4> and s.sample_date = 'July 1 2000 12:00:00:000PM'
        5> and l.state='PA'
        6> and s.weather='Overcast'
        7> and s.air_temp = 90

        Estimating selectivity of index 'aqi_samples.aqi_weather_date_idx', indid 3
        sample_date = Jul 1 2000 12:00:00:000PM
        weather = 'Overcast'
        air_temp = 90
        Estimated selectivity for sample_date, selectivity = 0.0002490077,
        Estimated selectivity for air_temp, selectivity = 0.01104084,
        Estimated selectivity for weather, selectivity = 0.002359544,
        scan selectivity 5.11258e-006, filter selectivity 5.11258e-006
        859.2551 rows, 1.300359 pages
        Data Row Cluster Ratio 3.186365e-006
        Index Page Cluster Ratio 0.9989935
        Data Page Cluster Ratio 0.0007121012
        using no index prefetch (size 4K I/O)
        in index cache 'default data cache' (cacheid 0) with LRU replacement
        using no table prefetch (size 4K I/O)
        in data cache 'default data cache' (cacheid 0) with LRU replacement
        Data Page LIO for 'aqi_weather_date_idx' on table 'aqi_samples' = 859.2551
    Notes on the trace:
      - The sample_date selectivity is based on a single histogram cell.
      - The air_temp selectivity is based on a single histogram cell.
      - The weather selectivity is based on a single histogram cell.
      - The scan/filter selectivity estimate is based on the number of values for the above, combined with the multi-column density. Since there are only a few values for each, the selectivity is close to the multi-column density.
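The numbers in this trace can be cross-checked with a few multiplications. The sketch below assumes the table row count equals the "Rows scanned" value from the column-group statistics; the comparison against a naive product of the three single-column selectivities is added purely to show why the multi-column density matters and is not something the trace itself reports.

```python
# Cross-check of the trace figures. ROWS is taken from "Rows scanned" and is
# assumed to be the table row count at optimization time.

ROWS = 168_066_824
SCAN_SELECTIVITY = 5.11258e-06               # from the trace
MULTI_COLUMN_DENSITY = 0.0000051075008894    # total density, 3-column group
SINGLE_COLUMN = (0.0002490077, 0.01104084, 0.002359544)


def estimated_rows(selectivity: float, rows: int = ROWS) -> float:
    """Row estimate implied by a selectivity."""
    return selectivity * rows


def naive_product(selectivities: tuple) -> float:
    """Selectivity if the columns were treated as fully independent."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result


from_trace = estimated_rows(SCAN_SELECTIVITY)
from_density = estimated_rows(MULTI_COLUMN_DENSITY)
from_independence = estimated_rows(naive_product(SINGLE_COLUMN))

if __name__ == "__main__":
    print(f"scan selectivity      -> {from_trace:10.2f} rows")
    print(f"multi-column density  -> {from_density:10.2f} rows")
    print(f"independent columns   -> {from_independence:10.2f} rows")
```

The scan selectivity reproduces the trace's row estimate and sits very close to the multi-column density, while assuming independent columns would predict roughly one row.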
  • 99. Problem: Large Estimates
    In some cases, we can't use multi-column densities.
      - For example, the columns involved may have ranges of values.
      - The total estimate of rows could then be astronomical.
          - Perhaps even higher than the real rowcount.
    In such cases, we compute a 'smart' density.
      - We know the best case is the most selective column.
      - We then simply apply a formula to derive a selectivity.
          - Some cite sum(cell weight**2)
          - Others use W1*W2 + W1*W2*W3 ...
  • 100. Example: Multi-Column Estimate
        1> select l.city, l.county, s.sample_date, s.air_temp
        2> from aqi_locations l, aqi_samples s
        3> where l.location_id=s.location_id
        4> and s.sample_date between 'July 1 2000 00:00:01' and 'July 31 2000 23:59:59'
        5> and l.state='PA'
        6> and s.weather='Overcast'
        7> and s.air_temp < 85

        Estimating selectivity of index 'aqi_samples.aqi_weather_date_idx', indid 3
        sample_date >= Jul 1 2000 12:00:01:000AM
        sample_date <= Jul 31 2000 11:59:59:000PM
        weather = 'Overcast'
        air_temp < 85
        Estimated selectivity for sample_date, selectivity = 0.007751161,
        Estimated selectivity for air_temp, selectivity = 0.7523476,
        Estimated selectivity for weather, selectivity = 0.002359544,
        Intelligent Scan selectivity reduction from 0.007751161 to 0.005852389
        scan selectivity 0.005852389, filter selectivity 1.375984e-005
        restricted selectivity 0.007751161
        983592.5 rows, 1488.526 pages
        Data Row Cluster Ratio 3.186365e-006
        Index Page Cluster Ratio 0.9989935
        Data Page Cluster Ratio 0.0007121012
        using index prefetch (size 32K I/O)
        Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
        in index cache 'default data cache' (cacheid 0) with LRU replacement
        using no table prefetch (size 4K I/O)
        in data cache 'default data cache' (cacheid 0) with LRU replacement
        Data Page LIO for 'aqi_weather_date_idx' on table 'aqi_samples' = 2312.572
    Notes on the trace:
      - The sample_date selectivity is based on aggregating all the dates in the range.
      - The air_temp selectivity is based on all temps in the unbounded range.
      - The weather selectivity is based on single cell density.
      - The worst case projection is the most selective of the above.
      - A better estimate: we use a formula to derive a new value we think is more accurate for the scan selectivity (estimate of index rows and leaf pages). Loosely, it is sum(W1*W2...), e.g. W1*W2 + W1*W2*W3.
      - The filter selectivity (estimate of data pages) is the product of the weights, e.g. W1*W2*W3 or 0.007751161 * 0.7523476 * 0.002359544 = 0.0000137598.
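The filter selectivity and the two row-based figures in this trace can be reproduced directly. As a loudly stated assumption, the table row count is taken from the "Rows scanned" value (168066824) shown in the column-group statistics for this index; the scan selectivity itself is copied from the trace rather than derived, because the slide describes that formula only loosely.

```python
# Reproduces figures from the trace. ROWS is assumed to equal "Rows scanned"
# from the column-group statistics; SCAN_SELECTIVITY is taken as given.

ROWS = 168_066_824
W_SAMPLE_DATE = 0.007751161
W_AIR_TEMP = 0.7523476
W_WEATHER = 0.002359544
SCAN_SELECTIVITY = 0.005852389


def filter_selectivity(*weights: float) -> float:
    """Product of the individual predicate selectivities (W1*W2*W3)."""
    result = 1.0
    for w in weights:
        result *= w
    return result


filter_sel = filter_selectivity(W_SAMPLE_DATE, W_AIR_TEMP, W_WEATHER)
index_rows = SCAN_SELECTIVITY * ROWS   # rows qualified in the index scan
data_page_lio = filter_sel * ROWS      # rows that survive all three filters

if __name__ == "__main__":
    print(f"filter selectivity: {filter_sel:.6e}")
    print(f"index rows:         {index_rows:,.1f}")
    print(f"data page LIO:      {data_page_lio:,.3f}")
```

The gap between roughly 983 thousand index rows and about 2.3 thousand data page reads shows how much work the non-positioning predicates save once the index rows have been scanned.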
  • 101. When to create (multi-)column stats
    Okay, we know they are automatically created for index keys...
      - ...and used for joins.
    When do (or ought) we create our own?
      - On the 2nd through nth index keys (or a subset):
          - ASE creates stats on {A}, {A,B}, {A,B,C}, {A,B,C,D}
          - It might be useful to have {B,C,D} or {B,C}:
              - Helps trip OrScans if the leading column is frequently not a predicate
              - Helps with joins when the leading column is specified as a literal/lateral join (a la SAP)
      - On low cardinality columns we don't want to index:
          - ...but that are frequently used as predicates (such as gender)
          - Especially if often used in queries with joins (helps the inner/outer table decision)
    They are not automatically maintained with 'update index stats'.
      - You need to manually run update stats on each column density you create.
  • 102. Joins
    Traditional logic: driving table.
      - Put the table that seems to 'drive' the join as the outer table.
      - Typically, this will be the 'smaller' table (or smaller rowset).
      - The developer may know the driving table (e.g. #temp)...
      - ...but the optimizer has to figure it out:
          - Estimate rowsets from each table using index selectivity
          - Estimate joined rows from joining with each table in the list
              - Reducing joined rows by applying index selectivity as a filter
              - But remember, this is a guess at optimization time
    Alternative logic: pin the smaller table in cache.
      - Put the larger rowset table as the outer and scan it once.
      - The inner (smaller) table can be pinned in cache.
          - Avoids higher PIO
    In both cases, the multi-column stats on the join columns are key to rowset estimates.
  • 103. Join Strategies
    Remember, we have 3 types of joins:
      - Nested Loop Joins
      - Merge Joins (including Sort Merge Joins)
      - Hash Joins
    The optimizer needs to figure out which one is best.
      - For indexed joins, typically an NLJ will be best...
          - ...but this assumes the M:N ratio is reasonably small (e.g. 1:10)
      - A merge join is great for high cardinality joins.
          - M:N is high (1:1000+)
          - Especially if the inner table is sorted in join key sequence
      - A hash join works best when the join keys are not predicates but predicates eliminate a lot of rows on both sides of the join.
          - The outer table is filtered by predicates and the join keys are hashed into the build table
          - The inner table is filtered by predicates, the join key is hashed and probed for in the build table
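The build and probe phases described for the hash join can be sketched generically. This is a textbook-style illustration in plain Python, not ASE's implementation; the function signature, the dictionary-based rows, and the sample data are all hypothetical.

```python
from collections import defaultdict

# Generic hash join: filter and hash the outer side into a build table, then
# filter the inner side and probe. Illustrative only, not ASE internals.

def hash_join(outer_rows, inner_rows, outer_key, inner_key,
              outer_pred=lambda row: True, inner_pred=lambda row: True):
    """Return (outer_row, inner_row) pairs whose join keys match."""
    build = defaultdict(list)
    for row in outer_rows:                  # build phase
        if outer_pred(row):
            build[row[outer_key]].append(row)

    joined = []
    for row in inner_rows:                  # probe phase
        if inner_pred(row):
            for match in build.get(row[inner_key], []):
                joined.append((match, row))
    return joined


# Hypothetical sample data.
locations = [
    {"location_id": 1, "state": "PA"},
    {"location_id": 2, "state": "NY"},
    {"location_id": 3, "state": "PA"},
]
samples = [
    {"location_id": 1, "weather": "Overcast"},
    {"location_id": 1, "weather": "Clear"},
    {"location_id": 2, "weather": "Overcast"},
    {"location_id": 3, "weather": "Overcast"},
]

pairs = hash_join(
    locations, samples, "location_id", "location_id",
    outer_pred=lambda r: r["state"] == "PA",
    inner_pred=lambda r: r["weather"] == "Overcast",
)

if __name__ == "__main__":
    for loc, sample in pairs:
        print(loc["location_id"], loc["state"], sample["weather"])
```

Because both inputs are filtered before any hashing or probing, the build table stays small when the predicates are selective, which is the situation the slide describes as the hash join's sweet spot.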
  • 104. This is why stats are sooo... critical
    We use them to estimate:
      - The cardinality of the join
      - The rows that qualify from predicates (unjoined)
    If the estimates are off by a lot:
      - We likely predict it is a high cardinality join.
          - Remember, with 4 join keys, if we don't have stats on the other 3 columns, we use magic values of 0.1
      - With very high row counts projected from the inner table:
          - If we consider 3 levels of indexing and 10M rows, that's 40M LIO
          - Sorting 10M rows may only take 20M LIOs...
          - ...so we degrade into a Sort Merge Join (SMJ)
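The 40M versus 20M comparison can be written out as a rough cost model. Two assumptions are made here to reconcile the slide's numbers and are not taken from ASE documentation: each nested loop probe costs one LIO per index level plus one for the data page, and the sort costs about two LIOs per row.

```python
# Rough LIO comparison behind the slide's figures. The "+ 1 data page" and
# the "2 LIOs per sorted row" factors are assumptions chosen to match the
# 40M and 20M numbers quoted above.

def nested_loop_lio(projected_rows: int, index_levels: int) -> int:
    """One index traversal plus one data page read per projected row."""
    return projected_rows * (index_levels + 1)


def sort_lio(projected_rows: int, lio_per_row: int = 2) -> int:
    """Approximate LIO cost of sorting the projected rows."""
    return projected_rows * lio_per_row


def cheaper_strategy(projected_rows: int, index_levels: int) -> str:
    """Pick the lower LIO estimate of the two strategies."""
    if sort_lio(projected_rows) < nested_loop_lio(projected_rows, index_levels):
        return "sort merge join"
    return "nested loop join"


if __name__ == "__main__":
    rows, levels = 10_000_000, 3
    print(f"NLJ LIO:  {nested_loop_lio(rows, levels):,}")
    print(f"sort LIO: {sort_lio(rows):,}")
    print(f"choice:   {cheaper_strategy(rows, levels)}")
```

The point of the model is that the choice flips purely on the projected row count, so an inflated estimate from missing stats is enough to push the plan towards the sort merge join.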
  • 105. Join Keys: The Query
        SELECT TOP 1 T_00."PRRBA"
        FROM SAPSR3."/PXY/ACTUAL_DEP" T_00
        INNER JOIN SAPSR3."/PXY/SCD" T_01
           ON T_01."MANDT" = ?
          AND T_01."RBARE" = T_00."PRRBA"
          AND T_01."SCNA" = T_00."PRSCNA"
          AND T_01."EXECNO" = T_00."PREXEC"
          AND T_01."STEP" = T_00."PRST"
        WHERE T_00."MANDT" = ?
          AND T_00."SCNA" = ?
          AND T_00."EXECNO" = ?
          AND T_00."STEP" = ?
          AND T_00."RBARE" = ?
          AND T_01."STATUS" <> ?
          AND T_01."STATUS" <> ?
        /* R3:/PXY/SAPLRB:72334 T:/PXY/ACTUAL_DEP M:430 */

        create unique nonclustered index "/PXY/ACTUAL_DEP~0"
          on SAPSR3."/PXY/ACTUAL_DEP"(MANDT, SCNA, EXECNO, STEP, RBARE, PRSCNA, PREXEC, PRST, PRRBA)
        create nonclustered index "/PXY/ACTUAL_DEP~00"
          on SAPSR3."/PXY/ACTUAL_DEP"(MANDT, PRSCNA, PREXEC, PRST, PRRBA, SCNA, EXECNO, STEP, RBARE)
        create unique nonclustered index "/PXY/SCD~0"
          on SAPSR3."/PXY/SCD"(MANDT, RBARE, SCNA, EXECNO, STEP)
        create nonclustered index "/PXY/SCD~ID1"
          on SAPSR3."/PXY/SCD"(MANDT, SCNA, EXECNO, RBARE)
    Notice the lateral join on MANDT = <value>. Knowing that ASE has issues with literals at the beginning of the join, we will see if adding multi-column stats on {RBARE, SCNA, EXECNO, STEP} helps NLJoin costing.
  • 106. 106Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Join Keys– Bad Index UsageJoin Keys– Bad Index Usage | |TOP Operator (VA = 4) | | Top Limit: 1 | | |MERGE JOIN Operator (Join Type: Inner Join) (VA = 3) | | | Using Worktable2 for internal storage. | | | Key Count: 4 | | | Key Ordering: ASC ASC ASC ASC | | | |SORT Operator (VA = 1) | | | | Using Worktable1 for internal storage. | | | | |SCAN Operator (VA = 0) | | | | | FROM TABLE | | | | | SAPSR3./PXY/ACTUAL_DEP | | | | | T_00 | | | | | Index : /PXY/ACTUAL_DEP~0 | | | | | Forward Scan. | | | | | Positioning by key. | | | | | Index contains all needed columns. Base table will not be read. | | | | | Keys are: | | | | | MANDT ASC | | | | | SCNA ASC | | | | | EXECNO ASC | | | | | STEP ASC | | | | | RBARE ASC | | | | | Using I/O Size 16 Kbytes for index leaf pages. | | | | | With LRU Buffer Replacement Strategy for index leaf pages. | | | |SCAN Operator (VA = 2) | | | | FROM TABLE | | | | SAPSR3./PXY/SCD | | | | T_01 | | | | Index : /PXY/SCD~0 | | | | Forward Scan. | | | | Positioning by key. | | | | Keys are: | | | | MANDT ASC | | | | Using I/O Size 16 Kbytes for index leaf pages. | | | | With LRU Buffer Replacement Strategy for index leaf pages. | | | | Using I/O Size 16 Kbytes for data pages. | | | | With LRU Buffer Replacement Strategy for data pages.
  • 107. 107Annual Conference, 2015 (c) 2015 Independent SAP Technical User Group Join Permutation Costing(1)Join Permutation Costing(1) xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx BEGIN: Complete join order evaluation (perm #1) xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Permutation Order: Gt0( SAPSR3./PXY/ACTUAL_DEP T_00 ) |X| Gt1( SAPSR3./PXY/SCD T_01 ) joining using ( PopNlJoin () () ) cost:0 tempdb:0 order: none outer Pops: ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) cost:81.29999 T(L3,P3,C2.999999) O(L3,P3,C2.999999) tempdb:0 order: <3,2,1,9> ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) cost:114.148 T(L9.765611,P3.76561,C4.765611) O(L6,P0,C1) tempdb:0.001237151 order: {1,2,3,9} Has BmoSort inner Pops: ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) cost:1989.483 T(L73.16116,P73.16116,C141.3204) O(L70.16116,P70.16116,C140.3204) tempdb:0.0006185754 order: <9,3,2,1> joining using ( PopMergeJoin () () ) cost:0 tempdb:0 order: none outer Pops: ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) cost:81.29999 T(L3,P3,C2.999999) O(L3,P3,C2.999999) tempdb:0 order: <3,2,1,9> ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) cost:114.148 T(L9.765611,P3.76561,C4.765611) O(L6,P0,C1) tempdb:0.001237151 order: {1,2,3,9} Has BmoSort inner Pops: ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) cost:1162186 T(L183590.3,P5562.217,C6559500) O(L182634.3,P4606.217,C4055874) tempdb:0 order: <3,2,9> ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) cost:614.7092 T(L20.83115,P20.78714,C533.6843) O(L17.83115,P17.78714,C355.7895) tempdb:0 order: <9,3,2,1> ( PopSort ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) ) cost:4406059 T(L44736.09,P46577.09,C3.15216e+07) O(L1851,P3692,C3.147871e+07) tempdb:3077.973 order: {1,2,3,9} Has BmoSort
  • 108. Join Permutation Costing (2)
    Eagerly enforcing...
    the cheapest Pop:
      ( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
    ... Pop enforcers:
    ... PopLet enforcers:
    ... done eager enforcement.
    All Pops/PopLets before EqcN selection:
    -> initial Pops:
      ( PopMergeJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) ) cost:1288721 T(L191677,P7108.215,C7276614) O(L8083.682,P1542.997,C717110.6) tempdb:0 order: none
      ( PopMergeJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopSort ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) ) ) cost:4406148 T(L44739.09,P46580.09,C3.152167e+07) O(L0,P0,C70.16021) tempdb:1538.986 order: none
      ( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) ) cost:1162645 T(L183600,P5565.983,C6562956) O(L0,P0,C3451.033) tempdb:0.0006185754 order: none
      ( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
      ( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) ( PopSort ( PopRidJoin ( PopIndScan /PXY/SCD~ID1 SAPSR3./PXY/SCD T_01 ) ) ) ) cost:4406180 T(L44745.86,P46580.86,C3.152167e+07) O(L0,P0,C70.16021) tempdb:1538.987 order: none Has BmoSort
      ( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
      ( PopNlJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:2103.631 T(L82.92677,P76.92677,C146.086) tempdb:0.0006185754 order: none Has BmoSort
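To make the spread between the seven candidate join Pops on this slide easier to see, the sketch below ranks them by their traced `cost:` value and reports each one relative to the cheapest. It is illustrative only: the labels are shortened paraphrases of the Pop trees, the names are hypothetical, and the real optimizer also weighs ordering properties and enforcers rather than sorting by cost alone.

```python
# (shortened label, cost printed in the trace) for the seven initial Pops.
CANDIDATES = [
    ("MergeJoin(IndScan ACTUAL_DEP~0, RidJoin SCD~ID1)", 1288721.0),
    ("MergeJoin(IndScan ACTUAL_DEP~0, Sort(RidJoin SCD~ID1))", 4406148.0),
    ("MergeJoin(Sort(IndScan ACTUAL_DEP~0), RidJoin SCD~ID1)", 1162645.0),
    ("MergeJoin(Sort(IndScan ACTUAL_DEP~0), RidJoin SCD~0)", 735.8733),
    ("MergeJoin(Sort(IndScan ACTUAL_DEP~0), Sort(RidJoin SCD~ID1))", 4406180.0),
    ("NlJoin(IndScan ACTUAL_DEP~0, RidJoin SCD~0)", 2070.783),
    ("NlJoin(Sort(IndScan ACTUAL_DEP~0), RidJoin SCD~0)", 2103.631),
]


def rank_by_cost(candidates):
    """Return the candidates sorted from cheapest to most expensive."""
    return sorted(candidates, key=lambda item: item[1])


def cheapest(candidates):
    """Return the (label, cost) pair with the lowest traced cost."""
    return min(candidates, key=lambda item: item[1])


if __name__ == "__main__":
    best_label, best_cost = cheapest(CANDIDATES)
    for label, cost in rank_by_cost(CANDIDATES):
        # Ratio shows how many times more expensive each plan is than the best.
        print(f"{cost:>12.4f}  x{cost / best_cost:>9.1f}  {label}")
```

The cheapest entry matches the Pop reported under "the cheapest Pop" in the trace. Every plan that reaches `/PXY/SCD` through the `~ID1` index costs more than a thousand times as much.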
  • 109. Join Permutation Costing (3)
    Eqc competition ...
    initial old Pops:
      ( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
    initial new Pops:
      ...
    pruned new against total 0
    pruned new against old 5
    pruned old against new 1
    kept old Pops:
      ( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
    kept new Pops:
      ( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
    ... done Eqc competition.
    ... done join visit.
    Join plans selected for this permutation:
    OptBlock0 Eqc{0,1} -> Pops added for the join Eqc{0} - Eqc{1}:
      ( PopMergeJoin ( PopSort ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:735.8733 T(L30.59676,P24.55275,C608.61) O(L0,P0,C70.16021) tempdb:0.0006185754 order: none
    move greedy pops to new list
      ( PopNlJoin ( PopIndScan /PXY/ACTUAL_DEP~0 SAPSR3./PXY/ACTUAL_DEP T_00 ) ( PopRidJoin ( PopIndScan /PXY/SCD~0 SAPSR3./PXY/SCD T_01 ) ) ) cost:2070.783 T(L76.16116,P76.16116,C144.3204) tempdb:0 order: none
    ... done move greedy pops to new list.
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    DONE: Complete join order evaluation (perm #1)
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    Slide note: “old Pops” = 12.5-style optimization (note that the cost is >2000).
  • 110. Join Permutation Costing (4)
    ** Costing set up for RowLimit optimization **
    TopLogProps0( SAPSR3./PXY/ACTUAL_DEP T_00 ) - TopPred: [Tc{} Pe{0,1,2,3,4}] TopSubst: {1,2,3,4,5,6,7,8,9,17}
    TopLogProps0( SAPSR3./PXY/SCD T_01 ) - TopPred: [Tc{} Pe{5,6,7}] TopSubst: {11,12,13,14,15,16}
    Statistics for rows returned to client...
      Estimated rows :14073.64
      Estimated row width :7.002473
      Estimated client cost is :78.59161
    Estimating selectivity of index 'SAPSR3./PXY/SCD./PXY/SCD~0', indid 2
      MANDT = '430'
      Estimated selectivity for MANDT, selectivity = 1, scan selectivity 1, filter selectivity 1
      Cost adjusted for RowLimit optimization, Adjustment ratio 7.105484e-05
      2503626 rows, 6283 pages
      Adjustment ratio 7.105484e-05 applied gives 177.8947 rows, 1 pages
      Data Row Cluster Ratio 0.9107559
      Index Page Cluster Ratio 0.9874477
      Data Page Cluster Ratio 0.242736
      using index prefetch (size 128K I/O)
      Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
      Adjustment using index prefetch (size 128K I/O)
      in index cache 'default data cache' (cacheid 0) with LRU replacement
      using table prefetch (size 128K I/O)
      Large IO selected: The number of leaf pages qualified is > MIN_PREFETCH pages
      Adjustment using table prefetch (size 128K I/O)
      in data cache 'default data cache' (cacheid 0) with LRU replacement
      Data Page LIO for '/PXY/SCD~0' on table 'SAPSR3./PXY/SCD' = 17.83115
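The RowLimit adjustment in this trace can be checked by hand: the adjustment ratio is applied to the unadjusted row and page counts. The Python sketch below repeats that arithmetic with the values from the slide. How the optimizer derives the ratio is not shown in the trace, so the ratio is taken as given. Rounding the page count up to a minimum of one page is an assumption made to match the "1 pages" in the output, and the function name is hypothetical.

```python
import math

# Values copied from the trace on this slide.
ADJUSTMENT_RATIO = 7.105484e-05
UNADJUSTED_ROWS = 2_503_626
UNADJUSTED_PAGES = 6_283


def apply_row_limit(rows: int, pages: int, ratio: float) -> tuple[float, int]:
    """Scale row and page estimates by the RowLimit adjustment ratio.

    Assumption: pages are rounded up and never drop below one page,
    since a scan that returns any rows must read at least one page.
    """
    adjusted_rows = rows * ratio
    adjusted_pages = max(1, math.ceil(pages * ratio))
    return adjusted_rows, adjusted_pages


if __name__ == "__main__":
    rows, pages = apply_row_limit(UNADJUSTED_ROWS, UNADJUSTED_PAGES, ADJUSTMENT_RATIO)
    print(f"adjusted rows:  {rows:.4f}")
    print(f"adjusted pages: {pages}")
```

The unrounded page estimate is well under one page (6283 × 7.105484e-05 is roughly 0.45), which is why the costing of `/PXY/SCD~0` drops so sharply once the row limit is taken into account.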