Presentation by Tony Davis at SQL in The City 2016. An execution plan tells you exactly which tables and indexes SQL Server accessed, in what order, and what other operations it performed to return the data your query needed. But sometimes, the plan for even the simplest-looking query can reveal nasty surprises.
This session describes how SQL Server generates and reuses execution plans and the implications this has for you as the developer. After a quick-start guide to retrieving and reading plans, we'll focus on techniques that can help you track down high-cost queries quickly.
We'll cover tools such as ANTS Performance Profiler, as well as scripts that hunt down execution plans for queries that caused expensive scans, sort warnings, and other issues. Examining those plans, you'll uncover the root cause of the problem, often revealing issues such as inefficient indexing, data type mismatches, and misuse of functions.
Learn more about ANTS Performance Profiler: http://www.red-gate.com/products/dotnet-development/ants-performance-profiler/
Find out about all Redgate Products: http://www.red-gate.com/products/
Connect with Tony Davis on LinkedIn: https://www.linkedin.com/in/tony-davis-208b241
3. Why Bother Learning Execution Plans?
SELECT  ProductID ,
        ProductNumber ,
        dbo.LineItemTotal(ProductID) AS SumTotal
FROM    Production.Product p;
GO
SELECT  bth.ProductID ,
        bth.TransactionDate ,
        bth.Quantity ,
        bth.ActualCost
FROM    dbo.bigTransactionHistory bth
WHERE   bth.TransactionDate >= '20100701'
        AND bth.TransactionDate <= CURRENT_TIMESTAMP
ORDER BY bth.TransactionDate DESC;
GO
WITH    Totals
          AS ( SELECT   DATEADD(m, u.theMonth, 0) AS TheMonth ,
                        SUM(CASE WHEN u.theCol = 'DateJoined'
                                 THEN u.Registrations
                                 ELSE 0
                            END) AS PeopleJoined ,
                        SUM(CASE WHEN u.theCol = 'DateLeft'
                                 THEN u.Registrations
                                 ELSE 0
                            END) AS PeopleLeft
               FROM     ( SELECT    DATEDIFF(MONTH, 0, DateJoined) AS DateJoined ,
                                    DATEDIFF(MONTH, 0, DateLeft) AS DateLeft ,
                                    COUNT(*) AS Registrations
                          FROM      dbo.Registrations2
                          GROUP BY  DATEDIFF(MONTH, 0, DateJoined) ,
                                    DATEDIFF(MONTH, 0, DateLeft)
                        ) AS d UNPIVOT ( theMonth FOR theCol
                                         IN ( d.DateJoined, d.DateLeft ) ) AS u
               GROUP BY u.theMonth
               HAVING   SUM(CASE WHEN u.theCol = 'DateJoined'
                                 THEN u.Registrations
                                 ELSE 0
                            END) > 0
             )
    SELECT  TheMonth ,
            PeopleJoined ,
            PeopleLeft ,
            SUM(PeopleJoined - PeopleLeft)
                OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers
    FROM    Totals;
4. From Query to Execution Plan
SELECT  bth.ProductID ,
        bth.TransactionDate ,
        bth.Quantity ,
        bth.ActualCost
FROM    dbo.bigTransactionHistory bth
WHERE   bth.TransactionDate >= '20100701'
        AND bth.TransactionDate <= CURRENT_TIMESTAMP
ORDER BY bth.TransactionDate DESC;
GO
#SQLintheCityUK
5. [Diagram: how the bigTransactionHistory query from the previous slide becomes an execution plan]
Query
  -> Syntax Checking (Query Parsing)
  -> Algebrizer (Query Binding)
  -> Query Optimization
  -> Plan
  -> Query Execution Engine
• Candidate plans considered during optimization: Plan 1, Plan 2, Plan 3, Plan 4, … Plan 57, … Plan n?
• Information available to the optimizer:
  – Metadata: table/index structures
  – Index/Column Statistics: volume/distribution of data
• Plan Cache
6. Getting the Execution Plan
• .NET Code Profiler
  – e.g. ANTS Performance Profiler
• Get plans for previously executed queries from the plan cache
  – e.g. SQL Trace or Extended Events
  – Using sys.dm_exec_cached_plans
• Plans in cache contain no runtime information
8. Getting the Execution Plan (continued)
• During testing, request the plan for a query (using SET options)
  – “estimated” plan: no runtime information
  – “actual” plan: with runtime information
• There is only 1 plan!
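The ways of getting a plan listed on these slides can be sketched in T-SQL. This is a minimal sketch, not the script used in the talk: it assumes a SQL Server version that supports XML plans, VIEW SERVER STATE permission for the DMV query, and the dbo.bigTransactionHistory table from the earlier slides. The TOP (20) limit and the ordering by usecounts are illustrative choices.

```sql
-- "Estimated" plan: the query is compiled but NOT executed.
SET SHOWPLAN_XML ON;
GO
SELECT  bth.ProductID ,
        bth.TransactionDate
FROM    dbo.bigTransactionHistory bth
WHERE   bth.TransactionDate >= '20100701';
GO
SET SHOWPLAN_XML OFF;
GO

-- "Actual" plan: the query IS executed and the plan carries runtime row counts.
SET STATISTICS XML ON;
GO
SELECT  bth.ProductID ,
        bth.TransactionDate
FROM    dbo.bigTransactionHistory bth
WHERE   bth.TransactionDate >= '20100701';
GO
SET STATISTICS XML OFF;
GO

-- Plans for previously executed queries, read from the plan cache.
-- These cached plans contain no runtime information.
SELECT TOP ( 20 )
        cp.usecounts ,
        cp.objtype ,
        st.text AS QueryText ,
        qp.query_plan
FROM    sys.dm_exec_cached_plans cp
        CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
        CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
ORDER BY cp.usecounts DESC;
GO
```

In SSMS, clicking the query_plan value in the result grid opens the graphical plan.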
10. Execution Plans for Developers
• Don’t examine the plan for every query
• Gather stats, focus on critical, frequent, resource-intensive queries
• Sometimes the code logic is just wrong
  – Rip it up and start again!
• Sometimes the basic logic is fine
  – But some subtler problem causes poor execution performance
  – This is where execution plans can help!
11. Problem: Row-by-row strategy
• With SQL Server:
  – We submit SQL describing the data set we want
  – The Optimizer decides how to execute it
• Some developers bring an imperative approach, with line-by-line control, to SQL Server…
32. Solution: set-based approach
– Take the list to the supermarket ;)
– Define a single statement (if possible) to return the required data set
  • Make as few ‘passes’ through the base tables as possible
  • Aggregate early
  • Reduce the working set to as few rows as possible
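To make the contrast concrete, here is a small sketch of the same calculation done row by row and then as a single set-based statement. It is an illustration only: the product ID 1001 is a hypothetical value, and the columns are assumed to match the dbo.bigTransactionHistory table used on the earlier slides.

```sql
DECLARE @ProductID INT = 1001 ,   -- hypothetical product, for illustration
        @Quantity INT ,
        @Total BIGINT = 0;

-- Row-by-row: a cursor fetches each row and adds it to a running variable.
DECLARE qty_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT  bth.Quantity
    FROM    dbo.bigTransactionHistory bth
    WHERE   bth.ProductID = @ProductID;

OPEN qty_cursor;
FETCH NEXT FROM qty_cursor INTO @Quantity;
WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @Total += @Quantity;
        FETCH NEXT FROM qty_cursor INTO @Quantity;
    END;
CLOSE qty_cursor;
DEALLOCATE qty_cursor;

SELECT  @Total AS TotalQuantity;

-- Set-based: one statement describes the result and the optimizer decides
-- how to compute it. (Returns NULL rather than 0 when no rows match.)
SELECT  SUM(CAST(bth.Quantity AS BIGINT)) AS TotalQuantity
FROM    dbo.bigTransactionHistory bth
WHERE   bth.ProductID = @ProductID;
```

The set-based form gives the optimizer one statement to plan as a whole, instead of one fetch per row.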
33. Classic Running Total Problem
A Registrations table containing a list of subscribers, with the dates that the subscribers joined and left
WITH    Totals
          AS ( SELECT   DATEADD(m, u.theMonth, 0) AS TheMonth ,
                        SUM(CASE WHEN u.theCol = 'DateJoined'
                                 THEN u.Registrations
                                 ELSE 0
                            END) AS PeopleJoined ,
                        SUM(CASE WHEN u.theCol = 'DateLeft'
                                 THEN u.Registrations
                                 ELSE 0
                            END) AS PeopleLeft
               FROM     ( SELECT    DATEDIFF(MONTH, 0, DateJoined) AS DateJoined ,
                                    DATEDIFF(MONTH, 0, DateLeft) AS DateLeft ,
                                    COUNT(*) AS Registrations
                          FROM      dbo.Registrations2
                          GROUP BY  DATEDIFF(MONTH, 0, DateJoined) ,
                                    DATEDIFF(MONTH, 0, DateLeft)
                        ) AS d UNPIVOT ( theMonth FOR theCol
                                         IN ( d.DateJoined, d.DateLeft ) ) AS u
               GROUP BY u.theMonth
               HAVING   SUM(CASE WHEN u.theCol = 'DateJoined'
                                 THEN u.Registrations
                                 ELSE 0
                            END) > 0
             )
    SELECT  TheMonth ,
            PeopleJoined ,
            PeopleLeft ,
            SUM(PeopleJoined - PeopleLeft)
                OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers
    FROM    Totals;
34. Execution Plan (for the running total query on the previous slide)
• A single ‘pass’ of the base table
• Early aggregation
• Row counts shown on the plan: 1000000, 1690
35. Problem: Excessive logical reads
• “Read 1000 pages to return 100 rows”
  – 1 page = 1 logical read
  – Goal: return data with as few logical reads as possible
  – Sometimes we force SQL Server to read more pages than necessary (excessive IO)
• Causes:
  – Inefficient data access paths (indexes)
  – Inefficient SQL forcing SQL Server to do excessive work
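One way to see the logical reads for a query is SET STATISTICS IO. The sketch below reuses the dbo.bigTransactionHistory query from the earlier slides. The numbers reported depend entirely on your data and indexes, so none are shown here.

```sql
SET STATISTICS IO ON;
GO
SELECT  bth.ProductID ,
        bth.TransactionDate ,
        bth.Quantity ,
        bth.ActualCost
FROM    dbo.bigTransactionHistory bth
WHERE   bth.TransactionDate >= '20100701'
        AND bth.TransactionDate <= CURRENT_TIMESTAMP
ORDER BY bth.TransactionDate DESC;
GO
SET STATISTICS IO OFF;
GO
-- The Messages tab reports one line per table touched, including its
-- "logical reads" count (pages read from the buffer cache).
```

Comparing the logical reads before and after an index or query change is a simple way to check whether the change reduced IO.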
36. Non-clustered Index
[Diagram: B-tree of a non-clustered index; every page holds Name (Index Key) + LocationID]
Predicate: WHERE l.Name = 'Paint';
Root Node:           A - Z
Intermediate Level:  A - M | N - Z
Leaf Level:          A - E | F - I | J - M | N - R | S - V | W - Z
38. Clustered Index
[Diagram: B-tree of a clustered index; Clustering Key is (ProductID, LocationID)]
Predicate: LocationID = 6;
Root Node:           (ProductID, LocationID) 1 to 12000
Intermediate Level:  1-6000 | 6001-12000
Leaf Level:          Data rows for ProductID 1-2000 | 2001-4000 | 4001-6000 | 6001-8000 | 8001-10000 | 10001-12000
Matching rows, spread across the leaf level: (502,6) (4325,6) (7865,6) (10324,6)
44. Solution: optimize indexes for workload
• For small tables, scanning an index is a very efficient operation
• BUT…for critical and frequently-executed queries:
  – Provide covering non-clustered indexes that the queries can seek
  – Minimize logical reads (IO) required to gather the data
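As an illustration of a covering non-clustered index, the sketch below targets the dbo.bigTransactionHistory date-range query from the earlier slides. The index name is hypothetical, and whether this index is worth creating depends on the rest of the workload.

```sql
-- Hypothetical index name. The key column supports both the range predicate
-- and the ORDER BY on TransactionDate; the INCLUDE columns are the remaining
-- columns in the SELECT list, so the query can be answered from this index
-- alone (it "covers" the query).
CREATE NONCLUSTERED INDEX IX_bigTransactionHistory_TransactionDate
ON dbo.bigTransactionHistory ( TransactionDate )
INCLUDE ( ProductID, Quantity, ActualCost );
GO

-- The query this index is designed for:
SELECT  bth.ProductID ,
        bth.TransactionDate ,
        bth.Quantity ,
        bth.ActualCost
FROM    dbo.bigTransactionHistory bth
WHERE   bth.TransactionDate >= '20100701'
        AND bth.TransactionDate <= CURRENT_TIMESTAMP
ORDER BY bth.TransactionDate DESC;
GO
```

With such an index in place, the plan would typically show an Index Seek on the new index and no separate Sort operator, though the optimizer makes the final choice.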
45. Solution: optimize indexes for workload
– Worst strategy: no indexes
  • Query performance will be terrible
– Second worst: index on every column; same column participating in numerous indexes
  • Data modification performance will be awful
– Don’t do “SELECT *”
  • Makes effective indexing VERY hard!
46. Solution: optimize indexes for workload
• Don’t index query by query
• Goal: a small set of indexes to help the most important and frequent queries
• Analyze the workload as a whole
  – Profiler and Database Engine Tuning Advisor
    • Good start point…
    • Not necessarily good end point
  – Refine recommendations using Missing Index (and other) DMVs
    • What is the write : read ratio (>1?) for the index?
    • How many plans are currently associated with the index?
    • How often are those plans used?
    • How many reads might the missing index have helped? Don’t create it if this number is small…
  – See Performance Tuning with SQL Server Dynamic Management Views (free eBook)
    • http://bit.ly/1jVG3IW
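Two of the questions above can be approached with the index-usage and missing-index DMVs. The queries below are a minimal sketch, not the scripts from the eBook. They assume VIEW SERVER STATE permission, and the counters they read are reset when the instance restarts, so treat the numbers as indicative only.

```sql
-- Write : read activity per index in the current database.
SELECT  OBJECT_NAME(ius.object_id) AS TableName ,
        i.name AS IndexName ,
        ius.user_updates AS Writes ,
        ius.user_seeks + ius.user_scans + ius.user_lookups AS Reads
FROM    sys.dm_db_index_usage_stats ius
        JOIN sys.indexes i ON i.object_id = ius.object_id
                              AND i.index_id = ius.index_id
WHERE   ius.database_id = DB_ID()
ORDER BY Writes DESC;

-- Missing-index suggestions, with how many seeks/scans each might have helped.
SELECT  mid.statement AS TableName ,
        mid.equality_columns ,
        mid.inequality_columns ,
        mid.included_columns ,
        migs.user_seeks ,
        migs.user_scans ,
        migs.avg_user_impact
FROM    sys.dm_db_missing_index_details mid
        JOIN sys.dm_db_missing_index_groups mig
            ON mig.index_handle = mid.index_handle
        JOIN sys.dm_db_missing_index_group_stats migs
            ON migs.group_handle = mig.index_group_handle
WHERE   mid.database_id = DB_ID()
ORDER BY migs.user_seeks + migs.user_scans DESC;
```

An index with many writes and few reads is a candidate for review; a suggestion with only a handful of seeks is probably not worth creating.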
47. Problem: Non-SARGable Predicates
“The explicit data type conversion has rendered the predicate non-SARGable!”
48. Beware of SARGability
• Non-SARGable means that the optimizer can’t use the expression in a seek operation
• Cause: use of a function directly on a column in WHERE or JOIN
  – Various incarnations
  – All lead to excessive IO (scans when seeks should be viable)
49. SARGable predicate
…WHERE FirstName = 'bob';    (SARGable)
FirstName   LastName
Aaron       Bertrand
Aaron       Darrenovsky
Amber       Smith
Apple       Paltrow
Bob         Duffy
Bob         Carolgees
Chris       Christofferson
…etc...
50. Non-SARGable predicate
…WHERE REVERSE(FirstName) = 'bob';
FirstName   LastName
Aaron       Bertrand
Aaron       Darrenovsky
Amber       Smith
Apple       Paltrow
Bob         Duffy
Bob         Carolgees
Chris       Christofferson
…etc...
REVERSE('Aaron') = 'bob' ?
52. Non-SARGable predicate
…WHERE REVERSE(FirstName) = 'bob';
(same FirstName / LastName rows as on slide 50)
REVERSE('Amber') = 'bob' ?
Non-SARGable
54. Other common explicit conversions
-- Using functions like LTRIM, RTRIM etc.
-- often happens where people don't trust
-- consistency of data inputs
SELECT  bp.ProductID ,
        bp.Name ,
        bp.ProductNumber
FROM    dbo.bigProduct bp
WHERE   LTRIM(bp.ProductNumber) LIKE 'AR%';
GO
-- A leading wildcard also prevents an index seek
SELECT  bp.ProductID ,
        bp.Name ,
        bp.ProductNumber
FROM    dbo.bigProduct bp
WHERE   bp.ProductNumber LIKE N'%1000';
GO
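A common rewrite is to leave the column bare and move the work to the other side of the comparison. The sketch below is an illustration, not an example from the talk: the YEAR() predicate is a hypothetical query against the dbo.bigTransactionHistory table used earlier.

```sql
-- Non-SARGable: the function wraps the column, so SQL Server has to
-- evaluate YEAR() for every row before it can test the predicate.
SELECT  bth.ProductID ,
        bth.TransactionDate
FROM    dbo.bigTransactionHistory bth
WHERE   YEAR(bth.TransactionDate) = 2010;

-- SARGable rewrite: the same filter expressed as a range on the bare
-- column, which an index on TransactionDate can seek.
SELECT  bth.ProductID ,
        bth.TransactionDate
FROM    dbo.bigTransactionHistory bth
WHERE   bth.TransactionDate >= '20100101'
        AND bth.TransactionDate < '20110101';

-- A prefix search (no leading wildcard, no function on the column)
-- is also SARGable.
SELECT  bp.ProductID ,
        bp.Name ,
        bp.ProductNumber
FROM    dbo.bigProduct bp
WHERE   bp.ProductNumber LIKE N'AR%';
```

For the LTRIM case, the usual approach is to clean the data as it is written, so that queries do not need the function at read time.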
55. Other causes of non-SARGability
• Implicit data type conversions
  – Variable data type doesn’t match column type
  – SQL Server uses CONVERT_IMPLICIT
  – Problem if the column is the lower precedence type
    • Data type precedence order: http://bit.ly/1ENOwjZ
• Misuse of User Defined Functions (UDFs)
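The implicit conversion case can be reproduced with a small temporary table. Everything in this sketch (the #Customer table, its columns, the index name and the sample values) is hypothetical and exists only for illustration. With so few rows the optimizer may scan in both cases; the thing to look for in the plan is the CONVERT_IMPLICIT applied to the column in the first query. Whether that conversion prevents a seek on a real table also depends on the column's collation.

```sql
-- Hypothetical table: AccountNumber is VARCHAR, a lower precedence type
-- than NVARCHAR.
CREATE TABLE #Customer
    (
      CustomerID INT IDENTITY PRIMARY KEY ,
      AccountNumber VARCHAR(20) NOT NULL
    );
CREATE NONCLUSTERED INDEX IX_Customer_AccountNumber
ON #Customer ( AccountNumber );
INSERT  INTO #Customer ( AccountNumber )
VALUES  ( 'AW0001' ), ( 'AW0002' ), ( 'AW0003' );

-- Mismatch: the NVARCHAR variable forces SQL Server to convert the
-- VARCHAR column (CONVERT_IMPLICIT on AccountNumber in the plan).
DECLARE @AccountN NVARCHAR(20) = N'AW0002';
SELECT  CustomerID
FROM    #Customer
WHERE   AccountNumber = @AccountN;

-- Match: the variable has the column's type, so no conversion is needed.
DECLARE @Account VARCHAR(20) = 'AW0002';
SELECT  CustomerID
FROM    #Customer
WHERE   AccountNumber = @Account;

DROP TABLE #Customer;
```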
56. Solutions to Non-SARGability
• Rewrite the query to avoid direct use of a function on a column
• Use helper functions
• Avoid data type mismatches
• Be careful when using UDFs!
  – Convert scalar UDFs to inline TVFs
  – http://bit.ly/1oN9ysK
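As a sketch of the scalar-UDF-to-inline-TVF conversion, consider the dbo.LineItemTotal(ProductID) call from slide 3. The talk does not show that function's body, so the version below is an assumption: it supposes the function sums LineTotal from the AdventureWorks Sales.SalesOrderDetail table. The name dbo.LineItemTotalInline is hypothetical.

```sql
-- ASSUMPTION: the logic behind dbo.LineItemTotal is a per-product SUM of
-- Sales.SalesOrderDetail.LineTotal. Hypothetical inline TVF equivalent:
CREATE FUNCTION dbo.LineItemTotalInline ( @ProductID INT )
RETURNS TABLE
AS
RETURN
    ( SELECT    SUM(sod.LineTotal) AS SumTotal
      FROM      Sales.SalesOrderDetail sod
      WHERE     sod.ProductID = @ProductID
    );
GO

-- The inline TVF is expanded into the calling query, so the optimizer can
-- plan the whole statement as a set-based operation instead of invoking a
-- scalar function once per row.
SELECT  p.ProductID ,
        p.ProductNumber ,
        lit.SumTotal
FROM    Production.Product p
        CROSS APPLY dbo.LineItemTotalInline(p.ProductID) lit;
GO
```

Because the aggregate has no GROUP BY, the function returns exactly one row per call, so products with no order lines still appear, with a NULL SumTotal.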
57. Problem: Overly Generic SQL
• “One size fits all” queries
• Lead to wildly inaccurate estimations and inappropriate execution plans
59. Solutions to Generic SQL
• Write SQL and stored procedures for a specific, defined purpose
• Various ‘workarounds’, though none without drawbacks
  – Recompile on each execution
  – Dynamic SQL
  – See Gail Shaw’s “How to Confuse the Query Optimizer” (http://bit.ly/1Mb8Rnp)
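The two workarounds can be sketched against a typical “catch-all” search. Both procedure names and the parameter types are hypothetical, and the column types of dbo.bigProduct are assumed to match AdventureWorks. Each approach has the drawbacks the slide alludes to: recompiling costs CPU on every call, and dynamic SQL needs care with permissions and maintenance.

```sql
-- Workaround 1: keep the generic predicate, but recompile on each execution
-- so the plan is built for the parameter values actually supplied.
CREATE PROCEDURE dbo.SearchProducts
    @Name NVARCHAR(80) = NULL ,
    @ProductNumber NVARCHAR(25) = NULL
AS
BEGIN
    SELECT  bp.ProductID ,
            bp.Name ,
            bp.ProductNumber
    FROM    dbo.bigProduct bp
    WHERE   ( @Name IS NULL OR bp.Name = @Name )
            AND ( @ProductNumber IS NULL
                  OR bp.ProductNumber = @ProductNumber )
    OPTION  ( RECOMPILE );
END;
GO

-- Workaround 2: build only the predicates that are needed, still fully
-- parameterized, so each search shape gets its own cached plan.
CREATE PROCEDURE dbo.SearchProductsDynamic
    @Name NVARCHAR(80) = NULL ,
    @ProductNumber NVARCHAR(25) = NULL
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) = N'SELECT bp.ProductID, bp.Name, bp.ProductNumber
FROM dbo.bigProduct bp
WHERE 1 = 1';

    IF @Name IS NOT NULL
        SET @sql += N' AND bp.Name = @Name';
    IF @ProductNumber IS NOT NULL
        SET @sql += N' AND bp.ProductNumber = @ProductNumber';

    EXEC sys.sp_executesql @sql,
        N'@Name NVARCHAR(80), @ProductNumber NVARCHAR(25)',
        @Name = @Name, @ProductNumber = @ProductNumber;
END;
GO
```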
60. Other issues that cause poor estimations
• Stale statistics
• Problems with parameter sniffing
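For the stale statistics case, a quick check is to see when each statistics object on a table was last updated. The sketch below uses the dbo.bigTransactionHistory table from the earlier slides. The FULLSCAN option is an illustrative choice and can be expensive on large tables.

```sql
-- When was each statistics object on the table last updated?
SELECT  s.name AS StatisticsName ,
        STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM    sys.stats s
WHERE   s.object_id = OBJECT_ID(N'dbo.bigTransactionHistory');

-- Refresh the statistics by reading every row (the default is a sample).
UPDATE STATISTICS dbo.bigTransactionHistory WITH FULLSCAN;
```

A large gap between estimated and actual row counts in an actual plan is often the first hint that statistics need attention.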
62. Conclusions
• Plans tell us exactly how SQL Server executed your query
  – Focus on critical and frequently-executed queries
  – Use plans to:
    • Spot common mistakes in the code
    • Uncover problems with inefficient indexing
  – Work with rather than against SQL Server
63. Thank you to…
• Grant Fritchey - @GFritchey
• Hugo Kornelis - @Hugo_Kornelis
• Gail Shaw - @SQLintheWild
• Rodney Landrum - @SQLBeat
• Phil Factor - @Phil_Factor