Analysing Performance of Algorithmic SQL and PL/SQL
Brendan Furey, September 2022
A Programmer Writes… (Brendan's Blog)
Ireland Oracle User Group, September 5-6, 2022
Brendan Furey, 2022 Analysing Performance of Algorithmic SQL and PL/SQL 1
whoami
Freelance Oracle developer and blogger
Keen interest in programming concepts
Started career as a Fortran programmer at British Gas
Dublin-based Europhile
30 years Oracle experience, currently working in Finance
Agenda
• Algorithms and SQL (9 slides)
  • On algorithms at different levels in SQL and PL/SQL
• Network Analysis Problems (4 slides)
  • On shortest path and subnetwork grouping problems
• Network Paths by SQL (7 slides)
  • Solving all-paths and shortest-path problems via pure SQL
• Two Algorithms with Code Timing (7 slides)
  • Two PL/SQL network analysis algorithms with code timing and performance analysis
• Oracle Standard Profilers (2 slides)
  • Results from two standard Oracle profiling tools for the Subnetwork Grouper procedure
• Tuning 1 - SQL for Isolated Nodes (5 slides)
  • Recap of join methods and types, then queries with antijoin structures and hints
• Tuning 2 - SQL for Isolated Links (8 slides)
  • Disastrous ‘Bitmap Or’ expansion, good and bad antijoin plans, and an efficient group counting query
• Tuning 3 - SQL for Root Node Selector (4 slides)
  • Code timing several methods for root node selection
• Tuning – Results (2 slides)
  • Code timing results for one dataset, and before-and-after results for Subnetwork Grouper for all datasets
• Conclusion (1 slide)
  • A few recommendations split between SQL and PL/SQL
Algorithms and SQL
Algorithms and SQL (9 slides)
On algorithms at different levels in SQL and PL/SQL
The Algorithm (extracts from Computer Hope web page)
Algorithm - Computer Hope
• Derived from the name of the mathematician Muhammed ibn-Musa Al-Khowarizmi, an algorithm is a solution to a problem that meets the following criteria.
  • A list of instructions, procedures, or formula that solves a problem
  • Can be proven
  • Something that always finishes and works
When was the first algorithm?
• Because a cooking recipe could be considered an algorithm, the first algorithm could go back as far as written language
• However, many find Euclid's algorithm for finding the greatest common divisor to be the first algorithm. This algorithm was first described in 300 B.C.
• Ada Lovelace is credited as being the first computer programmer and the first person to develop an algorithm for a machine
Algorithms and SQL 1 - Built-In Algorithms and Subquery Sequence
Declarative Language (paraphrase from Britannica.com)
• Declarative languages are programming languages in which a program specifies what is to be done rather than how to do it
SQL as a declarative language?
• SQL is often described as a declarative (or non-procedural) language
• But it’s a bit more complicated than that, especially when performance is important…
Built-In Algorithms
• Oracle provides built-in algorithms for joining tables and other rowsets, and for grouping
• Oracle provides additional specific built-in algorithms for processing an input rowset…
  • Analytics allows aggregation over partition key within rowset windows
  • Match Recognize allows patterns to be reported across rows
• These algorithms are configured declaratively within an SQL subquery
• Also, we have more general algorithms
  • Recursive subquery factoring allows for recursive algorithms
  • Model clause allows for iteration over cells within a spreadsheet-like array
Subquery Sequence
• Build queries in a sequence of subquery steps
Algorithms and SQL 2 - Joins and Grouping
SELECT d.department_name, Avg(e.salary) avg_sal
FROM departments d
JOIN employees e
ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY d.department_name
Simple Query with Joins and Grouping
• A simple query joins data sources, and may group by a key…
  • …with aggregate functions on non-key columns
• Oracle's cost-based optimizer (CBO) has multiple algorithms for joining and for aggregation
  • Hash Join – using full table scans for larger data sets
  • Nested Loops – using indexes for smaller data sets
• CBO chooses the algorithm based on table statistics
• We can override with hints:
  • USE_HASH(e)
  • USE_NL(e)
DEPARTMENT_NAME AVG_SAL
---------------- -------
Accounting 10,154
Administration 4,400
Executive 19,333
Finance 8,601
Human Resources 6,500
IT 5,760
Marketing 9,500
Public Relations 10,000
Purchasing 4,150
Sales 8,956
Shipping 3,476
Example: Average salary grouped by department
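The two join methods named on this slide can be contrasted in a few lines of Python. This is only a minimal sketch of the general techniques, not of Oracle's implementation: the rowsets, the individual Marketing salaries and all function names are hypothetical, and a real nested loops join would normally probe an index on the inner rowset rather than scan it.

```python
from collections import defaultdict

# Hypothetical rowsets: (department_id, department_name) and (department_id, salary)
DEPARTMENTS = [(10, "Administration"), (20, "Marketing")]
EMPLOYEES = [(10, 4400), (20, 13000), (20, 6000)]


def nested_loops_join(outer, inner):
    """For each outer row, look up the matching inner rows (here by a simple scan)."""
    return [(name, salary)
            for dep_id, name in outer
            for emp_dep_id, salary in inner
            if emp_dep_id == dep_id]


def hash_join(build, probe):
    """Build a hash table on one rowset once, then probe it with each row of the other."""
    table = defaultdict(list)
    for dep_id, name in build:
        table[dep_id].append(name)
    return [(name, salary)
            for emp_dep_id, salary in probe
            for name in table.get(emp_dep_id, [])]


def avg_by_key(rows):
    """Group by the first field and average the second, as in GROUP BY with Avg()."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum, count]
    for key, value in rows:
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}
```

Both join functions return the same rows; they differ only in how much work is done per row, which is the trade-off the optimizer weighs when it chooses between them.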
Algorithms and SQL 3 - Analytics
• Analytics allows aggregation over partition key within rowset windows
WITH rowset AS (
SELECT d.department_name, e.hire_date, e.last_name, e.salary
FROM departments d
JOIN employees e ON e.department_id = d.department_id
)
SELECT department_name, hire_date, last_name, salary,
Sum(salary) OVER (PARTITION BY department_name
ORDER BY hire_date) rsum_sal,
salary - Lag(salary) OVER (PARTITION BY department_name
ORDER BY hire_date) sal_incr
FROM rowset
ORDER BY department_name, hire_date
DEPARTMENT_NAME HIRE_DATE LAST_NAME SALARY RSUM_SAL SAL_INCR
---------------- --------- ------------ ------- -------- --------
Accounting 07-JUN-02 Gietz 8,300 20,308
Accounting 07-JUN-02 Higgins 12,008 20,308 3,708
Administration 17-SEP-03 Whalen 4,400 4,400
.
Sales 04-JAN-08 Johnson 6,200 261,300 -800
Sales 24-JAN-08 Marvins 7,200 268,500 1,000
Sales 29-JAN-08 Zlotkey 10,500 279,000 3,300
.
Example: Running sum of salaries and salary increase by department
• Can have multiple independent expressions
• Aggregate functions on fields (or expressions) apply over the partition
• Row set is unaltered, and does not have to be a separate subquery
• Range specifies a window based on the Order By expression
  • The range is often defaulted; in the example the default is Range Between Unbounded Preceding And Current Row, which is why the two Accounting rows with the same hire date share a running sum
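The effect of the defaulted range window on tied rows can be reproduced outside the database. The sketch below uses the two Accounting rows from the example output (dates rewritten in ISO format so that string comparison orders them); the function names are hypothetical, and a single partition already sorted by hire date is assumed.

```python
# Accounting rows from the example output: (last_name, hire_date, salary),
# already ordered by hire_date within a single partition
ROWS = [("Gietz", "2002-06-07", 8300), ("Higgins", "2002-06-07", 12008)]


def running_sum_range(rows):
    """Default RANGE window: sum every row whose order key is <= the current row's key."""
    return [sum(sal for _, d, sal in rows if d <= hire_date)
            for _, hire_date, _ in rows]


def running_sum_rows(rows):
    """ROWS window: sum strictly by position, so tied rows get different totals."""
    totals, total = [], 0
    for _, _, sal in rows:
        total += sal
        totals.append(total)
    return totals


def salary_increase(rows):
    """salary - Lag(salary): None for the first row of the partition."""
    return [None if i == 0 else rows[i][2] - rows[i - 1][2]
            for i in range(len(rows))]
```

With these rows the range-based sum gives 20,308 for both employees, as in the output above, whereas a rows-based window would give 8,300 and then 20,308.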
Algorithms and SQL 4 - Pattern Matching
• Match Recognize allows patterns to be reported across rows
WITH rowset AS (
SELECT dep.department_name, emp.hire_date, emp.last_name, emp.salary
FROM departments dep
JOIN employees emp ON emp.department_id = dep.department_id)
SELECT * FROM rowset
MATCH_RECOGNIZE (
PARTITION BY department_name
ORDER BY hire_date
MEASURES last_name AS last_name, salary AS salary
ONE ROW PER MATCH AFTER MATCH SKIP TO NEXT ROW
PATTERN ( up{2} )
DEFINE up AS up.salary > PREV(up.salary))
DEPARTMENT_NAME LAST_NAME SALARY
---------------- ------------ -------
Sales Bloom 10,000
Sales Zlotkey 10,500
Shipping OConnell 2,600
Shipping Mourgos 5,800
Shipping Grant 2,600
Shipping Geoni 2,800
Example: Two consecutive salary increases
• The Partition By allows for independent patterns across keys
• Order By defines row sequence
• Measures specifies fields (or expressions) to output
• One Row Per Match and After Match Skip specify behaviour in relation to matches
• Pattern expresses sequences of values across rows
  • Using a regex-like syntax
  • Referencing variables from the Define section
• In the example, up{2} matches 2 adjacent instances of a salary increase
Algorithms and SQL 5 - Recursive Subqueries
• Recursive subquery has an anchor branch in union with…
  • …a recursive branch that reads from the subquery itself
• Partitioning via the join condition (here on department_name)
DEPARTMENT_NAME LAST_NAME MULT R_PROD
--------------- --------- ------ --------
Accounting Gietz 1.83 1.83
Accounting Higgins 2.2008 4.027464
Administration Whalen 1.44 1.44
Executive De Haan 2.7 2.7
Executive King 3.4 9.18
Executive Kochhar 2.7 24.786
.
Example: Running Products
WITH multipliers AS (
SELECT d.department_name, e.last_name, (1 + e.salary/10000) mult,
Row_Number() OVER (PARTITION BY d.department_name
ORDER BY e.last_name) rn
FROM departments d
JOIN employees e ON e.department_id = d.department_id
), rsf (department_name, last_name, rn, mult, running_prod) AS (
SELECT department_name, last_name, rn, mult, mult running_prod
FROM multipliers
WHERE rn = 1
UNION ALL
SELECT m.department_name, m.last_name, m.rn, m.mult,
r.running_prod * m.mult
FROM rsf r
JOIN multipliers m ON m.rn = r.rn + 1
AND m.department_name = r.department_name)
SELECT department_name, last_name, mult, running_prod FROM rsf
ORDER BY department_name, last_name
• Performs well for hierarchies, less well for looped structures (as we’ll see later)
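The anchor-plus-recursive-branch evaluation can be mimicked step by step in Python. This is a minimal sketch for one partition only, using the Executive multipliers from the example output; the function name and the dictionary layout are hypothetical, and Decimal is used simply to avoid floating-point noise in the products.

```python
from decimal import Decimal

# Executive department multipliers from the example output, keyed by rn
# (row number by last_name): De Haan, King, Kochhar
MULTIPLIERS = {1: Decimal("2.7"), 2: Decimal("3.4"), 3: Decimal("2.7")}


def running_products(multipliers):
    """Return {rn: running product} using an anchor step and a repeated recursive step."""
    results = {1: multipliers[1]}   # anchor branch: WHERE rn = 1
    latest = [1]                    # rows produced by the previous iteration
    while latest:
        produced = []
        for rn in latest:           # recursive branch: join on m.rn = r.rn + 1
            if rn + 1 in multipliers:
                results[rn + 1] = results[rn] * multipliers[rn + 1]
                produced.append(rn + 1)
        latest = produced           # recursion stops when an iteration produces no rows
    return results
```

Each pass of the loop plays the role of one iteration of the recursive branch, reading only the rows produced by the previous pass.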
Algorithms and SQL 6 - Model Clause
Example: Running Products
WITH multipliers AS (
SELECT d.department_name, e.last_name, (1 + e.salary/10000) mult
FROM departments d
JOIN employees e ON e.department_id = d.department_id
)
SELECT department_name, last_name, mult, running_prod
FROM multipliers
MODEL
PARTITION BY (department_name)
DIMENSION BY (Row_Number() OVER (PARTITION BY department_name
ORDER BY last_name) rn)
MEASURES (last_name, mult, mult running_prod)
RULES (running_prod[rn > 1] = mult[CV()] * running_prod[CV() - 1])
ORDER BY department_name, last_name
• Model clause does not have the best reputation for performance
• Rarely seen in the wild…
• Model clause reads records from a rowset, then allows…
  • …rules to reference the rows and columns as array cells
• Partition By allows for independent patterns across keys
• Dimension By defines the indexing over rows, and can use analytic functions
• Measures specifies fields (or expressions) to output
• Rules may update or insert rows, and optionally iterate
• Order By defines output order
Algorithms and SQL 7 - Subquery Sequence
• Subqueries can reference not only tables and views, but…
  • Previous subqueries
  • Database functions, returning scalar or tabular outputs
• This allows us to build queries in a sequence of subquery steps
• This can be seen as a higher level algorithm in itself…
  • …specifying procedurally rather than declaratively at a higher level: the how, not just the what
• But CBO can override and rewrite the structure
Subqueries and Performance
• CBO’s query transformation can improve performance or worsen it
• Hints can often improve performance here, such as
  • Materialize – evaluate the subquery and save the resulting rowset
  • No_Query_Transformation – don’t transform the query
• It sometimes helps to split a complex query that CBO is transforming badly, e.g.
  • Insert subquery output into a temporary table
  • Put subquery into a pipelined function
• We can also manually transform, e.g. change Not Exists into explicit antijoins, as we’ll see later
Algorithms and SQL 8 – General Principles
• Process in batches, or sets, where possible
  • A process often has a startup cost plus a cost per row, so spread the startup cost over many rows
  • Also, the most efficient algorithm for processing a set of rows may differ from that for 1 row
• Avoid cursor loops when the rowset can be processed in a single query
• Prune early: where possible, avoid continued processing of rows that will later be eliminated
SQL Algorithms
• Use SQL algorithms that meet a specific requirement, within pure SQL
  • Join and group rowsets
  • Analytic functions for aggregation over a partition key within a rowset window
  • Match Recognize for pattern matching across rows
  • Recursive subqueries for traversing hierarchies
PL/SQL Algorithms
• Use where there is no efficient pure SQL algorithm, as in some network analysis problems
  • But ensure SQL is used effectively within the PL/SQL algorithm
• Can also be used to break a complex query into smaller sections via a pipelined function or temporary table
  • Only do this when CBO performs badly
Network Analysis Problems
Network Analysis Problems (4 slides)
On shortest path and subnetwork grouping problems
3 Subnetworks – Demo Network
Network Analysis Problems
• Undirected network
• Find all paths from root
• Find shortest paths from root
• Group all nodes by subnetwork
All Paths from S1-N0-1
Node Path Length
--------- ------------ ------
S1-N0-1 1 0
S1-N1-1 ..2 1
S1-N2-1 ....7 2
S1-N3-1 ......10 3
S1-N1-2 ..3 1
S1-N1-3 ....4 2
S1-N2-2 ....8 2
S1-N1-3 ..4 1
S1-N1-2 ....3 2
S1-N2-2 ......8 3
S1-N1-4 ..5 1
S1-N2-3 ....9 2
S1-N1-5 ......6 3
S1-N3-2 ......11 3
S1-N1-5 ..6 1
S1-N2-3 ....9 2
S1-N1-4 ......5 3
S1-N3-2 ......11 3
Shortest Paths from S1-N0-1
NODE_NAME NODE LEV
----------- ---------- ----
S1-N0-1 1 0
S1-N1-1 ..2 1
S1-N2-1 ....7 2
S1-N3-1 ......10 3
S1-N1-2 ..3 1
S1-N2-2 ....8 2
S1-N1-3 ..4 1
S1-N1-4 ..5 1
S1-N2-3 ....9 2
S1-N3-2 ......11 3
S1-N1-5 ..6 1
• Shortest path networks form trees
Subnetwork Grouper
Network Paths by SQL
Network Paths by SQL (7 slides)
Solving all- and shortest- path problems via pure SQL
SQL for All Paths
Get Execution Plan using Marker
WITH paths (node_id, lev) AS (
SELECT &root_id_var, 0
FROM DUAL
UNION ALL
SELECT CASE WHEN lnk.node_id_fr = pth.node_id THEN lnk.node_id_to ELSE lnk.node_id_fr END,
pth.lev + 1
FROM paths pth
JOIN links lnk
ON (lnk.node_id_fr = pth.node_id OR lnk.node_id_to = pth.node_id)
) SEARCH DEPTH FIRST BY node_id SET line_no
CYCLE node_id SET cycle TO '*' DEFAULT ' '
SELECT /*+ gather_plan_statistics XPLAN_ALL_PATHS */
n.node_name,
Substr(LPad ('.', 1 + 2 * p.lev, '.') || p.node_id, 2) node,
p.lev
FROM paths p
JOIN nodes n
ON n.id = p.node_id
WHERE cycle = ' '
ORDER BY p.line_no
• Recursive subquery
• CYCLE clause on node_id
• Hint gather_plan_statistics with marker string
• Exclude cycle rows from output
EXEC Utils.W(Utils.Get_XPlan(p_sql_marker => 'XPLAN_ALL_PATHS'));
• For tree networks each node has only one path from the root, and the SQL is efficient
• Also efficient for small looped networks
• For larger looped networks, finding all paths is resource-intensive
  • This also applies to non-pure-SQL methods: the problem is intrinsically hard
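A depth-first enumeration with a cycle check shows why the number of paths grows so quickly once loops are present. The sketch below is a plain Python illustration, not the SQL above: the link list is an assumption, inferred from the all-paths listing for root node 1 in the demo network, and the function names are hypothetical.

```python
# Links of the 11-node demo subnetwork, inferred from the all-paths listing (an assumption)
LINKS = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 7), (7, 10),
         (3, 4), (3, 8), (5, 9), (6, 9), (9, 11)]


def adjacency(links):
    """Undirected network: each link is usable in both directions."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def all_paths(root, links):
    """Return (node, level) for every loop-free path from root, depth first by node id."""
    adj = adjacency(links)
    found = []

    def visit(path):
        node = path[-1]
        found.append((node, len(path) - 1))
        for nxt in sorted(adj.get(node, [])):
            if nxt not in path:          # cycle check: skip nodes already on this path
                visit(path + [nxt])

    visit([root])
    return found
```

For this small network the call `all_paths(1, LINKS)` returns 18 rows for 11 nodes, because nodes inside a loop are reached by more than one path.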
SQL for Shortest Paths - One Recursive Subquery
WITH paths (node_id, rnk, lev) AS (
SELECT &root_id_var, 1, 0
FROM DUAL
UNION ALL
SELECT CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to
ELSE l.node_id_fr END,
Rank () OVER (PARTITION BY CASE WHEN l.node_id_fr = p.node_id
THEN l.node_id_to
ELSE l.node_id_fr END
ORDER BY p.node_id),
p.lev + 1
FROM paths p
JOIN links l
ON p.node_id IN (l.node_id_fr, l.node_id_to)
WHERE p.rnk = 1
) SEARCH DEPTH FIRST BY node_id SET line_no
CYCLE node_id SET lp TO '*' DEFAULT ' '
, node_min_levs AS (
SELECT node_id,
Min (lev) KEEP (DENSE_RANK FIRST ORDER BY lev) lev,
Min (line_no) KEEP (DENSE_RANK FIRST ORDER BY lev) line_no
FROM paths
GROUP BY node_id
)
SELECT n.node_name,
Substr(LPad ('.', 1 + 2 * m.lev, '.') || m.node_id, 2) node,
m.lev lev
FROM node_min_levs m
JOIN nodes n
ON n.id = m.node_id
ORDER BY m.line_no
• Extra field, rnk = rank of record for a given node at each iteration, based on the prior node id
• At each iteration only the record of rank 1 is joined to new links, avoiding duplication
• Subquery, node_min_levs, selects the preferred record of minimum length for each node
SQL for Shortest Paths - One Recursive Subquery - Performance
-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 161 |00:00:07.90 | 5032K| | |
| 1 | SORT ORDER BY | | 1 | 381G| 161 |00:00:07.90 | 5032K| 18432 | 18432 |
|* 2 | HASH JOIN | | 1 | 381G| 161 |00:00:07.90 | 5032K| 1449K| 1449K|
| 3 | TABLE ACCESS FULL | NODES | 1 | 161 | 161 |00:00:00.01 | 7 | | |
| 4 | VIEW | | 1 | 381G| 161 |00:00:07.90 | 5032K| | |
| 5 | SORT GROUP BY | | 1 | 381G| 161 |00:00:07.90 | 5032K| 31744 | 31744 |
| 6 | VIEW | | 1 | 381G| 220K|00:00:07.81 | 5032K| | |
| 7 | UNION ALL (RECURSIVE WITH) DEPTH FIRST| | 1 | | 220K|00:00:07.77 | 5032K| 19M| 1646K|
| 8 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | |
| 9 | WINDOW SORT | | 79 | 381G| 220K|00:00:00.72 | 57440 | 478K| 448K|
| 10 | NESTED LOOPS | | 79 | 381G| 220K|00:00:00.52 | 57440 | | |
| 11 | RECURSIVE WITH PUMP | | 79 | | 3590 |00:00:00.01 | 0 | | |
|* 12 | TABLE ACCESS FULL | LINKS | 3590 | 45 | 220K|00:00:00.70 | 57440 | | |
-------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N"."ID"="M"."NODE_ID")
12 - filter(("P"."NODE_ID"="L"."NODE_ID_FR" OR "P"."NODE_ID"="L"."NODE_ID_TO"))
Execution Plan (Extract) – Bacon/small (161 node / 3,342 link network)
• SQL solution can obtain the shortest paths efficiently for tree and smaller looped networks
• In larger looped networks the number of paths overall can become extremely large
• Recursive subquery discards all but one path to a given node at a given iteration…
  • But has no access to other paths reached at earlier iterations
  • And so may persist with longer paths that will be discarded in the later ranking subquery
• One approach to mitigating this is to do a truncated search to obtain some bounds for a later query…
SQL for Shortest Paths – Two Recursive Subqueries, Part 1
WITH paths_truncated (node_id, lev, rn) AS (
SELECT &root_id_var, 0, 1
FROM DUAL
UNION ALL
SELECT CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to
ELSE l.node_id_fr END,
p.lev + 1,
Row_Number () OVER (PARTITION BY CASE WHEN l.node_id_fr = p.node_id
THEN l.node_id_to
ELSE l.node_id_fr END
ORDER BY p.node_id)
FROM paths_truncated p
JOIN links l
ON p.node_id IN (l.node_id_fr, l.node_id_to)
WHERE p.rn = 1
AND p.lev < &LEVMAX)
CYCLE node_id SET lp TO '*' DEFAULT ' '
, approx_best_paths AS (
SELECT node_id,
Max (lev) KEEP (DENSE_RANK FIRST ORDER BY lev) lev
FROM paths_truncated
GROUP BY node_id)
paths_truncated (recursive subquery)
• Same subquery as in 1-recursion, except…
• Truncate recursion at iteration &LEVMAX (I tried 5 and 10)
approx_best_paths
• Gets minimum lev by node_id from paths_truncated
…
• Any paths to node_id that are longer in the second recursion than found here can be discarded
SQL for Shortest Paths – Two Recursive Subqueries, Part 2
, paths (node_id, lev, rn) AS (
SELECT &root_id_var, 0, 1
FROM DUAL
UNION ALL
SELECT CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to
ELSE l.node_id_fr END,
p.lev + 1,
Row_Number () OVER (PARTITION BY CASE WHEN l.node_id_fr = p.node_id
THEN l.node_id_to
ELSE l.node_id_fr END
ORDER BY p.node_id)
FROM paths p
JOIN links l
ON p.node_id IN (l.node_id_fr, l.node_id_to)
LEFT JOIN approx_best_paths b
ON b.node_id = CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to
ELSE l.node_id_fr END
WHERE p.rn = 1
AND p.lev < Nvl (b.lev, 1000000)
) SEARCH DEPTH FIRST BY node_id SET line_no CYCLE node_id SET lp TO '*' DEFAULT ' '
, node_min_levs AS (
SELECT node_id,
Min (lev) KEEP (DENSE_RANK FIRST ORDER BY lev) lev,
Min (line_no) KEEP (DENSE_RANK FIRST ORDER BY lev) line_no
FROM paths GROUP BY node_id)
SELECT n.node_name,
Substr(LPad ('.', 1 + 2 * m.lev, '.') || m.node_id, 2) node,
m.lev lev
FROM node_min_levs m
JOIN nodes n
ON n.id = m.node_id
ORDER BY m.line_no
paths (recursive subquery)
• Same subquery as in 1-recursion, except…
• Outer-join approx_best_paths
…
• Discard path if longer than found in previous subquery
node_min_levs, main section
• node_min_levs gets the minimum lev by node_id, along with the line_no to order by in the main section
SQL for Shortest Paths - Two Recursive Subqueries - Performance
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 11803 |00:12:55.42 | 130M|
| 1 | SORT ORDER BY | | 1 | 18E| 11803 |00:12:55.42 | 130M|
|* 2 | HASH JOIN | | 1 | 18E| 11803 |00:12:55.41 | 130M|
| 3 | TABLE ACCESS FULL | NODES | 1 | 12466 | 12466 |00:00:00.01 | 46 |
| 4 | VIEW | | 1 | 18E| 11803 |00:12:55.40 | 130M|
| 5 | SORT GROUP BY | | 1 | 18E| 11803 |00:12:55.40 | 130M|
| 6 | VIEW | | 1 | 18E| 672K|00:12:54.66 | 130M|
| 7 | UNION ALL (RECURSIVE WITH) DEPTH FIRST | | 1 | | 672K|00:12:54.53 | 130M|
| 8 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 |
| 9 | WINDOW SORT | | 128 | 18E| 672K|00:11:57.72 | 65M|
|* 10 | FILTER | | 128 | | 672K|00:11:57.25 | 65M|
| 11 | MERGE JOIN OUTER | | 128 | 18E| 1821K|00:11:57.02 | 65M|
| 12 | SORT JOIN | | 128 | 18E| 1821K|00:06:38.56 | 24M|
| 13 | NESTED LOOPS | | 128 | 18E| 1821K|00:06:39.34 | 24M|
| 14 | RECURSIVE WITH PUMP | | 128 | | 21483 |00:00:00.11 | 1 |
|* 15 | TABLE ACCESS FULL | LINKS | 21483 | 95 | 1821K|00:07:40.14 | 24M|
|* 16 | SORT JOIN (REUSE) | | 1821K| 70T| 1203K|00:05:17.98 | 41M|
| 17 | VIEW | | 1 | 70T| 11625 |00:05:17.18 | 41M|
| 18 | SORT GROUP BY | | 1 | 70T| 11625 |00:05:17.18 | 41M|
| 19 | VIEW | | 1 | 70T| 1666K|00:00:25.89 | 41M|
| 20 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | 1 | | 1666K|00:00:25.48 | 41M|
| 21 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 |
| 22 | WINDOW SORT | | 6 | 70T| 1666K|00:04:53.82 | 18M|
| 23 | NESTED LOOPS | | 6 | 70T| 1666K|00:04:50.98 | 18M|
| 24 | RECURSIVE WITH PUMP | | 6 | | 16426 |00:00:00.25 | 2 |
|* 25 | TABLE ACCESS FULL | LINKS | 16426 | 95 | 1666K|00:06:32.63 | 18M|
-------------------------------------------------------------------------------------------------------------------------
Execution Plan (Extract) – Bacon/top250 (11,803 node subnetwork - 12,466 node / 583,993 link total)
• The E-Rows numbers here are often massive over-estimates, e.g. 18E estimated against 672K actual rows at step 6
• Let’s look at the run times for both queries across the datasets…
SQL for Shortest Paths - Performance - Results
• The one-recursive-subquery (1-RS) query ran for hours on top250 before being aborted, but is OK for small datasets
• The two-recursive-subquery (2-RS) query completed top250 in 796s when truncating at 5, and in 1,663s when truncating at 10
• 2-RS obtains a partial, approximate solution to enable early truncation of the paths
• The use of a hard-coded iteration limit in the first subquery has obvious limitations
  • If it’s too low, the first subquery will provide too little information to optimize the second
  • If it’s too large, the approximate subquery itself will have too much work to do
• We’ll see that using SQL within a PL/SQL algorithm will give better results…
Dataset        #Nodes (all)   #Links  #Nodes (sub)  Maxlev  #Secs (1-RS)  Truncate at  #Secs (2-RS)
-------------  ------------  -------  ------------  ------  ------------  -----------  ------------
three_subnets            14       13            11       3          0.01            3          0.02
foreign_keys            289      319            47       5          0.01            5          0.01
brightkite           58,228  214,078        56,739      10            NA            5           559
bacon/small             161    3,342           161       5             8            5           0.1
bacon/top250         12,466  583,993        11,803       7       Aborted            5           796
bacon/top250         12,466  583,993        11,803       7       Aborted           10         1,663
Two Algorithms with Code Timing
Two Algorithms with Code Timing (7 slides)
Two PL/SQL network analysis algorithms with code timing and performance analysis
Two Algorithms
Min Pathfinder Algorithm
• Truncate the solution table, min_tree_links, and insert the root node at level 0
• Loop while records are inserted
  • Insert a new node record at the next level:
    • for every link that is connected to a node at the current level
    • that does not exist in the table for any prior level
    • and does not appear at the next level for any other link with a higher ranked path
  • Commit
  • Increment level and inserts counter
  • Exit when no records inserted
• Return the number of records inserted
Subnetwork Grouper Algorithm
• Truncate the solution table, node_roots
• Loop while a new root node is found
  • Select a new root node id from nodes not in node_roots
  • Exit loop when none found
  • Call Ins_Min_Tree_Links to populate the solution table, min_tree_links, for the new root node
  • Insert all nodes in min_tree_links into node_roots against the new root node
• Code timing will show tuning opportunities in the initial implementation
• Shortest paths are inserted at each iteration, and all inserted are visible to the future iterations
• This avoids the inefficiency inherent in the pure SQL solutions
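The two algorithms can be sketched in Python to show their structure before looking at the PL/SQL. This is an illustration under stated assumptions rather than the implementation timed in this talk: dictionaries stand in for the min_tree_links and node_roots tables, the links for nodes 1 to 11 are inferred from the demo network listings, and nodes 12 to 14 with the link (12, 13) are hypothetical stand-ins for the two small subnetworks.

```python
# Links for nodes 1-11 inferred from the demo listings; nodes 12-14 and (12, 13) are hypothetical
NODES = range(1, 15)
LINKS = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 7), (7, 10),
         (3, 4), (3, 8), (5, 9), (6, 9), (9, 11), (12, 13)]


def min_tree_links(root, links):
    """Min Pathfinder: return {node: (prior_node, level)}, built one level at a time."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    tree = {root: (None, 0)}
    current, lev = [root], 0
    while current:                               # loop while records are inserted
        found = {}
        for node in current:
            for nxt in adj.get(node, []):
                if nxt not in tree:              # not present at any prior level
                    # keep one record per new node: the lowest prior node id
                    found[nxt] = min(found.get(nxt, node), node)
        for nxt, prior in found.items():
            tree[nxt] = (prior, lev + 1)
        current, lev = list(found), lev + 1
    return tree


def group_subnetworks(nodes, links):
    """Subnetwork Grouper: return {node: (root_node, level)} for every node."""
    node_roots = {}
    for node in sorted(nodes):                   # next node not yet in node_roots is a new root
        if node in node_roots:
            continue
        for member, (_, lev) in min_tree_links(node, links).items():
            node_roots[member] = (node, lev)
    return node_roots
```

Because every node reached is recorded before the next level is searched, a node is never revisited by a longer path, which is the property the pure SQL recursion lacks. Note that the root selection here is a simple scan in id order; in the PL/SQL version it is a query, which is where the tuning opportunity arises.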
Code Timing - Ins_Min_Tree_Links
FUNCTION Ins_Min_Tree_Links(
            p_root_node_id PLS_INTEGER)
            RETURN PLS_INTEGER IS
  l_lev      PLS_INTEGER := 0;
  l_ins      PLS_INTEGER;
  l_ins_tot  PLS_INTEGER := 0;
  -- Construct timer set, with root node in name
  l_ts_id    PLS_INTEGER := Timer_Set.Construct('Ins_Min_Tree_Links: ' || p_root_node_id);
BEGIN
  EXECUTE IMMEDIATE 'TRUNCATE TABLE min_tree_links';
  INSERT INTO min_tree_links VALUES (p_root_node_id, '', 0);
  LOOP
    -- Insert nodes linked to the current level that are not yet in the table
    INSERT INTO min_tree_links
    SELECT CASE WHEN lnk.node_id_fr = mlp_cur.node_id THEN lnk.node_id_to
                ELSE lnk.node_id_fr END,
           Min (mlp_cur.node_id),
           l_lev + 1
      FROM min_tree_links mlp_cur
      JOIN links lnk
        ON (lnk.node_id_fr = mlp_cur.node_id OR lnk.node_id_to = mlp_cur.node_id)
      LEFT JOIN min_tree_links mlp_pri
        ON mlp_pri.node_id = CASE WHEN lnk.node_id_fr = mlp_cur.node_id THEN lnk.node_id_to
                                  ELSE lnk.node_id_fr END
     WHERE mlp_pri.node_id IS NULL
       AND mlp_cur.lev = l_lev
     GROUP BY CASE WHEN lnk.node_id_fr = mlp_cur.node_id THEN lnk.node_id_to
                   ELSE lnk.node_id_fr END;
    l_ins := SQL%ROWCOUNT;
    COMMIT;
    l_ins_tot := l_ins_tot + l_ins;
    -- Time insert, with level and rows in name
    Timer_Set.Increment_Time(l_ts_id, 'Level: ' || l_lev || ', nodes: ' || l_ins);
    EXIT WHEN l_ins = 0;
    l_lev := l_lev + 1;
  END LOOP;
  -- Write timer set
  Utils.W(Timer_Set.Format_Results(l_ts_id));
  RETURN l_ins_tot;
END Ins_Min_Tree_Links;
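The code timing pattern used here (construct a timer set, increment a named timer after each section, then format the results) can be sketched in a few lines of Python. This is a hypothetical minimal accumulator written for illustration, not the Timer_Set package called in the PL/SQL: the class, its method names and the injectable clock are all assumptions, and it tracks elapsed time and call counts only, not CPU time.

```python
import time


class TimerSet:
    """Accumulates elapsed time and call counts against named timers."""

    def __init__(self, name, clock=time.perf_counter):
        self.name = name
        self._clock = clock          # injectable so the behaviour can be checked deterministically
        self._last = clock()         # timing starts at construction
        self.timers = {}             # timer name -> [elapsed seconds, calls]

    def increment_time(self, timer_name):
        """Add the time since the previous call (or construction) to the named timer."""
        now = self._clock()
        entry = self.timers.setdefault(timer_name, [0.0, 0])
        entry[0] += now - self._last
        entry[1] += 1
        self._last = now

    def format_results(self):
        """Return one header line, then one line per timer with elapsed time and calls."""
        lines = [f"Timer Set: {self.name}"]
        for timer_name, (elapsed, calls) in self.timers.items():
            lines.append(f"{timer_name:<30} {elapsed:>10.2f} {calls:>6}")
        return "\n".join(lines)
```

Using the same timer name on repeated calls aggregates the time under that name, which is how a label built from the level or from a size group yields one line per level or per group.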
Code Timing - Ins_Min_Tree_Links - Results
Timer Set: Ins_Min_Tree_Links: 10001, Constructed at 30 Jul 2022 16:07:35, written at 16:11:01
==============================================================================================
Timer Elapsed CPU Calls Ela/Call CPU/Call
----------------------- ---------- ---------- ---------- ------------- -------------
Level: 0, nodes: 38 0.05 0.04 1 0.04700 0.04000
Level: 1, nodes: 5169 0.04 0.03 1 0.04200 0.03000
Level: 2, nodes: 202118 13.77 13.72 1 13.76500 13.72000
Level: 3, nodes: 358824 104.69 100.59 1 104.69100 100.59000
Level: 4, nodes: 100099 75.15 74.11 1 75.14900 74.11000
Level: 5, nodes: 11298 9.61 9.61 1 9.60600 9.61000
Level: 6, nodes: 1865 1.15 1.14 1 1.14700 1.14000
Level: 7, nodes: 421 0.29 0.30 1 0.28900 0.30000
Level: 8, nodes: 170 0.16 0.16 1 0.16200 0.16000
Level: 9, nodes: 39 0.10 0.09 1 0.09700 0.09000
Level: 10, nodes: 11 0.07 0.08 1 0.07000 0.08000
Level: 11, nodes: 7 0.07 0.08 1 0.07300 0.08000
Level: 12, nodes: 0 0.07 0.06 1 0.07200 0.06000
(Other) 0.39 0.39 1 0.39400 0.39000
----------------------- ---------- ---------- ---------- ------------- -------------
Total 205.60 200.40 14 14.68600 14.31429
----------------------- ---------- ---------- ---------- ------------- -------------
[Timer timed (per call in ms): Elapsed: 0.02061, CPU: 0.02245]
Results for Bacon/only_tv_v Dataset (680,060 node subnetwork - 744,374 node / 22,503,060 link total)
• The results show a total elapsed time of 206 seconds
• There is a timer for each iteration, showing CPU and elapsed times, with the number of nodes inserted
• As you’d expect, the largest times correspond to the most nodes inserted…
  • …with the time per node increasing as the solution table fills up
• Each iteration corresponds to a single insert, for which we can get the execution plan…
Execution Plan - Ins_Min_Tree_Links Insert
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:00.07 | 6777 | | |
| 1 | LOAD TABLE CONVENTIONAL | MIN_TREE_LINKS | 1 | | 0 |00:00:00.07 | 6777 | | |
| 2 | HASH GROUP BY | | 1 | 2 | 0 |00:00:00.07 | 6777 | 1161K| 1161K|
| 3 | VIEW | VW_ORE_BC29D05C | 1 | 2 | 0 |00:00:00.07 | 6777 | | |
| 4 | UNION-ALL | | 1 | | 0 |00:00:00.07 | 6777 | | |
|* 5 | HASH JOIN ANTI | | 1 | 1 | 0 |00:00:00.04 | 3388 | 1106K| 1106K|
| 6 | NESTED LOOPS | | 1 | 33 | 10 |00:00:00.01 | 1709 | | |
| 7 | NESTED LOOPS | | 1 | 33 | 10 |00:00:00.01 | 1699 | | |
|* 8 | TABLE ACCESS FULL | MIN_TREE_LINKS | 1 | 1 | 7 |00:00:00.01 | 1683 | | |
|* 9 | INDEX RANGE SCAN | LINKS_FR_N1 | 7 | 33 | 10 |00:00:00.01 | 16 | | |
| 10 | TABLE ACCESS BY INDEX ROWID| LINKS | 10 | 33 | 10 |00:00:00.01 | 10 | | |
| 11 | TABLE ACCESS FULL | MIN_TREE_LINKS | 1 | 1 | 680K|00:00:00.01 | 1679 | | |
|* 12 | HASH JOIN ANTI | | 1 | 1 | 0 |00:00:00.03 | 3389 | 1106K| 1106K|
| 13 | NESTED LOOPS | | 1 | 33 | 12 |00:00:00.01 | 1710 | | |
| 14 | NESTED LOOPS | | 1 | 33 | 12 |00:00:00.01 | 1698 | | |
|* 15 | TABLE ACCESS FULL | MIN_TREE_LINKS | 1 | 1 | 7 |00:00:00.01 | 1682 | | |
|* 16 | INDEX RANGE SCAN | LINKS_TO_N1 | 7 | 33 | 12 |00:00:00.01 | 16 | | |
|* 17 | TABLE ACCESS BY INDEX ROWID| LINKS | 12 | 33 | 12 |00:00:00.01 | 12 | | |
| 18 | TABLE ACCESS FULL | MIN_TREE_LINKS | 1 | 1 | 680K|00:00:00.01 | 1679 | | |
--------------------------------------------------------------------------------------------------------------------------------
Execution Plan Extract
INSERT INTO min_tree_links
SELECT /*+ gather_plan_statistics XPLAN_MTL */
…
Add hint to obtain execution plan
Utils.W(Utils.Get_XPlan(p_sql_marker => 'XPLAN_MTL'));
Write execution plan using wrapper function
• No obvious problems
Code Timing - Ins_Node_Roots
Code Timing Output
Procedure with Code Timing
PROCEDURE Ins_Node_Roots IS
  l_root_id   PLS_INTEGER;
  l_ins_tot   PLS_INTEGER;
  l_ts_id     PLS_INTEGER := Timer_Set.Construct('Ins_Node_Roots');  -- construct timer set
  l_suffix    VARCHAR2(60);
BEGIN
  EXECUTE IMMEDIATE 'TRUNCATE TABLE node_roots';
  LOOP
    BEGIN
      SELECT id INTO l_root_id
        FROM nodes
       WHERE id NOT IN (SELECT node_id FROM node_roots)
         AND ROWNUM = 1;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        l_root_id := NULL;
    END;
    Timer_Set.Increment_Time(l_ts_id, 'SELECT id INTO l_root_id');  -- time node selector query
    EXIT WHEN l_root_id IS NULL;
    l_ins_tot := Ins_Min_Tree_Links(l_root_id);
    -- timer name suffix allows aggregation by subnetwork size group
    l_suffix := CASE WHEN l_ins_tot = 0 THEN '(1 node)'
                     WHEN l_ins_tot = 1 THEN '(2 nodes)'
                     WHEN l_ins_tot = 2 THEN '(3 nodes)'
                     WHEN l_ins_tot < 40 THEN '(4-39 nodes)'
                     ELSE '(root node ' || l_root_id || ', size: ' || (l_ins_tot + 1) || ')'
                END;
    Timer_Set.Increment_Time(l_ts_id, 'Insert min_tree_links ' || l_suffix);  -- time Ins_Min_Tree_Links by size group
    INSERT INTO node_roots tgt
    SELECT node_id, l_root_id, lev FROM min_tree_links;
    Timer_Set.Increment_Time(l_ts_id, 'Insert node_roots ' || l_suffix);  -- time insert node_roots by size group
  END LOOP;
  Utils.W(Timer_Set.Format_Results(l_ts_id));  -- write timer set
END Ins_Node_Roots;
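The timing pattern in the procedure can be sketched outside the database. The following is a minimal Python analogue, not the author's Timer_Set package: the class `TimerSet`, its methods and the helper `size_suffix` are hypothetical names made up for illustration, and the '(40+ nodes)' label is a simplification of the slide's per-root label. It assumes, as the procedure suggests, that each increment charges the time since the previous increment to the named timer.

```python
import time
from collections import OrderedDict


class TimerSet:
    """Hypothetical, simplified analogue of the Timer_Set package used on the slide."""

    def __init__(self, name):
        self.name = name
        self.timers = OrderedDict()          # timer name -> [elapsed, cpu, calls]
        self._ela = time.perf_counter()      # wall-clock reference point
        self._cpu = time.process_time()      # CPU reference point

    def increment_time(self, timer_name):
        """Charge the time since the previous call to timer_name, then reset the clock."""
        now_ela, now_cpu = time.perf_counter(), time.process_time()
        rec = self.timers.setdefault(timer_name, [0.0, 0.0, 0])
        rec[0] += now_ela - self._ela
        rec[1] += now_cpu - self._cpu
        rec[2] += 1
        self._ela, self._cpu = now_ela, now_cpu

    def format_results(self):
        """Return one line per timer: name, elapsed, CPU, calls."""
        lines = [f"Timer Set: {self.name}"]
        for name, (ela, cpu, calls) in self.timers.items():
            lines.append(f"{name:<40} {ela:10.5f} {cpu:10.5f} {calls:6d}")
        return "\n".join(lines)


def size_suffix(n_links):
    """Group timers by subnetwork size, in the spirit of l_suffix on the slide."""
    if n_links <= 2:
        return f"({n_links + 1} node{'s' if n_links else ''})"
    return "(4-39 nodes)" if n_links < 40 else "(40+ nodes)"


ts = TimerSet("demo")
for n_links in [0, 0, 1, 5, 2, 0]:
    sum(range(1000))                          # stand-in for the timed work
    ts.increment_time("Insert min_tree_links " + size_suffix(n_links))
print(ts.format_results())
```

Putting the size group into the timer name is what lets thousands of small calls aggregate into a handful of report lines.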
Code Timing - Ins_Node_Roots - Results
Code Timing Output
Timer Set: Ins_Node_Roots, Constructed at 30 Jul 2022 16:15:48, written at 16:44:22
===================================================================================
Timer Elapsed CPU Calls Ela/Call CPU/Call
--------------------------------------------------- ---------- ---------- ---------- ------------- -------------
SELECT id INTO l_root_id 1517.43 1506.68 19642 0.07725 0.07671
Insert min_tree_links (root node 579, size: 680060) 122.95 120.31 1 122.94500 120.31000
Insert node_roots (root node 579, size: 680060) 4.10 4.05 1 4.10400 4.05000
Insert min_tree_links (4-39 nodes) 20.21 23.07 5317 0.00380 0.00434
Insert node_roots (4-39 nodes) 1.56 1.61 5317 0.00029 0.00030
Insert min_tree_links (root node 646, size: 58) 0.01 0.01 1 0.00800 0.01000
Insert node_roots (root node 646, size: 58) 0.00 0.00 1 0.00000 0.00000
Insert min_tree_links (3 nodes) 7.14 7.29 2091 0.00341 0.00349
Insert node_roots (3 nodes) 0.50 0.62 2091 0.00024 0.00030
Insert min_tree_links (1 node) 24.91 24.76 8659 0.00288 0.00286
Insert node_roots (1 node) 2.18 1.75 8659 0.00025 0.00020
Insert min_tree_links (2 nodes) 11.74 11.67 3539 0.00332 0.00330
Insert node_roots (2 nodes) 0.88 1.42 3539 0.00025 0.00040
...
(Other) 0.00 0.00 1 0.00100 0.00000
--------------------------------------------------- ---------- ---------- ---------- ------------- -------------
Total 1714.02 1703.66 58925 0.02909 0.02891
--------------------------------------------------- ---------- ---------- ---------- ------------- -------------
[Timer timed (per call in ms): Elapsed: 0.01282, CPU: 0.01282]
Results for Bacon/only_tv_v Dataset (744,374 nodes and 22,503,060 links)
 The results show a total elapsed time of 1,714 seconds, almost 90% of it (1,517 seconds) from the SELECT timer
 To improve performance we need to focus first on that code section
 8,659 calls were made for the '(1 node)' suffix timers and 3,539 for the '(2 nodes)' ones
 Each of these calls also corresponds to one execution of SELECT id INTO l_root_id, together about 62% of the 19,642 calls to that line
 We can insert these 1- and 2-node node_roots records in single inserts prior to the main algorithm
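The proportions behind these bullets can be checked directly from the timer set output. This is a small arithmetic sketch using only figures copied from the report above; the variable names are illustrative.

```python
# Figures copied from the timer set output for the Bacon/only_tv_v dataset
total_elapsed = 1714.02          # seconds, Total line
select_elapsed = 1517.43         # seconds, 'SELECT id INTO l_root_id'
select_calls = 19642             # calls to the SELECT timer
one_node_calls = 8659            # '(1 node)' timers
two_node_calls = 3539            # '(2 nodes)' timers

# Share of total elapsed time spent in the root node selector query
select_share = select_elapsed / total_elapsed

# Each 1- or 2-node subnetwork costs one execution of the selector query
small_calls = one_node_calls + two_node_calls
small_call_share = small_calls / select_calls

print(f"SELECT share of elapsed time: {select_share:.1%}")
print(f"1/2-node subnetworks: {small_calls} of {select_calls} SELECT calls ({small_call_share:.1%})")
```

Handling the one- and two-node subnetworks up front would therefore remove a large fraction of the selector executions, in addition to their own inserts.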
Two Algorithms - Performance Considerations
Min Pathfinder
 The algorithm allows us to prune non-shortest paths early
 It does this by storing the paths at each iteration, and excluding nodes already reached from future iterations
 At the same time, each iteration uses a single SQL insert with subquery to process in an efficient set-based fashion
SQL Tuning
 We will find the resulting queries themselves can be tuned using query transformation and hints
Subnetwork Grouper
 The algorithm uses Min Pathfinder within a higher-level algorithm
 It thus benefits from Min Pathfinder's efficiency in identifying the subnetworks
 However, code timing identified two main areas in which a still more set-based approach could improve performance:
 Firstly, one- and two-node subnetworks do an insert for each node
 We could in fact insert all of these in a single set-based insert each, ahead of the main algorithm for the larger subnetworks
 Secondly, a root node selector query is executed for each subnetwork
 We may be able to find a way of selection that does not execute this at each iteration
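The two algorithms can be sketched in a few lines of in-memory Python to make their relationship concrete. This is an illustration only, not the deck's PL/SQL: the function names are hypothetical, links are assumed to be undirected for grouping purposes, and each pass over the frontier stands in for one set-based insert.

```python
from collections import defaultdict


def min_tree_links(adj, root):
    """Min Pathfinder idea: breadth-first levels from root, never revisiting a node."""
    level = {root: 0}
    frontier = [root]
    while frontier:
        nxt = []
        for node in frontier:                 # one pass = one set-based insert on the slides
            for nbr in adj[node]:
                if nbr not in level:          # exclude nodes already reached
                    level[nbr] = level[node] + 1
                    nxt.append(nbr)
        frontier = nxt
    return level


def group_subnetworks(nodes, links):
    """Subnetwork Grouper idea: pick an unused node as root, expand, repeat."""
    adj = defaultdict(set)
    for a, b in links:                        # assumption: links are treated as undirected
        adj[a].add(b)
        adj[b].add(a)
    node_roots = {}                           # node -> (root, level)
    for node in nodes:                        # root node selector: next node not yet assigned
        if node not in node_roots:
            for member, lev in min_tree_links(adj, node).items():
                node_roots[member] = (node, lev)
    return node_roots


nodes = [1, 2, 3, 4, 5, 6]
links = [(1, 2), (2, 3), (4, 5)]
roots = group_subnetworks(nodes, links)
print(roots)
```

In this toy form the root selection is a single pass over the node list, whereas the PL/SQL version runs a query per subnetwork, which is the cost the code timing exposed.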
Oracle Standard Profilers (2 slides)
Results from two standard Oracle profiling tools for the Subnetwork Grouper procedure
Flat Profiler
VAR RUN_ID NUMBER
DECLARE
  l_result PLS_INTEGER;
BEGIN
  l_result := DBMS_Profiler.Start_Profiler(
                run_comment => 'Profile for Ins_Node_Roots',
                run_number  => :RUN_ID);
  Shortest_Path_SQL_Base.Ins_Node_Roots;
  l_result := DBMS_Profiler.Stop_Profiler;
END;
/
@....dprof_queries :RUN_ID
Calling Flat Profiler
Profiler data by time (PLSQL_PROFILER_DATA)
    Seconds    Calls Unit                    Line# Line Text
----------- -------- ---------------------- ------ ----------------------------------------------------------
   1789.829    19642 SHORTEST_PATH_SQL_BASE     85 SELECT id INTO l_root_id FROM nodes WHERE id NOT IN (SELECT node_id FROM node_roots) AND ROWNUM = 1;
    128.850    31374 SHORTEST_PATH_SQL_BASE     15 INSERT INTO min_tree_links
     31.519    19641 SHORTEST_PATH_SQL_BASE     11 EXECUTE IMMEDIATE 'TRUNCATE TABLE min_tree_links';
      7.318    19641 SHORTEST_PATH_SQL_BASE     93 INSERT INTO node_roots tgt
      4.828    19641 SHORTEST_PATH_SQL_BASE     12 INSERT INTO min_tree_links VALUES (p_root_node_id, '', 0);
      1.897    31374 SHORTEST_PATH_SQL_BASE     31 COMMIT;
      0.071    31374 SHORTEST_PATH_SQL_BASE     30 l_ins := SQL%ROWCOUNT;
...
157179 rows selected.
Callouts for the call above:
 Start… and stop profiler, around the call to be profiled
 Custom reporting script, passed run id
 The line text is obtained by joining the system view all_source to the profiler data on package and line number
Hierarchical Profiler
VAR RUN_ID NUMBER
BEGIN
  HProf_Utils.Start_Profiling;
  Shortest_Path_SQL_Base.Ins_Node_Roots;
  :RUN_ID := HProf_Utils.Stop_Profiling(
               p_run_comment => 'Profile for Ins_Node_Roots',
               p_filename    => 'hp_ins_node_roots_&SUB..html');
END;
/
@....hprof_queries :RUN_ID
Calling Hierarchical Profiler
Profiler data by time (PLSQL_PROFILER_DATA)
Function tree Owner Module Inst. Subtree MicroS Function MicroS Calls
------------------------------------ ------------------ ------------------------- ------ -------------- --------------- -------
INS_NODE_ROOTS SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 1668506464 332444 1
__static_sql_exec_line85 SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 1490685875 1490685875 19642
INS_MIN_TREE_LINKS SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 169705305 1195428 19641
__static_sql_exec_line15 SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 141321652 141321652 31378
__dyn_sql_exec_line11 SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 20610292 9775994 19641
__plsql_vm 1 of 2 10834298 71027 19641
__anonymous_block 1 of 2 10763826 2971230 19642
IS_VPD_ENABLED SYS IS_VPD_ENABLED 1 of 2 6934407 395925 39284
__static_sql_exec_line22 SYS IS_VPD_ENABLED 1 of 2 6538482 6538482 39284
DICTIONARY_OBJ_OWNER SYS DICTIONARY_OBJ_OWNER 1 of 2 812866 812866 39284
DICTIONARY_OBJ_NAME SYS DICTIONARY_OBJ_NAME 1 of 2 45323 45323 39284
__static_sql_exec_line12 SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 4413730 4413730 19641
__static_sql_exec_line31 SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 2164203 2164203 31378
__static_sql_exec_line93 SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 7706832 7706832 19641
__dyn_sql_exec_line81 SHORTEST_PATH_SQL SHORTEST_PATH_SQL_BASE 76008 75449 1
__plsql_vm 2 of 2 559 4 1
__static_sql_exec_line700 SYS DBMS_HPROF 128 128 1
STOP_PROFILING LIB HPROF_UTILS 22 22 1
STOP_PROFILING SYS DBMS_HPROF 0 0 1
Callouts for the call above:
 HProf_Utils: custom wrapper package around start… and stop profiling
 Shortest_Path_SQL_Base.Ins_Node_Roots: call to be profiled
 p_filename: HTML results filename
 Custom reporting script, passed run id
Tuning 1 - SQL for Isolated Nodes (5 slides)
Recap of join methods and types, then queries with antijoin structures and hints
SQL Join Definitions
Join Methods
Nested Loops Join - Nested loops join an outer data set to an inner data set
For each row in the outer data set that matches the single-table predicates, the database retrieves all rows in the inner data set that satisfy the join predicate. If an index is available, then the database can use it to access the inner data set by rowid
Hash Join - The database uses a hash join to join larger data sets
The optimizer uses the smaller of two data sets to build a hash table on the join key in memory, using a deterministic hash function to specify the location in the hash table in which to store each row. The database then scans the larger data set, probing the hash table to find the rows that meet the join condition
Join Types
Antijoin
An antijoin is a join between two data sets that returns a row from the first set when a matching row does not exist in the subquery data set.
Like a semijoin, an antijoin stops processing the subquery data set when the first match is found. Unlike a semijoin, the antijoin only returns a row when no match is found
Extracted from: SQL Tuning Guide, 21c
SQL for Isolated Nodes: SQL 1 - Not Exists / Or
Execution Plan - Ran on only_tv_v dataset (744,374 nodes and 22,503,060 links)
INSERT INTO node_roots
SELECT nod.id, nod.id, 0
  FROM nodes nod
 WHERE NOT EXISTS (SELECT 1
                     FROM links lnk
                    WHERE lnk.node_id_fr = nod.id
                       OR lnk.node_id_to = nod.id);
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:25.78 | 191K| 93127 |
| 1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |00:00:25.78 | 191K| 93127 |
|* 2 | HASH JOIN ANTI | | 1 | 53174 | 8659 |00:00:25.73 | 176K| 93122 |
| 3 | INDEX FAST FULL SCAN | SYS_C0018310 | 1 | 744K| 744K|00:00:00.08 | 1461 | 0 |
| 4 | VIEW | VW_SQ_1 | 1 | 45M| 45M|00:00:22.86 | 174K| 93122 |
| 5 | UNION-ALL | | 1 | | 45M|00:00:15.93 | 174K| 93122 |
| 6 | TABLE ACCESS FULL | LINKS | 1 | 22M| 22M|00:00:02.61 | 87315 | 46561 |
| 7 | TABLE ACCESS FULL | LINKS | 1 | 22M| 22M|00:00:02.19 | 87315 | 46561 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("VW_COL_1"="NOD"."ID")
 Isolated nodes: all the nodes that are present in the nodes table but not in the links table
 Can be expressed in a single SQL statement for the insert
 Query obtains the 8,659 isolated nodes in 21 seconds
 S5: OR transformed into UNION ALL of two full links scans, S6/7
 S4: View of 45M rows used as probe table in hash antijoin, S2…
 S3: …with scan of nodes unique index as the build table
 UNION ALL results in a single hash antijoin, with a probe table twice the size of links
 What if we replaced the NOT EXISTS with explicit antijoins?...
SQL for Isolated Nodes: SQL 2 - Outer Joins Unhinted
Execution Plan
INSERT INTO node_roots
SELECT nod.id, nod.id, 0
  FROM nodes nod
  LEFT JOIN links lnk_f
    ON lnk_f.node_id_fr = nod.id
  LEFT JOIN links lnk_t
    ON lnk_t.node_id_to = nod.id
 WHERE lnk_f.node_id_fr IS NULL
   AND lnk_t.node_id_fr IS NULL;
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:12.48 | 191K| 93127 |
| 1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |00:00:12.48 | 191K| 93127 |
|* 2 | HASH JOIN ANTI | | 1 | 532 | 8659 |00:00:12.43 | 176K| 93122 |
|* 3 | HASH JOIN ANTI | | 1 | 53174 | 57851 |00:00:08.41 | 88776 | 46561 |
| 4 | INDEX FAST FULL SCAN | SYS_C0018310 | 1 | 744K| 744K|00:00:00.07 | 1461 | 0 |
| 5 | TABLE ACCESS FULL | LINKS | 1 | 22M| 22M|00:00:02.58 | 87315 | 46561 |
| 6 | TABLE ACCESS FULL | LINKS | 1 | 22M| 22M|00:00:01.83 | 87315 | 46561 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("LNK_T"."NODE_ID_TO"="NOD"."ID")
3 - access("LNK_F"."NODE_ID_FR"="NOD"."ID")
 Convert NOT EXISTS into outer antijoins
 Query obtains the 8,659 isolated nodes in 12 seconds
 S4: An index scan of the unique index on nodes as the build table for a hash antijoin, S3
 S5: Full scan of the links table as the probe table
 S2: Hash antijoin uses result set as the build table
 S6: With another full scan of the links table as the second probe table
 Where the CBO in SQL-1 used a view/union and a single hash antijoin…
 …the two outer joins resulted in two hash antijoins, but with smaller probe tables, and ran faster
 How would this compare with a plan using nested loop joins?
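That the NOT EXISTS / OR form and the two-outer-join form select the same rows can be checked on a toy data set. The sketch below uses Python's built-in SQLite purely as a runnable stand-in; it says nothing about Oracle's execution plans, and the five-node data set is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY);
    CREATE TABLE links (node_id_fr INTEGER NOT NULL, node_id_to INTEGER NOT NULL);
    INSERT INTO nodes VALUES (1), (2), (3), (4), (5);
    INSERT INTO links VALUES (1, 2), (2, 3);      -- nodes 4 and 5 are isolated
""")

# SQL 1 shape: a single NOT EXISTS with an OR condition
not_exists_sql = """
    SELECT nod.id
      FROM nodes nod
     WHERE NOT EXISTS (SELECT 1
                         FROM links lnk
                        WHERE lnk.node_id_fr = nod.id
                           OR lnk.node_id_to = nod.id)
     ORDER BY nod.id"""

# SQL 2 shape: two outer joins filtered on NULL (antijoin pattern).
# The slide tests lnk_t.node_id_fr; either column works when both are NOT NULL.
outer_join_sql = """
    SELECT nod.id
      FROM nodes nod
      LEFT JOIN links lnk_f ON lnk_f.node_id_fr = nod.id
      LEFT JOIN links lnk_t ON lnk_t.node_id_to = nod.id
     WHERE lnk_f.node_id_fr IS NULL
       AND lnk_t.node_id_to IS NULL
     ORDER BY nod.id"""

isolated_1 = [row[0] for row in conn.execute(not_exists_sql)]
isolated_2 = [row[0] for row in conn.execute(outer_join_sql)]
print(isolated_1, isolated_2)
```

The rewrite changes only how the condition is expressed; the performance difference on the slides comes from the plans the optimizer chooses for each form.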
SQL for Isolated Nodes: SQL 3 - Outer Joins Hinted
Execution Plan
INSERT INTO node_roots
SELECT /*+gather_plan_statistics USE_NL (lnk_f) USE_NL (lnk_t)*/
       nod.id, nod.id, 0
  FROM nodes nod
  LEFT JOIN links lnk_f
    ON lnk_f.node_id_fr = nod.id
  LEFT JOIN links lnk_t
    ON lnk_t.node_id_to = nod.id
 WHERE lnk_f.node_id_fr IS NULL
   AND lnk_t.node_id_fr IS NULL;
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:01.27 | 624K| 5 |
| 1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |00:00:01.27 | 624K| 5 |
| 2 | NESTED LOOPS ANTI | | 1 | 532 | 8659 |00:00:00.89 | 622K| 0 |
| 3 | NESTED LOOPS ANTI | | 1 | 53174 | 57851 |00:00:01.04 | 506K| 0 |
| 4 | INDEX FAST FULL SCAN | SYS_C0018310 | 1 | 744K| 744K|00:00:00.11 | 1461 | 0 |
|* 5 | INDEX RANGE SCAN | LINKS_FR_N1 | 744K| 20M| 686K|00:00:00.83 | 505K| 0 |
|* 6 | INDEX RANGE SCAN | LINKS_TO_N1 | 57851 | 22M| 49192 |00:00:00.15 | 115K| 0 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("LNK_F"."NODE_ID_FR"="NOD"."ID")
6 - access("LNK_T"."NODE_ID_TO"="NOD"."ID")
 Hint to use nested loops joins: USE_NL (lnk_f) USE_NL (lnk_t)
 Obtains the 8,659 isolated nodes in 1.5 seconds
 S2, S3: Two nested loops antijoins
 S4: Drives off full scan of the unique index on nodes
 S5: First join to From index on links
 S6: Then join to To index on links
 Estimated rows for the two range scans are much higher than the actual rows returned. Let’s look at it…
SQL for Isolated Nodes: SQL 3 - Nested Loops Analysis
Execution Plan
----------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
----------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |
| 1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |
| 2 | NESTED LOOPS ANTI | | 1 | 532 | 8659 |
| 3 | NESTED LOOPS ANTI | | 1 | 53174 | 57851 |
| 4 | INDEX FAST FULL SCAN | SYS_C0018310 | 1 | 744K| 744K|
|* 5 | INDEX RANGE SCAN | LINKS_FR_N1 | 744K| 20M| 686K|
|* 6 | INDEX RANGE SCAN | LINKS_TO_N1 | 57851 | 22M| 49192 |
----------------------------------------------------------------------------
E-Rows Anomaly
 E-Rows of 20M in S5 and 22M in S6 seem to assume getting all matches
 They also seem to be totals across all starts, whereas E-Rows is usually per start
 But, as we saw in the definitions, antijoins get only the first match
 This is reflected in the A-Rows of 686K and 49,192
 It is almost as though (to speculate):
 The SQL engine is smart enough to know that, in the context of the antijoin, there is no point in bringing back all the joining records when these will all be eliminated later
 But the CBO is not, and for that reason chooses a bad plan when unhinted
 Anyway, it’s important to note that the CBO does not always choose the optimal join method
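The first-match behaviour can be made concrete with a toy nested loops antijoin that counts the inner rows it actually fetches. This is a conceptual sketch in Python, not a model of Oracle internals; the function name and data are invented. It mirrors the plan's figures, where the actual rows at each range scan equal the starts minus the rows that survive the antijoin (for S6: 57,851 - 8,659 = 49,192).

```python
from collections import defaultdict


def nested_loops_anti(outer_ids, inner_keys):
    """Return outer ids with no match, plus the number of inner rows actually fetched."""
    index = defaultdict(list)                 # stand-in for an index on the join column
    for key in inner_keys:
        index[key].append(key)
    kept, rows_fetched = [], 0
    for oid in outer_ids:                     # one 'start' of the inner scan per outer row
        for _ in index.get(oid, ()):
            rows_fetched += 1
            break                             # first match is enough to reject the outer row
        else:
            kept.append(oid)                  # no match at all: the antijoin returns the row
    return kept, rows_fetched


outer = [1, 2, 3, 4]
inner = [1, 1, 1, 1, 2, 2, 3]                 # 7 join matches in total
kept, fetched = nested_loops_anti(outer, inner)
print(kept, fetched)
```

A full join would touch all 7 matching rows here; the antijoin fetches at most one per outer row, which is why an estimate based on all matches overstates the work.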
Tuning 2 - SQL for Isolated Links (8 slides)
Disastrous ‘Bitmap Or’ expansion, good and bad antijoin plans and efficient group counting query
SQL for Isolated Links: SQL 1 - Not Exists / 4-way Or
INSERT INTO node_roots
WITH isolated_links AS (
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
     WHERE NOT EXISTS (
             SELECT 1
               FROM links lnk_1
              WHERE (lnk_1.node_id_fr = lnk.node_id_to OR
                     lnk_1.node_id_to = lnk.node_id_fr OR
                     lnk_1.node_id_fr = lnk.node_id_fr OR
                     lnk_1.node_id_to = lnk.node_id_to)
                AND lnk_1.ROWID != lnk.ROWID))
SELECT node_id_fr, node_id_fr, 0
  FROM isolated_links
 UNION
SELECT node_id_to, node_id_fr, 1
  FROM isolated_links
 Isolated links: links that do not connect to any other links
 Each of the from and to nodes is neither a from nor a to node in any other link
 NOT EXISTS links record matching any of 4 conditions
 And not the driving links record itself
 For record passing the NOT EXISTS:
 Add both from and to nodes into node_roots
 Ran on pre1950 dataset (134,131 nodes and 8,095,294 links)
 Obtains the 425 isolated links in 4,103 seconds!
 Let’s look at the execution plan…
SQL for Isolated Links: SQL 1 - Not Exists / 4-way Or - Execution Plan
Execution Plan (Extract)
 S7: CBO transforms the OR conditions into a 4-section BITMAP OR
 S6: Then a BITMAP CONVERSION TO ROWIDS and
 S5: A links table access to filter the driving instance (S3)
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
-----------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:41:31.41 |
| 1 | TEMP TABLE TRANSFORMATION | | 1 | | 0 |00:41:31.41 |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6C4F_4F65443 | 1 | | 0 |00:41:31.37 |
|* 3 | FILTER | | 1 | | 425 |01:20:08.08 |
| 4 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:01.57 |
| 5 | TABLE ACCESS BY INDEX ROWID BATCHED | LINKS | 8095K| 1 | 8094K|01:08:05.02 |
|* 6 | BITMAP CONVERSION TO ROWIDS | | 8095K| | 8094K|01:07:45.55 |
| 7 | BITMAP OR | | 8095K| | 8094K|01:07:35.34 |
|* 8 | BITMAP CONVERSION FROM ROWIDS | | 8095K| | 7978K|00:09:42.09 |
|* 9 | INDEX RANGE SCAN | LINKS_TO_N1 | 8095K| | 3076M|00:08:56.46 |
|* 10 | BITMAP CONVERSION FROM ROWIDS | | 8095K| | 8086K|00:17:19.09 |
|* 11 | INDEX RANGE SCAN | LINKS_TO_N1 | 8095K| | 5926M|00:16:17.18 |
|* 12 | BITMAP CONVERSION FROM ROWIDS | | 8095K| | 7974K|00:09:27.91 |
|* 13 | INDEX RANGE SCAN | LINKS_FR_N1 | 8095K| | 3076M|00:08:41.13 |
|* 14 | BITMAP CONVERSION FROM ROWIDS | | 8095K| | 8086K|00:19:00.44 |
|* 15 | INDEX RANGE SCAN | LINKS_FR_N1 | 8095K| | 6232M|00:17:21.29 |
| 16 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |00:00:00.04 |
| 17 | HASH UNIQUE | | 1 | 16M| 850 |00:00:00.03 |
| 18 | UNION-ALL | | 1 | | 850 |00:00:00.01 |
| 19 | VIEW | | 1 | 8095K| 425 |00:00:00.01 |
| 20 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C4F_4F65443 | 1 | 8095K| 425 |00:00:00.01 |
| 21 | VIEW | | 1 | 8095K| 425 |00:00:00.01 |
| 22 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C4F_4F65443 | 1 | 8095K| 425 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------------------------
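 8095K starts, S5-S15: the subquery steps run once for each driving links record
 A-Rows very high: between 3,076M and 6,232M rows from each index range scan (S9, S11, S13, S15)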
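The scale of the problem is easier to see per driving row. The following arithmetic sketch uses only the Starts and A-Rows values from the plan extract above (which are rounded to thousands and millions in the plan output, so the result is approximate).

```python
# A-Rows and Starts copied from the SQL 1 plan extract (pre1950 dataset)
starts = 8_095_000                            # S5-S15 each start once per driving link (8095K)
range_scan_rows_m = {9: 3076, 11: 5926, 13: 3076, 15: 6232}   # step id -> A-Rows, in millions

# Total index entries returned by the four range scans
total_rows = sum(range_scan_rows_m.values()) * 1_000_000

# Average number of index entries fetched for each driving links record
rows_per_start = total_rows / starts

print(f"Index entries visited: {total_rows:,}")
print(f"Average per driving link: {rows_per_start:,.0f}")
```

Each driving link thus fetches a couple of thousand index entries before the BITMAP OR, rather than stopping at a first match, which is consistent with the elapsed time being dominated by steps S8 to S15.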
SQL for Isolated Links: SQL 2 - 4 Not Exists Subqueries
INSERT INTO node_roots
WITH isolated_links AS (
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
     WHERE NOT EXISTS (
             SELECT 1
               FROM links lnk_1
              WHERE lnk_1.node_id_fr = lnk.node_id_fr
                AND lnk_1.ROWID != lnk.ROWID)
       AND NOT EXISTS (
             SELECT 1
               FROM links lnk_2
              WHERE lnk_2.node_id_to = lnk.node_id_to
                AND lnk_2.ROWID != lnk.ROWID)
       AND NOT EXISTS (
             SELECT 1
               FROM links lnk_3
              WHERE (lnk_3.node_id_fr = lnk.node_id_to)
                AND lnk_3.ROWID != lnk.ROWID)
       AND NOT EXISTS (
             SELECT 1
               FROM links lnk_4
              WHERE (lnk_4.node_id_to = lnk.node_id_fr)
                AND lnk_4.ROWID != lnk.ROWID))
SELECT node_id_fr, node_id_fr, 0
  FROM isolated_links
 UNION
SELECT node_id_to, node_id_fr, 1
  FROM isolated_links
 Split the NOT EXISTS with 4 conditions into…
 A NOT EXISTS for each condition, replicating…
 …the ‘not the driving links record’ condition
 Obtains the 425 isolated links in 20 seconds, much faster!
 Let’s look at the execution plan…
SQL for Isolated Links: SQL 2 - 4 Not Exists Subqueries - Execution Plan
Execution Plan (Extract)
 S9: Plan starts with a hash antijoin on full scans of links…
 S7,5,3: Then a sequence of hash right antijoins on result sets to full scans of links
 …where right means the build table/probe table choice is reversed from the default
 …making the build table the (smaller) result set
 Note that the A-Rows drop rapidly from 116K (S9) as the sequence progresses, down to 425 (S3)
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
-----------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:12.78 |
| 1 | TEMP TABLE TRANSFORMATION | | 1 | | 0 |00:00:12.78 |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6C1D_4F65443 | 1 | | 0 |00:00:12.77 |
|* 3 | HASH JOIN RIGHT ANTI | | 1 | 8095K| 425 |00:00:13.60 |
| 4 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.69 |
|* 5 | HASH JOIN RIGHT ANTI | | 1 | 8095K| 484 |00:00:09.94 |
| 6 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.59 |
|* 7 | HASH JOIN RIGHT ANTI | | 1 | 8095K| 4196 |00:00:05.59 |
| 8 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.58 |
|* 9 | HASH JOIN ANTI | | 1 | 8095K| 116K|00:00:04.99 |
| 10 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.59 |
| 11 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.55 |
| 12 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |00:00:00.02 |
| 13 | HASH UNIQUE | | 1 | 16M| 850 |00:00:00.01 |
| 14 | UNION-ALL | | 1 | | 850 |00:00:00.01 |
| 15 | VIEW | | 1 | 8095K| 425 |00:00:00.01 |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C1D_4F65443 | 1 | 8095K| 425 |00:00:00.01 |
| 17 | VIEW | | 1 | 8095K| 425 |00:00:00.01 |
| 18 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C1D_4F65443 | 1 | 8095K| 425 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------------------------
SQL for Isolated Links: SQL 3 - 4 Outer Joins
INSERT INTO node_roots
WITH isolated_links AS (
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
      LEFT JOIN links lnk_1
        ON (lnk_1.node_id_fr = lnk.node_id_fr
            AND lnk_1.ROWID != lnk.ROWID)
      LEFT JOIN links lnk_2
        ON (lnk_2.node_id_fr = lnk.node_id_to
            AND lnk_2.ROWID != lnk.ROWID)
      LEFT JOIN links lnk_3
        ON (lnk_3.node_id_to = lnk.node_id_fr
            AND lnk_3.ROWID != lnk.ROWID)
      LEFT JOIN links lnk_4
        ON (lnk_4.node_id_to = lnk.node_id_to
            AND lnk_4.ROWID != lnk.ROWID)
     WHERE lnk_1.node_id_fr IS NULL
       AND lnk_2.node_id_fr IS NULL
       AND lnk_3.node_id_to IS NULL
       AND lnk_4.node_id_to IS NULL
)
SELECT node_id_fr, node_id_fr, 0
  FROM isolated_links
 UNION
SELECT node_id_to, node_id_fr, 1
  FROM isolated_links
 Replace each NOT EXISTS with an outer antijoin
 This worked well for isolated nodes, where the plan used hash antijoins…
 …and almost halved the time compared with NOT EXISTS
 Here it obtains the 425 isolated links in 1,259 seconds, much slower!
 Let’s look at the execution plan…
SQL for Isolated Links: SQL 3 - 4 Outer Joins - Execution Plan
Execution Plan (Extract)
 S13, S12: The hash join anti step has been replaced by hash join outer / filter pair of steps
 S7,5,3: And the sequence of hash join right anti steps has been replaced by…
 hash join right outer / filter pairs of steps
 The outer joins have not been recognised as antijoins, causing much more intermediate work
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
-----------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:17:46.54 |
| 1 | TEMP TABLE TRANSFORMATION | | 1 | | 0 |00:17:46.54 |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6C1E_4F65443 | 1 | | 0 |00:17:46.52 |
|* 3 | FILTER | | 1 | | 425 |00:17:46.11 |
|* 4 | HASH JOIN RIGHT OUTER | | 1 | 8095K| 1302 |00:17:46.21 |
| 5 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.73 |
|* 6 | FILTER | | 1 | | 472 |00:15:41.39 |
|* 7 | HASH JOIN RIGHT OUTER | | 1 | 8095K| 49224 |00:12:50.53 |
| 8 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.68 |
|* 9 | FILTER | | 1 | | 3724 |00:15:31.22 |
|* 10 | HASH JOIN RIGHT OUTER | | 1 | 8095K| 267K|00:15:16.91 |
| 11 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.61 |
|* 12 | FILTER | | 1 | | 8819 |00:14:59.94 |
|* 13 | HASH JOIN OUTER | | 1 | 8095K| 6232M|00:18:35.29 |
| 14 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.59 |
| 15 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.55 |
| 16 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |00:00:00.02 |
| 17 | HASH UNIQUE | | 1 | 16M| 850 |00:00:00.01 |
| 18 | UNION-ALL | | 1 | | 850 |00:00:00.01 |
| 19 | VIEW | | 1 | 8095K| 425 |00:00:00.01 |
| 20 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C1E_4F65443 | 1 | 8095K| 425 |00:00:00.01 |
| 21 | VIEW | | 1 | 8095K| 425 |00:00:00.01 |
| 22 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C1E_4F65443 | 1 | 8095K| 425 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------------------------
SQL for Isolated Links: SQL 4 - Group Counting
INSERT INTO node_roots
WITH all_nodes AS (
    SELECT node_id_fr node_id, 'F' tp
      FROM links
     UNION ALL
    SELECT node_id_to, 'T'
      FROM links
), unique_nodes AS (
    SELECT node_id, Max(tp) tp
      FROM all_nodes
     GROUP BY node_id
    HAVING COUNT(*) = 1
), isolated_links AS (
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
      JOIN unique_nodes frn
        ON frn.node_id = lnk.node_id_fr
       AND frn.tp = 'F'
      JOIN unique_nodes ton
        ON ton.node_id = lnk.node_id_to
       AND ton.tp = 'T'
)
SELECT node_id_fr, node_id_fr, 0
  FROM isolated_links
 UNION ALL
SELECT node_id_to, node_id_fr, 1
  FROM isolated_links
 Re-define the logic for an isolated link as:
 Its from and to node both appear in exactly one link
 Avoids the expensive self-join of links in favour of a group counting query
 all_nodes:
 Gets all node instances with a type of F(rom) or T(o)
 unique_nodes:
 Selects from all_nodes the nodes having exactly one instance, along with its type
 isolated_links:
 Selects all links and inner-joins them to unique_nodes on both ends
 main section:
 Adds both nodes with from node as root
 Obtains the 425 isolated links in 1.5 seconds!
 Let’s look at the execution plan…
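The redefinition can be checked against the self-join definition on a small example. The sketch below again uses Python's built-in SQLite as a runnable stand-in for Oracle, with `rowid` in place of ROWID and a six-link data set invented for illustration. One caveat, stated as an observation about the two definitions rather than about the author's data: a link from a node to itself would count that node twice, so the two forms could differ for self-loops; the toy data avoids them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (node_id_fr INTEGER NOT NULL, node_id_to INTEGER NOT NULL);
    -- (7, 8) and (9, 10) are isolated links; 1-2-3 and 4-5, 4-6 are larger subnetworks
    INSERT INTO links VALUES (1, 2), (2, 3), (7, 8), (4, 5), (4, 6), (9, 10);
""")

# SQL 1 shape: NOT EXISTS with a 4-way OR against every other link
self_join_sql = """
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
     WHERE NOT EXISTS (
             SELECT 1
               FROM links lnk_1
              WHERE (lnk_1.node_id_fr = lnk.node_id_to OR
                     lnk_1.node_id_to = lnk.node_id_fr OR
                     lnk_1.node_id_fr = lnk.node_id_fr OR
                     lnk_1.node_id_to = lnk.node_id_to)
                AND lnk_1.rowid != lnk.rowid)
     ORDER BY 1"""

# SQL 4 shape: count node instances once, then join links to the single-instance nodes
group_count_sql = """
    WITH all_nodes AS (
        SELECT node_id_fr node_id, 'F' tp FROM links
         UNION ALL
        SELECT node_id_to, 'T' FROM links
    ), unique_nodes AS (
        SELECT node_id, Max(tp) tp
          FROM all_nodes
         GROUP BY node_id
        HAVING COUNT(*) = 1
    )
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
      JOIN unique_nodes frn ON frn.node_id = lnk.node_id_fr AND frn.tp = 'F'
      JOIN unique_nodes ton ON ton.node_id = lnk.node_id_to AND ton.tp = 'T'
     ORDER BY 1"""

by_self_join = conn.execute(self_join_sql).fetchall()
by_group_count = conn.execute(group_count_sql).fetchall()
print(by_self_join, by_group_count)
```

The group counting form reads links a fixed number of times regardless of how connected the nodes are, whereas the self-join form does work proportional to the number of neighbouring links of each driving link.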
SQL for Isolated Links: SQL 4 - Group Counting - Execution Plan
Execution Plan (Extract)
 The plan shows two LOAD AS SELECTs
 The first does a HASH GROUP BY, S4, on a UNION ALL of full scans on links; most of the time goes here
 The filter step, S3, shows only 1,797 rows, making the rest of the query very fast – early pruning!
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
-----------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:01.83 |
| 1 | TEMP TABLE TRANSFORMATION | | 1 | | 0 |00:00:01.83 |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6C20_4F65443 | 1 | | 0 |00:00:01.82 |
|* 3 | FILTER | | 1 | | 1797 |00:00:01.91 |
| 4 | HASH GROUP BY | | 1 | 26 | 132K|00:00:01.82 |
| 5 | VIEW | | 1 | 16M| 16M|00:00:00.36 |
| 6 | UNION-ALL | | 1 | | 16M|00:00:00.34 |
| 7 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.16 |
| 8 | TABLE ACCESS FULL | LINKS | 1 | 8095K| 8095K|00:00:00.15 |
| 9 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6C21_4F65443 | 1 | | 0 |00:00:00.01 |
|* 10 | HASH JOIN | | 1 | 1 | 425 |00:00:00.01 |
|* 11 | VIEW | | 1 | 26 | 901 |00:00:00.01 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C20_4F65443 | 1 | 26 | 1797 |00:00:00.01 |
| 13 | NESTED LOOPS | | 1 | 1685 | 896 |00:00:00.01 |
| 14 | NESTED LOOPS | | 1 | 1690 | 896 |00:00:00.01 |
|* 15 | VIEW | | 1 | 26 | 896 |00:00:00.01 |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C20_4F65443 | 1 | 26 | 1797 |00:00:00.01 |
|* 17 | INDEX RANGE SCAN | LINKS_TO_N1 | 896 | 65 | 896 |00:00:00.01 |
| 18 | TABLE ACCESS BY INDEX ROWID | LINKS | 896 | 65 | 896 |00:00:00.01 |
| 19 | LOAD TABLE CONVENTIONAL | NODE_ROOTS | 1 | | 0 |00:00:00.01 |
| 20 | UNION-ALL | | 1 | | 850 |00:00:00.01 |
| 21 | VIEW | | 1 | 1 | 425 |00:00:00.01 |
| 22 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C21_4F65443 | 1 | 1 | 425 |00:00:00.01 |
| 23 | VIEW | | 1 | 1 | 425 |00:00:00.01 |
| 24 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6C21_4F65443 | 1 | 1 | 425 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------------------------
Tuning 3 - SQL for Root Node Selector (4 slides)
Code timing several methods for root node selection
SQL for Root Node Selector: Method 0 - Select from Unused Nodes
(Unordered)
SELECT id INTO l_root_id
  FROM nodes
 WHERE id NOT IN (SELECT node_id FROM node_roots)
   AND ROWNUM = 1
 Code timing showed root node selection took 90% of the time on the Bacon/only_tv_v dataset
(744,374 nodes and 22,503,060 links)
 Execution plan shows a nested loops antijoin from the nodes index to the root nodes index
 We’ll try two variants with different queries and ordering added, then try a different approach
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.37 | 42508 |
|* 1 | COUNT STOPKEY | | 1 | | 0 |00:00:00.37 | 42508 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.37 | 42508 |
| 3 | NESTED LOOPS ANTI SNA| | 1 | 20 | 0 |00:00:00.36 | 40791 |
| 4 | INDEX FAST FULL SCAN| SYS_C0018310 | 1 | 520 | 744K|00:00:00.09 | 1460 |
|* 5 | INDEX UNIQUE SCAN | NODE_ROOTS_N1 | 744K| 714K| 744K|00:00:00.23 | 39331 |
|* 6 | TABLE ACCESS FULL | NODE_ROOTS | 1 | 1 | 0 |00:00:00.01 | 1717 |
---------------------------------------------------------------------------------------------------
Execution Plan (Extract)
Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
               303       41      58                     221      42        524
Elapsed Times
 The base method, with no ordering, took 303 seconds
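The NESTED LOOPS ANTI SNA operation, and the separate full scan of NODE_ROOTS under the FILTER step, appear to reflect the null-aware semantics of NOT IN: if the subquery returns any NULL, NOT IN can no longer be true for any row. A minimal sketch of that behaviour follows, using Python's standard sqlite3 module as a stand-in for Oracle (an assumption, made so the example is self-contained); the table contents and the NOT EXISTS variant are illustrative and not taken from the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY);
    CREATE TABLE node_roots (node_id INTEGER);  -- nullable column (assumption)
    INSERT INTO nodes VALUES (1), (2), (3);
    INSERT INTO node_roots VALUES (1);
""")

NOT_IN = ("SELECT id FROM nodes "
          "WHERE id NOT IN (SELECT node_id FROM node_roots) ORDER BY id")
NOT_EXISTS = ("SELECT id FROM nodes n WHERE NOT EXISTS "
              "(SELECT 1 FROM node_roots r WHERE r.node_id = n.id) ORDER BY id")

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

before_not_in = ids(NOT_IN)          # nodes 2 and 3 are unused
conn.execute("INSERT INTO node_roots VALUES (NULL)")
after_not_in = ids(NOT_IN)           # a single NULL empties the NOT IN result
after_not_exists = ids(NOT_EXISTS)   # NOT EXISTS ignores the NULL row

print(before_not_in, after_not_in, after_not_exists)
```

If node_roots.node_id were declared NOT NULL, the two forms would be logically equivalent; whether the optimizer would then choose a different plan is not something this sketch can show.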
SQL for Root Node Selector: Method 1 - Select from Unused Nodes (Minimum Id)
SELECT Min(id) INTO l_root_id
FROM nodes WHERE id NOT IN (SELECT node_id FROM node_roots)
 The first ordering query takes a Min(id) from nodes not in the solution table
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.50 | 3178 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.50 | 3178 | | | |
|* 2 | HASH JOIN RIGHT ANTI NA| | 1 | 28678 | 0 |00:00:00.50 | 3178 | 37M| 6400K| 30M (0)|
| 3 | TABLE ACCESS FULL | NODE_ROOTS | 1 | 715K| 744K|00:00:00.06 | 1717 | | | |
| 4 | INDEX FAST FULL SCAN | SYS_C0018310 | 1 | 744K| 744K|00:00:00.08 | 1461 | | | |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"="NODE_ID")
Execution Plan
Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
             2,046      275      92                     208       8      2,233
Elapsed Times
 The first ordering method took 2,046 seconds
 This is nearly 7 times slower than the base, unordered method
SQL for Root Node Selector: Method 2 - Select from Unused Nodes (Ordered by Id, ROWNUM = 1)
SELECT id INTO l_root_id
  FROM (SELECT id
          FROM nodes
         WHERE id NOT IN (SELECT node_id FROM node_roots)
         ORDER BY 1)
 WHERE ROWNUM = 1
 The second ordering query uses ROWNUM = 1 on an ordered subquery from nodes not in the solution table
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.42 | 26499 |
|* 1 | COUNT STOPKEY | | 1 | | 0 |00:00:00.42 | 26499 |
| 2 | VIEW | | 1 | 1 | 0 |00:00:00.42 | 26499 |
|* 3 | FILTER | | 1 | | 0 |00:00:00.42 | 26499 |
| 4 | NESTED LOOPS ANTI SNA| | 1 | 20 | 0 |00:00:00.41 | 24782 |
| 5 | INDEX FULL SCAN | SYS_C0018310 | 1 | 744K| 744K|00:00:00.10 | 1398 |
|* 6 | INDEX UNIQUE SCAN | NODE_ROOTS_N1 | 744K| 688K| 744K|00:00:00.26 | 23384 |
|* 7 | TABLE ACCESS FULL | NODE_ROOTS | 1 | 1 | 0 |00:00:00.01 | 1717 |
----------------------------------------------------------------------------------------------------
Execution Plan (Steps)
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM=1)
3 - filter( IS NULL)
6 - access("ID"="NODE_ID")
7 - filter("NODE_ID" IS NULL)
Execution Plan (Predicates)
Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
               289       39      60                     193      40        482
Elapsed Times
 The second ordering method took 289 seconds
 This is 7 times faster than the first ordering method and slightly faster than the base, unordered method
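Methods 1 and 2 are two ways of asking for the smallest unused node id, so the large timing difference comes from the plans rather than from the result. The sketch below checks that equivalence on toy data, using Python's sqlite3 module as a stand-in for Oracle (an assumption); `LIMIT 1` after `ORDER BY` is used as the SQLite analogue of the ordered subquery with ROWNUM = 1, and the node ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY);
    CREATE TABLE node_roots (node_id INTEGER NOT NULL);
    INSERT INTO nodes VALUES (10), (20), (30), (40);
    INSERT INTO node_roots VALUES (10), (30);
""")

UNUSED = "FROM nodes WHERE id NOT IN (SELECT node_id FROM node_roots)"
METHOD_1 = f"SELECT MIN(id) {UNUSED}"                   # aggregate over all unused nodes
METHOD_2 = f"SELECT id {UNUSED} ORDER BY id LIMIT 1"    # first row in id order

root_by_min = conn.execute(METHOD_1).fetchone()
root_by_first_row = conn.execute(METHOD_2).fetchone()

# Once every node has been assigned, the two forms differ in shape:
# the aggregate still returns one row (holding NULL), the other returns no row.
conn.execute("INSERT INTO node_roots VALUES (20), (40)")
none_left_min = conn.execute(METHOD_1).fetchone()
none_left_first_row = conn.execute(METHOD_2).fetchone()

print(root_by_min, root_by_first_row, none_left_min, none_left_first_row)
```

In PL/SQL the same difference matters for loop termination: a SELECT ... INTO that finds no row raises NO_DATA_FOUND, whereas the Min(id) form returns NULL.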
SQL for Root Node Selector: Method 3 - Fetch from Cursor (Ordered by Id), Check Unused
CURSOR c_roots IS
  SELECT id
    FROM nodes
   ORDER BY 1;
OPEN c_roots;
FETCH c_roots INTO l_root_id;
 The ordering query is opened once as a cursor, and fetched for each new subnetwork
 An existence check is made against the node_roots table; if the node is present, we skip to the next fetch
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 3 |
| 1 | INDEX FULL SCAN | SYS_C0018310 | 1 | 744K| 1 |00:00:00.01 | 3 |
-------------------------------------------------------------------------------------------
Cursor Execution Plan
SELECT 1 INTO l_dummy
  FROM node_roots
 WHERE node_id = l_root_id
Existence Check SQL
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 3 |
|* 1 | INDEX UNIQUE SCAN| NODE_ROOTS_N1 | 1 | 1 | 1 |00:00:00.01 | 3 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("NODE_ID"=:B1)
Existence Check Execution Plan
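The cursor fetch and existence check described above can be sketched as follows. This is a minimal illustration in Python using the standard sqlite3 module as a stand-in for Oracle and PL/SQL, which is an assumption made so that the example runs on its own. The toy link data, the `subnetwork` helper (a simple breadth-first search standing in for the deck's PL/SQL grouping step) and the `group_subnetworks` function are hypothetical names, not the author's code.

```python
import sqlite3
from collections import deque

# Toy network (hypothetical data): node 6 has no links.
LINKS = [(1, 2), (2, 3), (4, 5)]
NODE_IDS = [1, 2, 3, 4, 5, 6]

def subnetwork(root, links):
    """Return all nodes reachable from root (breadth-first search)."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {root}, deque([root])
    while queue:
        for nxt in neighbours.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def group_subnetworks(node_ids, links):
    """Pick roots in id order with one ordered cursor plus an existence check."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE node_roots (node_id INTEGER PRIMARY KEY, root_id INTEGER)")
    conn.executemany("INSERT INTO nodes VALUES (?)", [(i,) for i in node_ids])

    roots, checks = [], 0
    c_roots = conn.execute("SELECT id FROM nodes ORDER BY 1")  # opened once
    for (candidate,) in c_roots:                               # one fetch per node
        checks += 1
        found = conn.execute(
            "SELECT 1 FROM node_roots WHERE node_id = ?", (candidate,)
        ).fetchone()
        if found:
            continue  # already in a subnetwork: skip to the next fetch
        roots.append(candidate)
        conn.executemany(
            "INSERT INTO node_roots VALUES (?, ?)",
            [(n, candidate) for n in sorted(subnetwork(candidate, links))],
        )
    conn.close()
    return roots, checks

roots, checks = group_subnetworks(NODE_IDS, LINKS)
print(roots, checks)
```

On the toy data this reports roots 1, 4 and 6 after six existence checks: every node is fetched exactly once, and the scan over nodes never restarts from the beginning.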
Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
                67       11      27                     185      73        252
Elapsed Times
 We get the root selection time by adding up multiple code timing lines for the cursor and check SQL
 This third ordering method took 67 seconds
 This is more than 4 times faster than the next best method
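The comparisons quoted across the four methods follow directly from the root selection times in the Elapsed Times tables. A short sketch of the arithmetic, with the figures copied from the slides and the dictionary keys (`m0` to `m3`) chosen here purely for illustration:

```python
# Root selection and total elapsed times in seconds, from the Elapsed Times tables
root_selection_s = {"m0": 303, "m1": 2046, "m2": 289, "m3": 67}
total_s = {"m0": 524, "m1": 2233, "m2": 482, "m3": 252}

def ratio(slower, faster):
    """How many times longer root selection takes in one method than another."""
    return round(root_selection_s[slower] / root_selection_s[faster], 2)

def share(method):
    """Root selection as a whole-number percentage of total elapsed time."""
    return round(100 * root_selection_s[method] / total_s[method])

print("Method 1 vs Method 0:", ratio("m1", "m0"))  # "nearly 7 times slower"
print("Method 1 vs Method 2:", ratio("m1", "m2"))  # "7 times faster"
print("Method 2 vs Method 3:", ratio("m2", "m3"))  # "> 4 times faster"
print("Root selection share:", {m: share(m) for m in root_selection_s})
```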
Tuning Results (2 slides)
Code timing results for one dataset, and before and after results for Subnetwork Grouper for all datasets
Code Timing - Ins_Node_Roots - Results on Bacon/only_tv_v after Tuning
Code Timing Output
Timer Set: Ins_Node_Roots, Constructed at 30 Jul 2022 17:38:25, written at 17:42:37
===================================================================================
Timer                                                 Elapsed         CPU       Calls       Ela/Call       CPU/Call
-------------------------------------------------  ----------  ----------  ----------  -------------  -------------
Insert isolated nodes 3: 8659                            1.24        1.22           1        1.23700        1.22000
Insert isolated links 5: 7078                            5.59        5.30           1        5.59100        5.30000
OPEN c_roots                                             0.19        0.20           1        0.19000        0.20000
Count nodes                                              0.01        0.00           1        0.01400        0.00000
FETCH c_roots (first)                                    0.00        0.00           1        0.00000        0.00000
SELECT 1 INTO l_dummy: Not found                         0.70        0.84        7443        0.00009        0.00011
Insert min_tree_links (root node 1, size: 680060)      142.60      137.63           1      142.59600      137.63000
Insert node_roots (root node 1, size: 680060)            3.68        3.64           1        3.68100        3.64000
FETCH c_roots (remaining)                               28.60       26.67      664224        0.00004        0.00004
SELECT 1 INTO l_dummy: Found                            37.05       37.41      656782        0.00006        0.00006
Insert min_tree_links (3 nodes)                          7.44        6.82        2091        0.00356        0.00326
Insert node_roots (3 nodes)                              0.53        0.37        2091        0.00025        0.00018
Insert min_tree_links (4-39 nodes)                      21.90       20.15        5317        0.00412        0.00379
Insert node_roots (4-39 nodes)                           1.67        1.39        5317        0.00031        0.00026
Insert min_tree_links (root node 332, size: 52)          0.01        0.00           1        0.00900        0.00000
Insert node_roots (root node 332, size: 52)              0.00        0.00           1        0.00100        0.00000
...
(Other)                                                  0.00        0.00           1        0.00100        0.00000
-------------------------------------------------  ----------  ----------  ----------  -------------  -------------
Total                                                  251.67      241.98     1343341        0.00019        0.00018
-------------------------------------------------  ----------  ----------  ----------  -------------  -------------
[Timer timed (per call in ms): Elapsed: 0.00935, CPU: 0.00935]
 The total time has come down from 1,714 seconds to 252 seconds, a reduction by a factor of almost 7
 The largest contribution is now from the timer Insert min_tree_links (root node 1, size: 680060)
 The results show the additional pre-insert steps, taking 1 and 6 seconds
 We also see the new cursor fetch step, and existence query, taking 29 and 37 seconds
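The root selection time for the cursor method can be rebuilt from the code timing output by adding the cursor and existence check lines. A small sketch of that sum, with elapsed times and call counts copied from the output above; the choice of which timer lines count as root selection is an assumption based on their names:

```python
# (elapsed seconds, calls) for the timer lines assumed to make up root selection
timer_lines = {
    "OPEN c_roots": (0.19, 1),
    "FETCH c_roots (first)": (0.00, 1),
    "SELECT 1 INTO l_dummy: Not found": (0.70, 7443),
    "FETCH c_roots (remaining)": (28.60, 664224),
    "SELECT 1 INTO l_dummy: Found": (37.05, 656782),
}
TOTAL_ELAPSED = 251.67  # "Total" line of the timer set

root_selection = round(sum(ela for ela, _ in timer_lines.values()), 2)
non_root = round(TOTAL_ELAPSED - root_selection, 2)

# Every fetched node id gets exactly one existence check, so the counts agree.
fetch_calls = sum(c for name, (_, c) in timer_lines.items() if name.startswith("FETCH"))
check_calls = sum(c for name, (_, c) in timer_lines.items() if name.startswith("SELECT"))

print(root_selection, non_root, fetch_calls, check_calls)
```

The sum rounds to the 67 seconds of root selection and 185 seconds of non-root work shown in the Method 3 elapsed times.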
Ins_Node_Roots - Performance - Results before/after Tuning
Dataset             #Nodes       #Links  #Subnetworks  #Maxlev  Base Ela(s)  Tuned Ela(s)
three_subnets           14           13             3        3         0.07           0.5
foreign_keys           289          319            43        5          0.2           0.6
brightkite          58,228      214,078           547       10            7             7
bacon/small            161        3,342             1        5          0.1           0.5
bacon/top250        12,466      583,993            15        6          1.9           4.2
bacon/pre1950      134,131    8,095,294         2,432       13           85            61
bacon/only_tv_v    744,374   22,503,060        12,198       11        1,714           252
bacon/no_tv_v    2,386,567   87,866,033        55,276       10       16,108         2,081
bacon/post1950   2,696,175  101,597,227        60,544       10       19,736         2,930
bacon/full       2,800,309  109,262,592        62,557       10       20,631         3,756
 The tuned procedure is between 5.5 and 7.7 times faster on the four largest datasets
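The speed-up range can be checked from the Base and Tuned columns. A short sketch, with the elapsed times copied from the table and the variable names chosen here for illustration:

```python
# (base elapsed s, tuned elapsed s) per dataset, in table order
elapsed_s = {
    "three_subnets": (0.07, 0.5),
    "foreign_keys": (0.2, 0.6),
    "brightkite": (7, 7),
    "bacon/small": (0.1, 0.5),
    "bacon/top250": (1.9, 4.2),
    "bacon/pre1950": (85, 61),
    "bacon/only_tv_v": (1714, 252),
    "bacon/no_tv_v": (16108, 2081),
    "bacon/post1950": (19736, 2930),
    "bacon/full": (20631, 3756),
}

speedup = {name: base / tuned for name, (base, tuned) in elapsed_s.items()}

largest_four = list(elapsed_s)[-4:]
largest_four_speedups = [round(speedup[name], 1) for name in largest_four]
slower_after_tuning = [name for name, s in speedup.items() if s < 1]

print(dict(zip(largest_four, largest_four_speedups)))
print("Slower after tuning:", slower_after_tuning)
```

The same ratios also show that the tuned procedure is slower on the four smallest-elapsed datasets, where the base run already takes under two seconds.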
Conclusion
SQL
 Be aware of the built-in SQL algorithms at different levels
 Understand the use of subquery sequencing in logical query design
 Understand how queries can be transformed, and how performance may be affected
   By the CBO, and by manual rewriting
   Including logical or physical splitting of complex queries
 Understand the use of hints to affect the choice of algorithms the CBO makes
 Use execution plans to analyse SQL performance
PL/SQL
 Use PL/SQL algorithms when there isn’t an appropriate SQL built-in equivalent
 But use SQL as fully as possible within these algorithms, in particular to process data in sets
 Be familiar with the Oracle standard profilers, and the possibilities offered by custom code timing
For more detail
 See my blog and GitHub project…
References
1. Algorithm, Computer Hope, March 2021
2. Declarative Language, Britannia.com, Undated
3. SQL Tuning Guide, 21c
4. Shortest Path Analysis of Large Networks by SQL and PL/SQL: Blog, Brendan Furey, August 2022
5. SQL and PL/SQL for Shortest Path Problems: GitHub, Brendan Furey, August 2022
6. Timer_Set - Oracle PL/SQL code timing module: GitHub, Brendan Furey, January 2019
7. Friendship network of Brightkite users, Jure Leskovec, Stanford University, Undated
8. Bacon Numbers Datasets, Oberlin College, December 2016
9. SQL for Shortest Path Problems, Brendan Furey, April 2015
10. SQL for Shortest Path Problems 2: A Branch and Bound Approach, Brendan Furey, May 2015
11. PL/SQL Pipelined Function for Network Analysis, Brendan Furey, May 2015
12. PL/SQL Profiling 1: Overview, Brendan Furey, June 2020

Analysing Performance of Algorithmic SQL and PLSQL.pptx

  • 1. Analysing Performance of Algorithmic SQL and PL/SQL Brendan Furey, September 2022 A Programmer Writes… (Brendan's Blog) Ireland Oracle User Group, September 5-6, 2022 Brendan Furey, 2022 Analysing Performance of Algorithmic SQL and PL/SQL 1
  • 2. whoami Freelance Oracle developer and blogger Keen interest in programming concepts Started career as a Fortran programmer at British Gas Dublin-based Europhile 30 years Oracle experience, currently working in Finance Brendan Furey, 2022 Analysing Performance of Algorithmic SQL and PL/SQL 2
  • 3. Agenda  Algorithms and SQL (9 slides)  On algorithms at different levels in SQL and PL/SQL  Network Analysis Problems (4 slides)  On shortest path and subnetwork grouping problems  Network Paths by SQL (7 slides)  Solving all- and shortest- path problems via pure SQL  Two Algorithms with Code Timing (7 slides)  Two PL/SQL network analysis algorithms with code timing and performance analysis  Oracle Standard Profilers (2 slides)  Results from two standard Oracle profiling tools for the Subnetwork Grouper procedure  Tuning 1 - SQL for Isolated Nodes (5 slides)  Recap of join methods and types, then queries with antijoin structures and hints  Tuning 2 - SQL for Isolated Links (8 slides)  Disastrous ‘Bitmap Or’ expansion, good & bad antijoin plans and efficient group counting query  Tuning 3 - SQL for Root Node Selector (4 slides)  Code timing several methods for root node selection  Tuning – Results (2 slides)  Code timing results for one dataset and before and after results for Subnetwork Grouper for all  Conclusion (1 slide)  A few recommendations split between SQL and PL/SQL Brendan Furey, 2022 3 Analysing Performance of Algorithmic SQL and PL/SQL
  • 4. Algorithms and SQL Brendan Furey, 2022 4 Algorithms and SQL (9 slides) On algorithms at different levels in SQL and PL/SQL Analysing Performance of Algorithmic SQL and PL/SQL
  • 5. The Algorithm (extracts from Computer Hope web page) Brendan Furey, 2022 5 Analysing Performance of Algorithmic SQL and PL/SQL Algorithm - Computer Hope  Derived from the name of the mathematician Muhammed ibn-Musa Al-Khowarizmi, an algorithm is a solution to a problem that meets the following criteria.  A list of instructions, procedures, or formula that solves a problem  Can be proven  Something that always finishes and works When was the first algorithm?  Because a cooking recipe could be considered an algorithm, the first algorithm could go back as far as written language  However, many find Euclid's algorithm for finding the greatest common divisor to be the first algorithm. This algorithm was first described in 300 B.C.  Ada Lovelace is credited as being the first computer programmer and the first person to develop an algorithm for a machine
  • 6. Algorithms and SQL 1 - Built-In Algorithms and Subquery Sequence Brendan Furey, 2022 6 Analysing Performance of Algorithmic SQL and PL/SQL Declarative Language (paraphrase from Britannia.com)  Declarative languages are programming languages in which a program specifies what is to be done rather than how to do it SQL as a declarative language?  SQL is often described as a declarative (or non-procedural) language  But it’s a bit more complicated than that, especially when performance is important… Built-In Algorithms  Oracle provides built-in algorithms for joining tables and other rowsets, and grouping  Oracle provides additional specific built-in algorithms for processing an input rowset …  Analytics allows aggregation over partition key within rowset windows  Match Recognize allows patterns to be reported across rows  These algorithms are configured declaratively within an SQL subquery  Also, we have more general algorithms  Recursive subquery factors allow for recursive algorithms  Model clause allows for iteration over cells within a spreadsheet-like array Subquery Sequence  Build queries in a sequence of subquery steps
  • 7. Algorithms and SQL 2 - Joins and Grouping Brendan Furey, 2022 7 Analysing Performance of Algorithmic SQL and PL/SQL SELECT d.department_name, Avg(e.salary) avg_sal FROM departments d JOIN employees e ON e.department_id = d.department_id GROUP BY d.department_name ORDER BY d.department_name Simple Query with Joins and Grouping  A simple query joins data sources, and may group by a key…  with aggregate functions on non-key columns  Oracle CBO has multiple algorithms for joining and for aggregation  Hash Join – using full table scans for larger data sets  Nested Loops – using indexes for smaller data sets  CBO chooses algorithm based on table statistics  We can override with hints:  USE_HASH(e)  USE_NL(e) DEPARTMENT_NAME AVG_SAL ---------------- ------- Accounting 10,154 Administration 4,400 Executive 19,333 Finance 8,601 Human Resources 6,500 IT 5,760 Marketing 9,500 Public Relations 10,000 Purchasing 4,150 Sales 8,956 Shipping 3,476 Example: Average salary grouped by department
  • 8. Algorithms and SQL 3 - Analytics Brendan Furey, 2022 8 Analysing Performance of Algorithmic SQL and PL/SQL  Analytics allows aggregation over partition key within rowset windows WITH rowset AS ( SELECT d.department_name, e.hire_date, e.last_name, e.salary FROM departments d JOIN employees e ON e.department_id = d.department_id ) SELECT department_name, hire_date, last_name, salary, Sum(salary) OVER (PARTITION BY department_name ORDER BY hire_date) rsum_sal, salary - Lag(salary) OVER (PARTITION BY department_name ORDER BY hire_date) sal_incr FROM rowset ORDER BY department_name, hire_date DEPARTMENT_NAME HIRE_DATE LAST_NAME SALARY RSUM_SAL SAL_INCR ---------------- --------- ------------ ------- -------- -------- Accounting 07-JUN-02 Gietz 8,300 20,308 Accounting 07-JUN-02 Higgins 12,008 20,308 3,708 Administration 17-SEP-03 Whalen 4,400 4,400 . Sales 04-JAN-08 Johnson 6,200 261,300 -800 Sales 24-JAN-08 Marvins 7,200 268,500 1,000 Sales 29-JAN-08 Zlotkey 10,500 279,000 3,300 . Example: Running sum of salaries and salary increase by department  Can have multiple independent expressions  Aggregate functions on fields (or expressions), apply over the partition  Row set is unaltered, and does not have to be a separate subquery  Range specifies a window based on the Order By expression  Often range is defaulted, in example is Unbounded Preceding
  • 9. Algorithms and SQL 4 - Pattern Matching
    - Match Recognize allows patterns to be reported across rows
    - Partition By allows for independent patterns across keys
    - Order By defines the row sequence
    - Measures specifies fields (or expressions) to output
    - Specify behaviour in relation to matches (One Row Per Match, After Match Skip)
    - Pattern expresses sequences of values across rows
      - Using a regex-like syntax
      - Referencing variables from the Define section
      - In the example, up{2} means 2 adjacent instances of a salary increase
    Example: Two consecutive salary increases
        WITH rowset AS (
            SELECT dep.department_name, emp.hire_date, emp.last_name, emp.salary
              FROM departments dep
              JOIN employees emp ON emp.department_id = dep.department_id)
        SELECT *
          FROM rowset
         MATCH_RECOGNIZE (
           PARTITION BY department_name
           ORDER BY hire_date
           MEASURES last_name AS last_name,
                    salary AS salary
           ONE ROW PER MATCH
           AFTER MATCH SKIP TO NEXT ROW
           PATTERN ( up{2} )
           DEFINE up AS up.salary > PREV(up.salary))

        DEPARTMENT_NAME   LAST_NAME     SALARY
        ----------------  ------------  ------
        Sales             Bloom         10,000
        Sales             Zlotkey       10,500
        Shipping          OConnell       2,600
        Shipping          Mourgos        5,800
        Shipping          Grant          2,600
        Shipping          Geoni          2,800
  • 10. Algorithms and SQL 5 - Recursive Subqueries
    - A recursive subquery has an anchor branch in union with a recursive branch that reads from the subquery itself
    - Partitioning is via the join condition on department_name
    - Performs well for hierarchies, less well for looped structures (as we'll see later)
    Example: Running Products
        WITH multipliers AS (
            SELECT d.department_name, e.last_name, (1 + e.salary/10000) mult,
                   Row_Number() OVER (PARTITION BY d.department_name ORDER BY e.last_name) rn
              FROM departments d
              JOIN employees e ON e.department_id = d.department_id
        ), rsf (department_name, last_name, rn, mult, running_prod) AS (
            SELECT department_name, last_name, rn, mult, mult running_prod
              FROM multipliers
             WHERE rn = 1
             UNION ALL
            SELECT m.department_name, m.last_name, m.rn, m.mult, r.running_prod * m.mult
              FROM rsf r
              JOIN multipliers m
                ON m.rn = r.rn + 1
               AND m.department_name = r.department_name)
        SELECT department_name, last_name, mult, running_prod
          FROM rsf
         ORDER BY department_name, last_name

        DEPARTMENT_NAME  LAST_NAME    MULT    R_PROD
        ---------------  ---------  ------  --------
        Accounting       Gietz        1.83      1.83
        Accounting       Higgins    2.2008  4.027464
        Administration   Whalen       1.44      1.44
        Executive        De Haan       2.7       2.7
        Executive        King          3.4      9.18
        Executive        Kochhar       2.7    24.786
        .
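The anchor and recursive branches above can be mimicked outside the database to see how each iteration joins row rn + 1 to the rows produced by the previous iteration. This is a rough Python sketch under stated assumptions: the function name `running_products` and the tuple layout are hypothetical, and the multipliers are copied from the slide output for the Executive department.

```python
def running_products(rows):
    """rows: list of (department, last_name, mult, rn), with rn numbered from 1
    within each department, as Row_Number() does in the multipliers subquery."""
    by_key = {(dept, rn): (name, mult) for dept, name, mult, rn in rows}
    # Anchor branch: the first row of every partition (WHERE rn = 1)
    current = [(dept, name, rn, mult, mult) for dept, name, mult, rn in rows if rn == 1]
    result = []
    while current:  # one pass of the loop corresponds to one recursion level
        result.extend(current)
        nxt = []
        for dept, _, rn, _, prod in current:
            hit = by_key.get((dept, rn + 1))  # join on rn + 1 and same department
            if hit is not None:
                name, mult = hit
                nxt.append((dept, name, rn + 1, mult, prod * mult))
        current = nxt
    return sorted(result)  # ORDER BY department_name, last_name

executive = [
    ("Executive", "De Haan", 2.7, 1),
    ("Executive", "King", 3.4, 2),
    ("Executive", "Kochhar", 2.7, 3),
]
products = running_products(executive)
for dept, name, rn, mult, prod in products:
    print(dept, name, mult, round(prod, 6))
```

The recursion stops as soon as an iteration produces no rows, which is the same termination rule the recursive subquery follows.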
  • 11. Algorithms and SQL 6 - Model Clause
    - The Model clause reads records from a rowset, then allows rules to reference the rows and columns as array cells
    - Partition By allows for independent patterns across keys
    - Dimension By defines the indexing over rows, and can use analytic functions
    - Measures specifies fields (or expressions) to output
    - Rules may update or insert rows, and optionally iterate
    - Order By defines the output order
    - The Model clause does not have the best reputation for performance
    - Rarely seen in the wild…
    Example: Running Products
        WITH multipliers AS (
            SELECT d.department_name, e.last_name, (1 + e.salary/10000) mult
              FROM departments d
              JOIN employees e ON e.department_id = d.department_id
        )
        SELECT department_name, last_name, mult, running_prod
          FROM multipliers
         MODEL
            PARTITION BY (department_name)
            DIMENSION BY (Row_Number() OVER (PARTITION BY department_name ORDER BY last_name) rn)
            MEASURES (last_name, mult, mult running_prod)
            RULES (running_prod[rn > 1] = mult[CV()] * running_prod[CV() - 1])
         ORDER BY department_name, last_name
  • 12. Algorithms and SQL 7 - Subquery Sequence
    - Subqueries can reference not only tables and views, but…
      - Previous subqueries
      - Database functions, returning scalar or tabular outputs
    - This allows us to build queries in a sequence of subquery steps
    - This can be seen as a higher level algorithm in itself…
      - specifying procedurally rather than declaratively at a higher level: the how, not just the what
      - But the CBO can override and rewrite the structure
    Subqueries and Performance
    - The CBO's query transformation can improve performance or worsen it
    - Hints can often improve performance here, such as
      - Materialize: evaluate the subquery and save the resulting rowset
      - No_Query_Transformation: don't transform the query
    - It sometimes helps to split a complex query that the CBO is transforming badly, e.g.
      - Insert subquery output into a temporary table
      - Put the subquery into a pipelined function
    - We can also transform manually, e.g. change Not Exists into explicit antijoins, as we'll see later
  • 13. Algorithms and SQL 8 - General Principles
    - Process in batches, or sets, where possible
      - A process often has a startup cost plus a cost per row, so spread the startup cost
      - Also, different algorithms may be more efficient for processing a set of rows than for a single row
      - Avoid cursor loops when the rowset can be processed in a single query
    - Prune early: avoid continued processing of rows that will later be eliminated, if possible
    SQL Algorithms
    - Use SQL algorithms that meet a specific requirement, within pure SQL
      - Join and group rowsets
      - Analytic functions for aggregation over a partition key within a rowset window
      - Match Recognize for pattern matching across rows
      - Recursive subqueries for traversing hierarchies
    PL/SQL Algorithms
    - Use where there is no efficient pure SQL algorithm, as in some network analysis problems
    - But ensure SQL is used effectively within the PL/SQL algorithm
    - Can also be used to break a complex query into smaller sections via a pipelined function or temporary table
      - Only do this when the CBO performs badly
  • 14. Network Analysis Problems (4 slides)
    On shortest path and subnetwork grouping problems
  • 15. 3 Subnetworks - Demo Network
    Network Analysis Problems
    - Undirected network
    - Find all paths from root
    - Find shortest paths from root
    - Group all nodes by subnetwork
  • 16. All Paths from S1-N0-1
        Node       Path          Length
        ---------  ------------  ------
        S1-N0-1    1                  0
        S1-N1-1    ..2                1
        S1-N2-1    ....7              2
        S1-N3-1    ......10           3
        S1-N1-2    ..3                1
        S1-N1-3    ....4              2
        S1-N2-2    ....8              2
        S1-N1-3    ..4                1
        S1-N1-2    ....3              2
        S1-N2-2    ......8            3
        S1-N1-4    ..5                1
        S1-N2-3    ....9              2
        S1-N1-5    ......6            3
        S1-N3-2    ......11           3
        S1-N1-5    ..6                1
        S1-N2-3    ....9              2
        S1-N1-4    ......5            3
        S1-N3-2    ......11           3
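The listing above can be reproduced with a small depth-first search that never revisits a node already on the current path. This is a minimal Python sketch rather than the talk's SQL. The link list is an assumption: the demo network diagram is not part of this transcript, so the 12 links are reconstructed from the path listing itself, and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed links for the subnetwork rooted at node 1 (reconstructed from the listing)
LINKS = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 7), (7, 10),
         (3, 4), (3, 8), (5, 9), (6, 9), (9, 11)]

def build_adjacency(links):
    """Undirected network: each link is usable in both directions."""
    adj = defaultdict(set)
    for fr, to in links:
        adj[fr].add(to)
        adj[to].add(fr)
    return adj

def all_paths(adj, root):
    """Yield (node, level) for every simple path from root, depth first,
    visiting neighbours in ascending id order."""
    def walk(node, lev, on_path):
        yield node, lev
        for nxt in sorted(adj[node]):
            if nxt not in on_path:  # skip nodes that would close a cycle
                yield from walk(nxt, lev + 1, on_path | {nxt})
    yield from walk(root, 0, {root})

paths = list(all_paths(build_adjacency(LINKS), 1))
for node, lev in paths:
    print("." * (2 * lev) + str(node), lev)
```

Because every simple path is a separate output row, the row count grows very quickly once a network contains many loops, which is the behaviour the later slides measure.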
  • 17. Shortest Paths from S1-N0-1
        NODE_NAME    NODE        LEV
        -----------  ----------  ----
        S1-N0-1      1              0
        S1-N1-1      ..2            1
        S1-N2-1      ....7          2
        S1-N3-1      ......10       3
        S1-N1-2      ..3            1
        S1-N2-2      ....8          2
        S1-N1-3      ..4            1
        S1-N1-4      ..5            1
        S1-N2-3      ....9          2
        S1-N3-2      ......11       3
        S1-N1-5      ..6            1
    - Shortest path networks form trees
  • 18. Subnetwork Grouper
  • 19. Network Paths by SQL (7 slides)
    Solving all- and shortest-path problems via pure SQL
  • 20. SQL for All Paths
    SQL
        WITH paths (node_id, lev) AS (
            SELECT &root_id_var, 0
              FROM DUAL
             UNION ALL
            SELECT CASE WHEN lnk.node_id_fr = pth.node_id THEN lnk.node_id_to ELSE lnk.node_id_fr END,
                   pth.lev + 1
              FROM paths pth
              JOIN links lnk
                ON (lnk.node_id_fr = pth.node_id OR lnk.node_id_to = pth.node_id)
        ) SEARCH DEPTH FIRST BY node_id SET line_no
          CYCLE node_id SET cycle TO '*' DEFAULT ' '
        SELECT /*+ gather_plan_statistics XPLAN_ALL_PATHS */
               n.node_name,
               Substr(LPad ('.', 1 + 2 * p.lev, '.') || p.node_id, 2) node,
               p.lev
          FROM paths p
          JOIN nodes n ON n.id = p.node_id
         WHERE cycle = ' '
         ORDER BY p.line_no
    Annotations
    - Recursive subquery
    - CYCLE clause on node_id
    - Hint gather_plan_statistics with marker string
    - Exclude cycle rows from output
    Get Execution Plan using Marker
        EXEC Utils.W(Utils.Get_XPlan(p_sql_marker => 'XPLAN_ALL_PATHS'));
    - For tree networks each node has only one path from the root, and the SQL is efficient
    - Also efficient for small looped networks
    - For larger looped networks, finding all paths is resource-intensive
    - This also holds for non-pure-SQL methods: the problem is intrinsically hard
  • 21. SQL for Shortest Paths - One Recursive Subquery
    SQL
        WITH paths (node_id, rnk, lev) AS (
            SELECT &root_id_var, 1, 0
              FROM DUAL
             UNION ALL
            SELECT CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to ELSE l.node_id_fr END,
                   Rank () OVER (PARTITION BY CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to ELSE l.node_id_fr END
                                 ORDER BY p.node_id),
                   p.lev + 1
              FROM paths p
              JOIN links l
                ON p.node_id IN (l.node_id_fr, l.node_id_to)
             WHERE p.rnk = 1
        ) SEARCH DEPTH FIRST BY node_id SET line_no
          CYCLE node_id SET lp TO '*' DEFAULT ' '
        , node_min_levs AS (
            SELECT node_id,
                   Min (lev) KEEP (DENSE_RANK FIRST ORDER BY lev) lev,
                   Min (line_no) KEEP (DENSE_RANK FIRST ORDER BY lev) line_no
              FROM paths
             GROUP BY node_id
        )
        SELECT n.node_name,
               Substr(LPad ('.', 1 + 2 * m.lev, '.') || m.node_id, 2) node,
               m.lev lev
          FROM node_min_levs m
          JOIN nodes n ON n.id = m.node_id
         ORDER BY m.line_no
    Annotations
    - Extra field, rnk = rank of record for a given node at each iteration, based on the prior node id
    - At each iteration only the record of rank 1 is joined to new links, avoiding duplication
    - Subquery, node_min_levs, selects the preferred record of minimum length for each node
  • 22. SQL for Shortest Paths - One Recursive Subquery - Performance
    Execution Plan (Extract) - Bacon/small (161 node / 3,342 link network)
        ----------------------------------------------------------------------------------------------------------------------------
        | Id  | Operation                                   | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | OMem  | 1Mem  |
        ----------------------------------------------------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT                            |       |      1 |        |    161 |00:00:07.90 |   5032K|       |       |
        |   1 |  SORT ORDER BY                              |       |      1 |    381G|    161 |00:00:07.90 |   5032K| 18432 | 18432 |
        |*  2 |   HASH JOIN                                 |       |      1 |    381G|    161 |00:00:07.90 |   5032K|  1449K|  1449K|
        |   3 |    TABLE ACCESS FULL                        | NODES |      1 |    161 |    161 |00:00:00.01 |       7 |       |       |
        |   4 |    VIEW                                     |       |      1 |    381G|    161 |00:00:07.90 |   5032K|       |       |
        |   5 |     SORT GROUP BY                           |       |      1 |    381G|    161 |00:00:07.90 |   5032K| 31744 | 31744 |
        |   6 |      VIEW                                   |       |      1 |    381G|    220K|00:00:07.81 |   5032K|       |       |
        |   7 |       UNION ALL (RECURSIVE WITH) DEPTH FIRST|       |      1 |        |    220K|00:00:07.77 |   5032K|    19M|  1646K|
        |   8 |        FAST DUAL                            |       |      1 |      1 |      1 |00:00:00.01 |       0 |       |       |
        |   9 |        WINDOW SORT                          |       |     79 |    381G|    220K|00:00:00.72 |   57440 |   478K|   448K|
        |  10 |         NESTED LOOPS                        |       |     79 |    381G|    220K|00:00:00.52 |   57440 |       |       |
        |  11 |          RECURSIVE WITH PUMP                |       |     79 |        |   3590 |00:00:00.01 |       0 |       |       |
        |* 12 |          TABLE ACCESS FULL                  | LINKS |   3590 |     45 |    220K|00:00:00.70 |   57440 |       |       |
        ----------------------------------------------------------------------------------------------------------------------------
        Predicate Information (identified by operation id):
        ---------------------------------------------------
           2 - access("N"."ID"="M"."NODE_ID")
          12 - filter(("P"."NODE_ID"="L"."NODE_ID_FR" OR "P"."NODE_ID"="L"."NODE_ID_TO"))
    - The SQL solution can obtain the shortest paths efficiently for tree and smaller looped networks
    - In larger looped networks the number of paths overall can become extremely large
    - The recursive subquery discards all but one path to a given node at a given iteration…
      - But it has no access to other paths reached at earlier iterations
      - And so it may persist with longer paths that will be discarded in the later ranking subquery
    - One approach to mitigating this is to do a truncated search to obtain some bounds for the later query…
  • 23. SQL for Shortest Paths - Two Recursive Subqueries, Part 1
        WITH paths_truncated (node_id, lev, rn) AS (
            SELECT &root_id_var, 0, 1
              FROM DUAL
             UNION ALL
            SELECT CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to ELSE l.node_id_fr END,
                   p.lev + 1,
                   Row_Number () OVER (PARTITION BY CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to ELSE l.node_id_fr END
                                       ORDER BY p.node_id)
              FROM paths_truncated p
              JOIN links l
                ON p.node_id IN (l.node_id_fr, l.node_id_to)
             WHERE p.rn = 1
               AND p.lev < &LEVMAX)
          CYCLE node_id SET lp TO '*' DEFAULT ' '
        , approx_best_paths AS (
            SELECT node_id,
                   Max (lev) KEEP (DENSE_RANK FIRST ORDER BY lev) lev
              FROM paths_truncated
             GROUP BY node_id)
    paths_truncated (recursive subquery)
    - Same subquery as in 1-recursion, except…
    - Truncate recursion at iteration &LEVMAX (I tried 5 and 10)
    approx_best_paths
    - Gets minimum lev by node_id from paths_truncated…
    - Any paths to node_id longer in the second recursion than found here can be discarded
  • 24. SQL for Shortest Paths - Two Recursive Subqueries, Part 2
        ), paths (node_id, lev, rn) AS (
            SELECT &root_id_var, 0, 1
              FROM DUAL
             UNION ALL
            SELECT CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to ELSE l.node_id_fr END,
                   p.lev + 1,
                   Row_Number () OVER (PARTITION BY CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to ELSE l.node_id_fr END
                                       ORDER BY p.node_id)
              FROM paths p
              JOIN links l
                ON p.node_id IN (l.node_id_fr, l.node_id_to)
              LEFT JOIN approx_best_paths b
                ON b.node_id = CASE WHEN l.node_id_fr = p.node_id THEN l.node_id_to ELSE l.node_id_fr END
             WHERE p.rn = 1
               AND p.lev < Nvl (b.lev, 1000000)
        ) SEARCH DEPTH FIRST BY node_id SET line_no
          CYCLE node_id SET lp TO '*' DEFAULT ' '
        , node_min_levs AS (
            SELECT node_id,
                   Min (lev) KEEP (DENSE_RANK FIRST ORDER BY lev) lev,
                   Min (line_no) KEEP (DENSE_RANK FIRST ORDER BY lev) line_no
              FROM paths
             GROUP BY node_id)
        SELECT n.node_name,
               Substr(LPad ('.', 1 + 2 * m.lev, '.') || m.node_id, 2) node,
               m.lev lev
          FROM node_min_levs m
          JOIN nodes n ON n.id = m.node_id
         ORDER BY m.line_no
    paths (recursive subquery)
    - Same subquery as in 1-recursion, except…
    - Outer-join approx_best_paths…
    - Discard path if longer than found in previous subquery
    node_min_levs, main section
    - node_min_levs gets the minimum lev by node_id
    - Along with the line_no…
    - To order by in main section
  • 25. SQL for Shortest Paths - Two Recursive Subqueries - Performance
    Execution Plan (Extract) - Bacon/top250 (11,803 node subnetwork - 12,466 node / 583,993 link total)
        -----------------------------------------------------------------------------------------------------------------------
        | Id  | Operation                                           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
        -----------------------------------------------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT                                    |       |      1 |        |  11803 |00:12:55.42 |    130M|
        |   1 |  SORT ORDER BY                                      |       |      1 |     18E|  11803 |00:12:55.42 |    130M|
        |*  2 |   HASH JOIN                                         |       |      1 |     18E|  11803 |00:12:55.41 |    130M|
        |   3 |    TABLE ACCESS FULL                                | NODES |      1 |  12466 |  12466 |00:00:00.01 |      46 |
        |   4 |    VIEW                                             |       |      1 |     18E|  11803 |00:12:55.40 |    130M|
        |   5 |     SORT GROUP BY                                   |       |      1 |     18E|  11803 |00:12:55.40 |    130M|
        |   6 |      VIEW                                           |       |      1 |     18E|    672K|00:12:54.66 |    130M|
        |   7 |       UNION ALL (RECURSIVE WITH) DEPTH FIRST        |       |      1 |        |    672K|00:12:54.53 |    130M|
        |   8 |        FAST DUAL                                    |       |      1 |      1 |      1 |00:00:00.01 |       0 |
        |   9 |        WINDOW SORT                                  |       |    128 |     18E|    672K|00:11:57.72 |     65M|
        |* 10 |         FILTER                                      |       |    128 |        |    672K|00:11:57.25 |     65M|
        |  11 |          MERGE JOIN OUTER                           |       |    128 |     18E|   1821K|00:11:57.02 |     65M|
        |  12 |           SORT JOIN                                 |       |    128 |     18E|   1821K|00:06:38.56 |     24M|
        |  13 |            NESTED LOOPS                             |       |    128 |     18E|   1821K|00:06:39.34 |     24M|
        |  14 |             RECURSIVE WITH PUMP                     |       |    128 |        |  21483 |00:00:00.11 |       1 |
        |* 15 |             TABLE ACCESS FULL                       | LINKS |  21483 |     95 |   1821K|00:07:40.14 |     24M|
        |* 16 |           SORT JOIN (REUSE)                         |       |   1821K|     70T|   1203K|00:05:17.98 |     41M|
        |  17 |            VIEW                                     |       |      1 |     70T|  11625 |00:05:17.18 |     41M|
        |  18 |             SORT GROUP BY                           |       |      1 |     70T|  11625 |00:05:17.18 |     41M|
        |  19 |              VIEW                                   |       |      1 |     70T|   1666K|00:00:25.89 |     41M|
        |  20 |               UNION ALL (RECURSIVE WITH) BREADTH FIRST|     |      1 |        |   1666K|00:00:25.48 |     41M|
        |  21 |                FAST DUAL                            |       |      1 |      1 |      1 |00:00:00.01 |       0 |
        |  22 |                WINDOW SORT                          |       |      6 |     70T|   1666K|00:04:53.82 |     18M|
        |  23 |                 NESTED LOOPS                        |       |      6 |     70T|   1666K|00:04:50.98 |     18M|
        |  24 |                  RECURSIVE WITH PUMP                |       |      6 |        |  16426 |00:00:00.25 |       2 |
        |* 25 |                  TABLE ACCESS FULL                  | LINKS |  16426 |     95 |   1666K|00:06:32.63 |     18M|
        -----------------------------------------------------------------------------------------------------------------------
    - The E-Rows numbers here are often massive over-estimates
    - Let's look at the run times for both queries across the datasets…
  • 26. SQL for Shortest Paths - Performance - Results
        Dataset        #Nodes (all)   #Links  #Nodes (sub)  Maxlev  #Secs (1-RS)  Truncate at  #Secs (2-RS)
        -------------  ------------  -------  ------------  ------  ------------  -----------  ------------
        three_subnets            14       13            11       3          0.01            3          0.02
        foreign_keys            289      319            47       5          0.01            5          0.01
        brightkite           58,228  214,078        56,739      10            NA            5           559
        bacon/small             161    3,342           161       5             8            5           0.1
        bacon/top250         12,466  583,993        11,803       7       Aborted            5           796
        bacon/top250         12,466  583,993        11,803       7       Aborted           10         1,663
    - The one-recursive-subquery (1-RS) query ran for hours on top250 before being aborted; it is OK for small datasets
    - The two-recursive-subquery (2-RS) query completed top250 in 796/1,663 seconds for Truncate at 5/10
    - 2-RS obtains a partial, approximate solution to enable early truncation of the paths
    - The use of a hard-coded iteration limit in the first subquery has obvious limitations
      - If it's too low, the first subquery will provide too little information to optimize the second
      - If it's too large, the approximate subquery itself will have too much work to do
    - We'll see that using SQL within a PL/SQL algorithm will give better results…
  • 27. Two Algorithms with Code Timing (7 slides)
    Two PL/SQL network analysis algorithms with code timing and performance analysis
  • 28. Two Algorithms
    Min Pathfinder Algorithm
    - Truncate the solution table, min_tree_links, and insert the root node at level 0
    - Loop while records are inserted
      - Insert a new node record at the next level:
        - for every link that is connected to a node at the current level
        - that does not exist in the table for any prior level
        - and does not appear at the next level for any other link with a higher ranked path
      - Commit
      - Increment level and inserts counter
      - Exit when no records inserted
    - Return the number of records inserted
    Subnetwork Grouper Algorithm
    - Truncate the solution table, node_roots
    - Loop while a new root node is found
      - Select a new root node id from nodes not in node_roots
      - Exit loop when none found
      - Call Ins_Min_Tree_Links to populate the solution table, min_tree_links, for the new root node
      - Insert all nodes in min_tree_links into node_roots against the new root node
    Notes
    - Code timing will show tuning opportunities in the initial implementation
    - Shortest paths are inserted at each iteration, and all inserted are visible to future iterations
    - This avoids the inefficiency inherent in the pure SQL solutions
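The two algorithms above can be sketched in memory to show how nodes reached at earlier levels are excluded from later ones. This is a minimal Python illustration under stated assumptions, not the talk's PL/SQL: dictionaries stand in for the min_tree_links and node_roots tables, the function names are hypothetical, and the demo links (including the link 12-13 and the isolated node 14) are assumed because the network diagram is not in this transcript. The grouper here scans the node list once instead of re-running a root selector query.

```python
def build_adjacency(links):
    """Undirected adjacency map from (from_id, to_id) links."""
    adj = {}
    for fr, to in links:
        adj.setdefault(fr, set()).add(to)
        adj.setdefault(to, set()).add(fr)
    return adj

def min_tree_links(adj, root):
    """Min Pathfinder: returns {node: (prior_node, level)} for root's subnetwork."""
    tree = {root: (None, 0)}  # root inserted at level 0
    current, lev = [root], 0
    while current:  # loop while records are inserted
        candidates = {}
        for node in current:
            for nxt in adj.get(node, ()):
                if nxt not in tree:  # not present at any prior level
                    # keep one record per new node: the lowest prior node id
                    candidates[nxt] = min(candidates.get(nxt, node), node)
        lev += 1
        for nxt, prior in candidates.items():
            tree[nxt] = (prior, lev)
        current = list(candidates)
    return tree

def node_roots(nodes, adj):
    """Subnetwork Grouper: returns {node: (root_node, level)}."""
    roots = {}
    for node in nodes:
        if node in roots:  # already assigned to an earlier subnetwork
            continue
        for member, (_, lev) in min_tree_links(adj, node).items():
            roots[member] = (node, lev)
    return roots

# Assumed demo data: 14 nodes, 13 links, 3 subnetworks
LINKS = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 7), (7, 10),
         (3, 4), (3, 8), (5, 9), (6, 9), (9, 11), (12, 13)]
grouped = node_roots(range(1, 15), build_adjacency(LINKS))
for node in sorted(grouped):
    print(node, grouped[node])
```

Each node is recorded once, at the level it is first reached, so the work per level is bounded by the links touching that level rather than by the number of paths.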
  • 29. Code Timing - Ins_Min_Tree_Links
        FUNCTION Ins_Min_Tree_Links(
                    p_root_node_id                 PLS_INTEGER)
                    RETURN                         PLS_INTEGER IS
          l_lev         PLS_INTEGER := 0;
          l_ins         PLS_INTEGER;
          l_ins_tot     PLS_INTEGER := 0;
          -- Construct timer set, with root node in name
          l_ts_id       PLS_INTEGER := Timer_Set.Construct('Ins_Min_Tree_Links: ' || p_root_node_id);
        BEGIN
          EXECUTE IMMEDIATE 'TRUNCATE TABLE min_tree_links';
          INSERT INTO min_tree_links VALUES (p_root_node_id, '', 0);
          LOOP
            INSERT INTO min_tree_links
            SELECT CASE WHEN lnk.node_id_fr = mlp_cur.node_id THEN lnk.node_id_to ELSE lnk.node_id_fr END,
                   Min (mlp_cur.node_id),
                   l_lev + 1
              FROM min_tree_links mlp_cur
              JOIN links lnk
                ON (lnk.node_id_fr = mlp_cur.node_id OR lnk.node_id_to = mlp_cur.node_id)
              LEFT JOIN min_tree_links mlp_pri
                ON mlp_pri.node_id = CASE WHEN lnk.node_id_fr = mlp_cur.node_id THEN lnk.node_id_to ELSE lnk.node_id_fr END
             WHERE mlp_pri.node_id IS NULL
               AND mlp_cur.lev = l_lev
             GROUP BY CASE WHEN lnk.node_id_fr = mlp_cur.node_id THEN lnk.node_id_to ELSE lnk.node_id_fr END;
            l_ins := SQL%ROWCOUNT;
            COMMIT;
            l_ins_tot := l_ins_tot + l_ins;
            -- Time insert, with level and rows in name
            Timer_Set.Increment_Time(l_ts_id, 'Level: ' || l_lev || ', nodes: ' || l_ins);
            EXIT WHEN l_ins = 0;
            l_lev := l_lev + 1;
          END LOOP;
          -- Write timer set
          Utils.W(Timer_Set.Format_Results(l_ts_id));
          RETURN l_ins_tot;
        END Ins_Min_Tree_Links;
  • 30. Code Timing - Ins_Min_Tree_Links - Results
    Results for Bacon/only_tv_v Dataset (680,060 node subnetwork - 744,374 node / 22,503,060 link total)
        Timer Set: Ins_Min_Tree_Links: 10001, Constructed at 30 Jul 2022 16:07:35, written at 16:11:01
        ==============================================================================================
        Timer                       Elapsed         CPU       Calls       Ela/Call       CPU/Call
        -----------------------  ----------  ----------  ----------  -------------  -------------
        Level: 0, nodes: 38            0.05        0.04           1        0.04700        0.04000
        Level: 1, nodes: 5169          0.04        0.03           1        0.04200        0.03000
        Level: 2, nodes: 202118       13.77       13.72           1       13.76500       13.72000
        Level: 3, nodes: 358824      104.69      100.59           1      104.69100      100.59000
        Level: 4, nodes: 100099       75.15       74.11           1       75.14900       74.11000
        Level: 5, nodes: 11298         9.61        9.61           1        9.60600        9.61000
        Level: 6, nodes: 1865          1.15        1.14           1        1.14700        1.14000
        Level: 7, nodes: 421           0.29        0.30           1        0.28900        0.30000
        Level: 8, nodes: 170           0.16        0.16           1        0.16200        0.16000
        Level: 9, nodes: 39            0.10        0.09           1        0.09700        0.09000
        Level: 10, nodes: 11           0.07        0.08           1        0.07000        0.08000
        Level: 11, nodes: 7            0.07        0.08           1        0.07300        0.08000
        Level: 12, nodes: 0            0.07        0.06           1        0.07200        0.06000
        (Other)                        0.39        0.39           1        0.39400        0.39000
        -----------------------  ----------  ----------  ----------  -------------  -------------
        Total                        205.60      200.40          14       14.68600       14.31429
        -----------------------  ----------  ----------  ----------  -------------  -------------
        [Timer timed (per call in ms): Elapsed: 0.02061, CPU: 0.02245]
    - The results show a total elapsed time of 206 seconds
    - There is a timer for each iteration, showing CPU and elapsed times, with nodes processed
    - As you'd expect, the largest times correspond to the most nodes inserted…
      - and with time per node increasing as the solution table fills up
    - Each iteration corresponds to a single insert, so we can get the execution plan…
  • 31. Execution Plan - Ins_Min_Tree_Links Insert
    Add hint to obtain execution plan
        INSERT INTO min_tree_links
        SELECT /*+ gather_plan_statistics XPLAN_MTL */ …
    Write execution plan using wrapper function
        Utils.W(Utils.Get_XPlan(p_sql_marker => 'XPLAN_MTL'));
    Execution Plan Extract
        -----------------------------------------------------------------------------------------------------------------------------
        | Id  | Operation                         | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | OMem  | 1Mem  |
        -----------------------------------------------------------------------------------------------------------------------------
        |   0 | INSERT STATEMENT                  |                 |      1 |        |      0 |00:00:00.07 |    6777 |       |       |
        |   1 |  LOAD TABLE CONVENTIONAL          | MIN_TREE_LINKS  |      1 |        |      0 |00:00:00.07 |    6777 |       |       |
        |   2 |   HASH GROUP BY                   |                 |      1 |      2 |      0 |00:00:00.07 |    6777 |  1161K|  1161K|
        |   3 |    VIEW                           | VW_ORE_BC29D05C |      1 |      2 |      0 |00:00:00.07 |    6777 |       |       |
        |   4 |     UNION-ALL                     |                 |      1 |        |      0 |00:00:00.07 |    6777 |       |       |
        |*  5 |      HASH JOIN ANTI               |                 |      1 |      1 |      0 |00:00:00.04 |    3388 |  1106K|  1106K|
        |   6 |       NESTED LOOPS                |                 |      1 |     33 |     10 |00:00:00.01 |    1709 |       |       |
        |   7 |        NESTED LOOPS               |                 |      1 |     33 |     10 |00:00:00.01 |    1699 |       |       |
        |*  8 |         TABLE ACCESS FULL         | MIN_TREE_LINKS  |      1 |      1 |      7 |00:00:00.01 |    1683 |       |       |
        |*  9 |         INDEX RANGE SCAN          | LINKS_FR_N1     |      7 |     33 |     10 |00:00:00.01 |      16 |       |       |
        |  10 |        TABLE ACCESS BY INDEX ROWID| LINKS           |     10 |     33 |     10 |00:00:00.01 |      10 |       |       |
        |  11 |       TABLE ACCESS FULL           | MIN_TREE_LINKS  |      1 |      1 |    680K|00:00:00.01 |    1679 |       |       |
        |* 12 |      HASH JOIN ANTI               |                 |      1 |      1 |      0 |00:00:00.03 |    3389 |  1106K|  1106K|
        |  13 |       NESTED LOOPS                |                 |      1 |     33 |     12 |00:00:00.01 |    1710 |       |       |
        |  14 |        NESTED LOOPS               |                 |      1 |     33 |     12 |00:00:00.01 |    1698 |       |       |
        |* 15 |         TABLE ACCESS FULL         | MIN_TREE_LINKS  |      1 |      1 |      7 |00:00:00.01 |    1682 |       |       |
        |* 16 |         INDEX RANGE SCAN          | LINKS_TO_N1     |      7 |     33 |     12 |00:00:00.01 |      16 |       |       |
        |* 17 |        TABLE ACCESS BY INDEX ROWID| LINKS           |     12 |     33 |     12 |00:00:00.01 |      12 |       |       |
        |  18 |       TABLE ACCESS FULL           | MIN_TREE_LINKS  |      1 |      1 |    680K|00:00:00.01 |    1679 |       |       |
        -----------------------------------------------------------------------------------------------------------------------------
    - No obvious problems
  • 32. Code Timing - Ins_Node_Roots
    Procedure with Code Timing
        PROCEDURE Ins_Node_Roots IS
          l_root_id     PLS_INTEGER;
          l_ins_tot     PLS_INTEGER;
          -- Construct timer set
          l_ts_id       PLS_INTEGER := Timer_Set.Construct('Ins_Node_Roots');
          l_suffix      VARCHAR2(60);
        BEGIN
          EXECUTE IMMEDIATE 'TRUNCATE TABLE node_roots';
          LOOP
            BEGIN
              SELECT id INTO l_root_id
                FROM nodes
               WHERE id NOT IN (SELECT node_id FROM node_roots)
                 AND ROWNUM = 1;
            EXCEPTION
              WHEN NO_DATA_FOUND THEN
                l_root_id := NULL;
            END;
            -- Time node selector query
            Timer_Set.Increment_Time(l_ts_id, 'SELECT id INTO l_root_id');
            EXIT WHEN l_root_id IS NULL;
            l_ins_tot := Ins_Min_Tree_Links(l_root_id);
            -- Timer name suffix allows aggregation by subnetwork size group
            l_suffix := CASE WHEN l_ins_tot = 0 THEN '(1 node)'
                             WHEN l_ins_tot = 1 THEN '(2 nodes)'
                             WHEN l_ins_tot = 2 THEN '(3 nodes)'
                             WHEN l_ins_tot < 40 THEN '(4-39 nodes)'
                             ELSE '(root node ' || l_root_id || ', size: ' || (l_ins_tot + 1) || ')' END;
            -- Time Ins_Min_Tree_Links by size group
            Timer_Set.Increment_Time(l_ts_id, 'Insert min_tree_links ' || l_suffix);
            INSERT INTO node_roots tgt
            SELECT node_id, l_root_id, lev
              FROM min_tree_links;
            -- Time Insert node_roots by size group
            Timer_Set.Increment_Time(l_ts_id, 'Insert node_roots ' || l_suffix);
          END LOOP;
          -- Write timer set
          Utils.W(Timer_Set.Format_Results(l_ts_id));
        END Ins_Node_Roots;
  • 33. Code Timing - Ins_Node_Roots - Results
    Code Timing Output
    Results for Bacon/only_tv_v Dataset (744,374 nodes and 22,503,060 links)
        Timer Set: Ins_Node_Roots, Constructed at 30 Jul 2022 16:15:48, written at 16:44:22
        ===================================================================================
        Timer                                                   Elapsed         CPU       Calls       Ela/Call       CPU/Call
        ---------------------------------------------------  ----------  ----------  ----------  -------------  -------------
        SELECT id INTO l_root_id                                1517.43     1506.68       19642        0.07725        0.07671
        Insert min_tree_links (root node 579, size: 680060)      122.95      120.31           1      122.94500      120.31000
        Insert node_roots (root node 579, size: 680060)            4.10        4.05           1        4.10400        4.05000
        Insert min_tree_links (4-39 nodes)                        20.21       23.07        5317        0.00380        0.00434
        Insert node_roots (4-39 nodes)                             1.56        1.61        5317        0.00029        0.00030
        Insert min_tree_links (root node 646, size: 58)            0.01        0.01           1        0.00800        0.01000
        Insert node_roots (root node 646, size: 58)                0.00        0.00           1        0.00000        0.00000
        Insert min_tree_links (3 nodes)                            7.14        7.29        2091        0.00341        0.00349
        Insert node_roots (3 nodes)                                0.50        0.62        2091        0.00024        0.00030
        Insert min_tree_links (1 node)                            24.91       24.76        8659        0.00288        0.00286
        Insert node_roots (1 node)                                 2.18        1.75        8659        0.00025        0.00020
        Insert min_tree_links (2 nodes)                           11.74       11.67        3539        0.00332        0.00330
        Insert node_roots (2 nodes)                                0.88        1.42        3539        0.00025        0.00040
        ...
        (Other)                                                    0.00        0.00           1        0.00100        0.00000
        ---------------------------------------------------  ----------  ----------  ----------  -------------  -------------
        Total                                                   1714.02     1703.66       58925        0.02909        0.02891
        ---------------------------------------------------  ----------  ----------  ----------  -------------  -------------
        [Timer timed (per call in ms): Elapsed: 0.01282, CPU: 0.01282]
    - The results show a total elapsed time of 1,714 seconds, roughly 90% of it (1,517 seconds) from the SELECT timer
    - To improve performance we need first to focus on that code section
    - 8,659 calls were made for the '(1 node)' suffix timers and 3,539 for the '(2 nodes)' ones
    - Each such call also corresponds to an instance of SELECT id INTO l_root_id, about 26% of that line
    - We can insert these 1/2 node node_roots records in single inserts prior to the main algorithm
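As a quick arithmetic check on where the time goes, the elapsed figures from the timer set above can be turned into percentage shares of the total. This is only an illustrative Python calculation on numbers copied from the slide; the dictionary keys are shortened labels chosen here, not timer names from the code.

```python
# Elapsed seconds copied from the Ins_Node_Roots timer set output
elapsed = {
    "SELECT id INTO l_root_id": 1517.43,
    "Insert min_tree_links (largest subnetwork)": 122.95,
    "Insert min_tree_links (1 node)": 24.91,
    "Insert min_tree_links (2 nodes)": 11.74,
}
total_elapsed = 1714.02

# Percentage of total elapsed time per timer, to one decimal place
shares = {name: round(100 * secs / total_elapsed, 1) for name, secs in elapsed.items()}
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{pct:5.1f}%  {name}")
```

The root node selector comes out at about 88.5% of elapsed time, in line with the roughly 90% quoted on the slide, so it is the first section worth tuning.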
  • 34. Two Algorithms - Performance Considerations
    Min Pathfinder
    - The algorithm allows us to prune non-shortest paths early
      - It does this by storing the paths at each iteration, and excluding nodes already reached from future iterations
      - At the same time, each iteration uses a single SQL insert with subquery to process in an efficient set-based fashion
    SQL Tuning
    - We will find the resulting queries themselves can be tuned using query transformation and hints
    Subnetwork Grouper
    - The algorithm uses Min Pathfinder within a higher level algorithm
      - It thus benefits from its efficiency to identify the subnetworks
    - However, code timing identified two main areas in which a still more set-based approach could improve performance:
      - Firstly, one- and two-node subnetworks do an insert for each node
        - We could in fact insert all of these in a single set-based insert each, ahead of the main algorithm for the larger subnetworks
      - Secondly, a root node selector query is executed for each subnetwork
        - We may be able to find a way of selection that does not execute this at each iteration
  • 35. Oracle Standard Profilers (2 slides)
    Results from two standard Oracle profiling tools for the Subnetwork Grouper procedure
  • 36. Flat Profiler
    Calling Flat Profiler
        VAR RUN_ID NUMBER
        DECLARE
          l_result  PLS_INTEGER;
        BEGIN
          -- Start…
          l_result := DBMS_Profiler.Start_Profiler(
                         run_comment => 'Profile for Ins_Node_Roots',
                         run_number  => :RUN_ID);
          -- Call to be profiled
          Shortest_Path_SQL_Base.Ins_Node_Roots;
          -- …and stop profiler
          l_result := DBMS_Profiler.Stop_Profiler;
        END;
        /
        @....dprof_queries :RUN_ID
    - The last line runs a custom reporting script, passed the run id
    Profiler data by time (PLSQL_PROFILER_DATA)
        Seconds     Calls     Unit                    Line#  Line Text
        ----------  --------  ----------------------  -----  ------------------------------------------------------------
          1789.829     19642  SHORTEST_PATH_SQL_BASE     85  SELECT id INTO l_root_id FROM nodes WHERE id NOT IN (SELECT node_id FROM node_roots) AND ROWNUM = 1;
           128.850     31374  SHORTEST_PATH_SQL_BASE     15  INSERT INTO min_tree_links
            31.519     19641  SHORTEST_PATH_SQL_BASE     11  EXECUTE IMMEDIATE 'TRUNCATE TABLE min_tree_links';
             7.318     19641  SHORTEST_PATH_SQL_BASE     93  INSERT INTO node_roots tgt
             4.828     19641  SHORTEST_PATH_SQL_BASE     12  INSERT INTO min_tree_links VALUES (p_root_node_id, '', 0);
             1.897     31374  SHORTEST_PATH_SQL_BASE     31  COMMIT;
             0.071     31374  SHORTEST_PATH_SQL_BASE     30  l_ins := SQL%ROWCOUNT;
        ...
        157179 rows selected.
    - The line text is obtained by joining the system view all_source to the profiler package/line number
  • 37. Hierarchical Profiler
    Calling Hierarchical Profiler
      VAR RUN_ID NUMBER
      BEGIN
        HProf_Utils.Start_Profiling;
        Shortest_Path_SQL_Base.Ins_Node_Roots;
        :RUN_ID := HProf_Utils.Stop_Profiling(
          p_run_comment => 'Profile for Ins_Node_Roots',
          p_filename    => 'hp_ins_node_roots_&SUB..html');
      END;
      /
      @....hprof_queries :RUN_ID
     HProf_Utils: custom wrapper package around start… and stop profiling
     Shortest_Path_SQL_Base.Ins_Node_Roots: call to be profiled
     p_filename: HTML results filename
     hprof_queries: custom reporting script, passed run id
    Profiler data by time (PLSQL_PROFILER_DATA)
      Function tree             Owner              Module                  Inst.   Subtree MicroS  Function MicroS  Calls
      ------------------------  -----------------  ----------------------  ------  --------------  ---------------  -----
      INS_NODE_ROOTS            SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE              1668506464           332444      1
      __static_sql_exec_line85  SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE              1490685875       1490685875  19642
      INS_MIN_TREE_LINKS        SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE               169705305          1195428  19641
      __static_sql_exec_line15  SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE               141321652        141321652  31378
      __dyn_sql_exec_line11     SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE                20610292          9775994  19641
      __plsql_vm                                                           1 of 2        10834298            71027  19641
      __anonymous_block                                                    1 of 2        10763826          2971230  19642
      IS_VPD_ENABLED            SYS                IS_VPD_ENABLED          1 of 2         6934407           395925  39284
      __static_sql_exec_line22  SYS                IS_VPD_ENABLED          1 of 2         6538482          6538482  39284
      DICTIONARY_OBJ_OWNER      SYS                DICTIONARY_OBJ_OWNER    1 of 2          812866           812866  39284
      DICTIONARY_OBJ_NAME       SYS                DICTIONARY_OBJ_NAME     1 of 2           45323            45323  39284
      __static_sql_exec_line12  SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE                 4413730          4413730  19641
      __static_sql_exec_line31  SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE                 2164203          2164203  31378
      __static_sql_exec_line93  SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE                 7706832          7706832  19641
      __dyn_sql_exec_line81     SHORTEST_PATH_SQL  SHORTEST_PATH_SQL_BASE                   76008            75449      1
      __plsql_vm                                                           2 of 2             559                4      1
      __static_sql_exec_line700 SYS                DBMS_HPROF                                 128              128      1
      STOP_PROFILING            LIB                HPROF_UTILS                                 22               22      1
      STOP_PROFILING            SYS                DBMS_HPROF                                   0                0      1
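A quick arithmetic check on the function tree: a function's own time should equal its subtree time minus the subtree times of its direct children. The sketch below uses the figures from slide 37. The parent/child grouping is an assumption, inferred from the numbers, because the transcript has lost the tree indentation.

```python
# Subtree times in microseconds, copied from the slide 37 function tree.
subtree = {
    "INS_NODE_ROOTS": 1668506464,
    "__static_sql_exec_line85": 1490685875,
    "INS_MIN_TREE_LINKS": 169705305,
    "__static_sql_exec_line93": 7706832,
    "__dyn_sql_exec_line81": 76008,
    "__static_sql_exec_line15": 141321652,
    "__dyn_sql_exec_line11": 20610292,
    "__static_sql_exec_line12": 4413730,
    "__static_sql_exec_line31": 2164203,
}

# ASSUMPTION: direct children inferred from the arithmetic, not read from the slide.
children = {
    "INS_NODE_ROOTS": [
        "__static_sql_exec_line85",
        "INS_MIN_TREE_LINKS",
        "__static_sql_exec_line93",
        "__dyn_sql_exec_line81",
    ],
    "INS_MIN_TREE_LINKS": [
        "__static_sql_exec_line15",
        "__dyn_sql_exec_line11",
        "__static_sql_exec_line12",
        "__static_sql_exec_line31",
    ],
}


def self_time(name):
    """Own (function) time = subtree time minus the children's subtree times."""
    return subtree[name] - sum(subtree[child] for child in children[name])


# Share of the whole profiled call spent in the SQL statement at line 85.
line85_pct = 100 * subtree["__static_sql_exec_line85"] / subtree["INS_NODE_ROOTS"]

for name in children:
    print(name, self_time(name))
print(f"line 85 share: {line85_pct:.1f}%")
```

Under that grouping the computed values agree with the Function MicroS column for both procedures, and the statement at line 85 accounts for just under 90% of the profiled time.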
  • 38. Tuning 1 - SQL for Isolated Nodes (5 slides)
    Recap of join methods and types, then queries with antijoin structures and hints
  • 39. SQL Join Definitions
    Join Methods
     Nested Loops Join - Nested loops join an outer data set to an inner data set. For each row in the outer data set that matches the single-table predicates, the database retrieves all rows in the inner data set that satisfy the join predicate. If an index is available, then the database can use it to access the inner data set by rowid
     Hash Join - The database uses a hash join to join larger data sets. The optimizer uses the smaller of two data sets to build a hash table on the join key in memory, using a deterministic hash function to specify the location in the hash table in which to store each row. The database then scans the larger data set, probing the hash table to find the rows that meet the join condition
    Join Types
     Antijoin - An antijoin is a join between two data sets that returns a row from the first set when a matching row does not exist in the subquery data set. Like a semijoin, an antijoin stops processing the subquery data set when the first match is found. Unlike a semijoin, the antijoin only returns a row when no match is found
    Extracted from: SQL Tuning Guide, 21c
  • 40. SQL for Isolated Nodes: SQL 1 - Not Exists / Or
     Isolated nodes: all the nodes that are present in the nodes table but not in the links table
     Can be expressed in a single SQL statement for the insert
      INSERT INTO node_roots
      SELECT nod.id, nod.id, 0
        FROM nodes nod
       WHERE NOT EXISTS (SELECT 1
                           FROM links lnk
                          WHERE lnk.node_id_fr = nod.id
                             OR lnk.node_id_to = nod.id);
    Execution Plan - Ran on only_tv_v dataset (744,374 nodes and 22,503,060 links)
      | Id | Operation               | Name         | Starts | E-Rows | A-Rows | A-Time      | Buffers | Reads |
      |----|-------------------------|--------------|--------|--------|--------|-------------|---------|-------|
      |  0 | INSERT STATEMENT        |              |      1 |        |      0 | 00:00:25.78 |    191K | 93127 |
      |  1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS   |      1 |        |      0 | 00:00:25.78 |    191K | 93127 |
      |* 2 | HASH JOIN ANTI          |              |      1 |  53174 |   8659 | 00:00:25.73 |    176K | 93122 |
      |  3 | INDEX FAST FULL SCAN    | SYS_C0018310 |      1 |   744K |   744K | 00:00:00.08 |    1461 |     0 |
      |  4 | VIEW                    | VW_SQ_1      |      1 |    45M |    45M | 00:00:22.86 |    174K | 93122 |
      |  5 | UNION-ALL               |              |      1 |        |    45M | 00:00:15.93 |    174K | 93122 |
      |  6 | TABLE ACCESS FULL       | LINKS        |      1 |    22M |    22M | 00:00:02.61 |   87315 | 46561 |
      |  7 | TABLE ACCESS FULL       | LINKS        |      1 |    22M |    22M | 00:00:02.19 |   87315 | 46561 |
      Predicate Information (identified by operation id):
        2 - access("VW_COL_1"="NOD"."ID")
     Query obtains the 8,659 isolated nodes in 21 seconds
     S5: OR transformed into UNION ALL of two full links scans, S6/7
     S4: View of 45M rows used as probe table in hash antijoin, S2…
     S3: …with scan of nodes unique index as the build table
     UNION ALL results in a single hash antijoin, with a probe table twice the size of links
     What if we replaced the NOT EXISTS with explicit antijoins?...
  • 41. SQL for Isolated Nodes: SQL 2 - Outer Joins Unhinted
     Convert NOT EXISTS into outer antijoins
      INSERT INTO node_roots
      SELECT nod.id, nod.id, 0
        FROM nodes nod
        LEFT JOIN links lnk_f ON lnk_f.node_id_fr = nod.id
        LEFT JOIN links lnk_t ON lnk_t.node_id_to = nod.id
       WHERE lnk_f.node_id_fr IS NULL
         AND lnk_t.node_id_fr IS NULL;
    Execution Plan
      | Id | Operation               | Name         | Starts | E-Rows | A-Rows | A-Time      | Buffers | Reads |
      |----|-------------------------|--------------|--------|--------|--------|-------------|---------|-------|
      |  0 | INSERT STATEMENT        |              |      1 |        |      0 | 00:00:12.48 |    191K | 93127 |
      |  1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS   |      1 |        |      0 | 00:00:12.48 |    191K | 93127 |
      |* 2 | HASH JOIN ANTI          |              |      1 |    532 |   8659 | 00:00:12.43 |    176K | 93122 |
      |* 3 | HASH JOIN ANTI          |              |      1 |  53174 |  57851 | 00:00:08.41 |   88776 | 46561 |
      |  4 | INDEX FAST FULL SCAN    | SYS_C0018310 |      1 |   744K |   744K | 00:00:00.07 |    1461 |     0 |
      |  5 | TABLE ACCESS FULL       | LINKS        |      1 |    22M |    22M | 00:00:02.58 |   87315 | 46561 |
      |  6 | TABLE ACCESS FULL       | LINKS        |      1 |    22M |    22M | 00:00:01.83 |   87315 | 46561 |
      Predicate Information (identified by operation id):
        2 - access("LNK_T"."NODE_ID_TO"="NOD"."ID")
        3 - access("LNK_F"."NODE_ID_FR"="NOD"."ID")
     Query obtains the 8,659 isolated nodes in 12 seconds
     S4: An index scan of the unique index on nodes as the build table for a hash antijoin, S3
     S5: Full scan of the links table as the probe table
     S2: Hash antijoin uses result set as the build table
     S6: With another full scan of the links table as the second probe table
     Where the CBO in SQL-1 used a view/union and a single hash antijoin…
     …the two outer joins resulted in two hash antijoins, but with smaller probe tables, and faster
     How would this compare with a plan using nested loop joins?
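The rewrite from slide 40 to slide 41 relies on the two queries being logically equivalent. A minimal sketch of that check on a toy network follows. It uses SQLite through Python as a stand-in, which is an assumption of this example: Oracle's optimizer, plans and timings do not carry over, only the query logic. The five-node dataset is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY);
    CREATE TABLE links (node_id_fr INTEGER NOT NULL, node_id_to INTEGER NOT NULL);
    -- Toy data (hypothetical): nodes 4 and 5 have no links at all.
    INSERT INTO nodes VALUES (1), (2), (3), (4), (5);
    INSERT INTO links VALUES (1, 2), (2, 3);
""")

# SQL 1: NOT EXISTS with an OR across both link columns.
not_exists_sql = """
    SELECT nod.id
      FROM nodes nod
     WHERE NOT EXISTS (SELECT 1
                         FROM links lnk
                        WHERE lnk.node_id_fr = nod.id
                           OR lnk.node_id_to = nod.id)
     ORDER BY nod.id"""

# SQL 2: two outer joins, keeping rows where neither join found a match.
outer_join_sql = """
    SELECT nod.id
      FROM nodes nod
      LEFT JOIN links lnk_f ON lnk_f.node_id_fr = nod.id
      LEFT JOIN links lnk_t ON lnk_t.node_id_to = nod.id
     WHERE lnk_f.node_id_fr IS NULL
       AND lnk_t.node_id_fr IS NULL
     ORDER BY nod.id"""


def isolated_nodes(sql):
    """Run one of the queries and return the node ids as a list."""
    return [row[0] for row in conn.execute(sql)]


print(isolated_nodes(not_exists_sql))
print(isolated_nodes(outer_join_sql))
```

Both forms return the same isolated nodes here. The IS NULL test on lnk_t.node_id_fr works as a "no match" test only because the link columns are declared NOT NULL in this sketch.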
  • 42. SQL for Isolated Nodes: SQL 3 - Outer Joins Hinted
     Hint to use nested loops joins: USE_NL (lnk_f) USE_NL (lnk_t)
      INSERT INTO node_roots
      SELECT /*+gather_plan_statistics USE_NL (lnk_f) USE_NL (lnk_t)*/
             nod.id, nod.id, 0
        FROM nodes nod
        LEFT JOIN links lnk_f ON lnk_f.node_id_fr = nod.id
        LEFT JOIN links lnk_t ON lnk_t.node_id_to = nod.id
       WHERE lnk_f.node_id_fr IS NULL
         AND lnk_t.node_id_fr IS NULL;
    Execution Plan
      | Id | Operation               | Name         | Starts | E-Rows | A-Rows | A-Time      | Buffers | Reads |
      |----|-------------------------|--------------|--------|--------|--------|-------------|---------|-------|
      |  0 | INSERT STATEMENT        |              |      1 |        |      0 | 00:00:01.27 |    624K |     5 |
      |  1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS   |      1 |        |      0 | 00:00:01.27 |    624K |     5 |
      |  2 | NESTED LOOPS ANTI       |              |      1 |    532 |   8659 | 00:00:00.89 |    622K |     0 |
      |  3 | NESTED LOOPS ANTI       |              |      1 |  53174 |  57851 | 00:00:01.04 |    506K |     0 |
      |  4 | INDEX FAST FULL SCAN    | SYS_C0018310 |      1 |   744K |   744K | 00:00:00.11 |    1461 |     0 |
      |* 5 | INDEX RANGE SCAN        | LINKS_FR_N1  |   744K |    20M |   686K | 00:00:00.83 |    505K |     0 |
      |* 6 | INDEX RANGE SCAN        | LINKS_TO_N1  |  57851 |    22M |  49192 | 00:00:00.15 |    115K |     0 |
      Predicate Information (identified by operation id):
        5 - access("LNK_F"."NODE_ID_FR"="NOD"."ID")
        6 - access("LNK_T"."NODE_ID_TO"="NOD"."ID")
     Obtains the 8,659 isolated nodes in 1.5 seconds
     S2, S3: Two nested loops antijoins
     S4: Drives off full scan of the unique index on nodes
     S5: First join to From index on links
     S6: Then join to To index on links
     Estimated rows for the two range scans are much higher than the actual rows returned. Let’s look at it…
  • 43. SQL for Isolated Nodes: SQL 3 - Nested Loops Analysis
    Execution Plan
      | Id | Operation               | Name         | Starts | E-Rows | A-Rows |
      |----|-------------------------|--------------|--------|--------|--------|
      |  0 | INSERT STATEMENT        |              |      1 |        |      0 |
      |  1 | LOAD TABLE CONVENTIONAL | NODE_ROOTS   |      1 |        |      0 |
      |  2 | NESTED LOOPS ANTI       |              |      1 |    532 |   8659 |
      |  3 | NESTED LOOPS ANTI       |              |      1 |  53174 |  57851 |
      |  4 | INDEX FAST FULL SCAN    | SYS_C0018310 |      1 |   744K |   744K |
      |* 5 | INDEX RANGE SCAN        | LINKS_FR_N1  |   744K |    20M |   686K |
      |* 6 | INDEX RANGE SCAN        | LINKS_TO_N1  |  57851 |    22M |  49192 |
    E-Rows Anomaly
     E-Rows of 20M in S5 and 22M in S6 seem to assume getting all matches
     They also seem to be totals across all starts, whereas E-Rows is usually per start
     But, as we saw in the definitions, antijoins get only the first match
     As reflected in the A-Rows of 686K and 49,192
     It is almost as though (to speculate):
       The SQL engine is smart enough to know that, in the context of the antijoin, there is no point in bringing back all the joining records when these will all be eliminated later
       But the CBO is not, and for that reason chooses a bad plan when unhinted
     Anyway, it’s important to note that the CBO does not always choose the optimal join method
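The A-Rows on slide 43 fit the first-match reading: 744,374 starts less the 57,851 unmatched nodes gives 686,523 (the 686K in S5), and 57,851 less 8,659 gives exactly the 49,192 in S6. A minimal sketch of that behaviour follows, as a hand-written nested loops antijoin over a dictionary "index". The function name and toy data are hypothetical, and this is an illustration of the idea rather than how the Oracle engine is implemented.

```python
from collections import defaultdict


def nested_loops_anti(outer_keys, inner_index):
    """Return the outer keys with no match, and how many inner rows were touched.

    For each outer key the probe stops at the first match, so at most one
    inner row is counted per start.
    """
    unmatched, rows_touched = [], 0
    for key in outer_keys:
        if inner_index.get(key):
            rows_touched += 1  # first match is enough to reject the key
        else:
            unmatched.append(key)
    return unmatched, rows_touched


# Toy data (hypothetical): node 1 has three outgoing links, node 2 has one.
nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 4), (2, 3)]

from_index = defaultdict(list)
for node_id_fr, node_id_to in links:
    from_index[node_id_fr].append(node_id_to)

unmatched, rows_touched = nested_loops_anti(nodes, from_index)

# What a join fetching every match would touch instead.
all_matches = sum(len(from_index.get(n, [])) for n in nodes)

print(unmatched, rows_touched, all_matches)
```

The rows touched equal the number of matched outer keys (starts minus unmatched), while an all-matches estimate grows with the size of the inner table, which is the shape of the gap between A-Rows and E-Rows on the slide.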
  • 44. Tuning 2 - SQL for Isolated Links (8 slides)
    Disastrous ‘Bitmap Or’ expansion, good and bad antijoin plans and efficient group counting query
  • 45. SQL for Isolated Links: SQL 1 - Not Exists / 4-way Or
     Isolated links: links that do not connect to any other links
       Neither the from node nor the to node is a from or a to node in any other link
      INSERT INTO node_roots
      WITH isolated_links AS (
        SELECT lnk.node_id_fr, lnk.node_id_to
          FROM links lnk
         WHERE NOT EXISTS (
                 SELECT 1
                   FROM links lnk_1
                  WHERE (lnk_1.node_id_fr = lnk.node_id_to OR
                         lnk_1.node_id_to = lnk.node_id_fr OR
                         lnk_1.node_id_fr = lnk.node_id_fr OR
                         lnk_1.node_id_to = lnk.node_id_to)
                    AND lnk_1.ROWID != lnk.ROWID)
      )
      SELECT node_id_fr, node_id_fr, 0 FROM isolated_links
       UNION
      SELECT node_id_to, node_id_fr, 1 FROM isolated_links
     NOT EXISTS a links record matching any of the 4 conditions
       And which is not the driving links record itself
     For each record passing the NOT EXISTS:
       Add both from and to nodes into node_roots
     Ran on pre1950 dataset (134,131 nodes and 8,095,294 links)
     Obtains the 425 isolated links in 4,103 seconds!
     Let’s look at the execution plan…
  • 46. SQL for Isolated Links: SQL 1 - Not Exists / 4-way Or - Execution Plan
    Execution Plan (Extract)
      | Id  | Operation                                | Name                       | Starts | E-Rows | A-Rows | A-Time      |
      |-----|------------------------------------------|----------------------------|--------|--------|--------|-------------|
      |   0 | INSERT STATEMENT                         |                            |      1 |        |      0 | 00:41:31.41 |
      |   1 | TEMP TABLE TRANSFORMATION                |                            |      1 |        |      0 | 00:41:31.41 |
      |   2 | LOAD AS SELECT (CURSOR DURATION MEMORY)  | SYS_TEMP_0FD9D6C4F_4F65443 |      1 |        |      0 | 00:41:31.37 |
      |*  3 | FILTER                                   |                            |      1 |        |    425 | 01:20:08.08 |
      |   4 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:01.57 |
      |   5 | TABLE ACCESS BY INDEX ROWID BATCHED      | LINKS                      |  8095K |      1 |  8094K | 01:08:05.02 |
      |*  6 | BITMAP CONVERSION TO ROWIDS              |                            |  8095K |        |  8094K | 01:07:45.55 |
      |   7 | BITMAP OR                                |                            |  8095K |        |  8094K | 01:07:35.34 |
      |*  8 | BITMAP CONVERSION FROM ROWIDS            |                            |  8095K |        |  7978K | 00:09:42.09 |
      |*  9 | INDEX RANGE SCAN                         | LINKS_TO_N1                |  8095K |        |  3076M | 00:08:56.46 |
      |* 10 | BITMAP CONVERSION FROM ROWIDS            |                            |  8095K |        |  8086K | 00:17:19.09 |
      |* 11 | INDEX RANGE SCAN                         | LINKS_TO_N1                |  8095K |        |  5926M | 00:16:17.18 |
      |* 12 | BITMAP CONVERSION FROM ROWIDS            |                            |  8095K |        |  7974K | 00:09:27.91 |
      |* 13 | INDEX RANGE SCAN                         | LINKS_FR_N1                |  8095K |        |  3076M | 00:08:41.13 |
      |* 14 | BITMAP CONVERSION FROM ROWIDS            |                            |  8095K |        |  8086K | 00:19:00.44 |
      |* 15 | INDEX RANGE SCAN                         | LINKS_FR_N1                |  8095K |        |  6232M | 00:17:21.29 |
      |  16 | LOAD TABLE CONVENTIONAL                  | NODE_ROOTS                 |      1 |        |      0 | 00:00:00.04 |
      |  17 | HASH UNIQUE                              |                            |      1 |    16M |    850 | 00:00:00.03 |
      |  18 | UNION-ALL                                |                            |      1 |        |    850 | 00:00:00.01 |
      |  19 | VIEW                                     |                            |      1 |  8095K |    425 | 00:00:00.01 |
      |  20 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C4F_4F65443 |      1 |  8095K |    425 | 00:00:00.01 |
      |  21 | VIEW                                     |                            |      1 |  8095K |    425 | 00:00:00.01 |
      |  22 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C4F_4F65443 |      1 |  8095K |    425 | 00:00:00.01 |
     S7: CBO transforms the OR conditions into a 4-section BITMAP OR
     S6: Then a BITMAP CONVERSION TO ROWIDS and
     S5: A links table access to filter the driving instance (S3)
     8095K starts, S5-S15
     A-Rows very high
  • 47. SQL for Isolated Links: SQL 2 - 4 Not Exists Subqueries
     Split the NOT EXISTS with 4 conditions into…
     A NOT EXISTS for each condition, replicating…
     …the ‘not the driving links record’ condition
      INSERT INTO node_roots
      WITH isolated_links AS (
        SELECT lnk.node_id_fr, lnk.node_id_to
          FROM links lnk
         WHERE NOT EXISTS (
                 SELECT 1 FROM links lnk_1
                  WHERE lnk_1.node_id_fr = lnk.node_id_fr
                    AND lnk_1.ROWID != lnk.ROWID)
           AND NOT EXISTS (
                 SELECT 1 FROM links lnk_2
                  WHERE lnk_2.node_id_to = lnk.node_id_to
                    AND lnk_2.ROWID != lnk.ROWID)
           AND NOT EXISTS (
                 SELECT 1 FROM links lnk_3
                  WHERE (lnk_3.node_id_fr = lnk.node_id_to)
                    AND lnk_3.ROWID != lnk.ROWID)
           AND NOT EXISTS (
                 SELECT 1 FROM links lnk_4
                  WHERE (lnk_4.node_id_to = lnk.node_id_fr)
                    AND lnk_4.ROWID != lnk.ROWID)
      )
      SELECT node_id_fr, node_id_fr, 0 FROM isolated_links
       UNION
      SELECT node_id_to, node_id_fr, 1 FROM isolated_links
     Obtains the 425 isolated links in 20 seconds, much faster!
     Let’s look at the execution plan…
  • 48. SQL for Isolated Links: SQL 2 - 4 Not Exists Subqueries - Execution Plan
    Execution Plan (Extract)
      | Id  | Operation                                | Name                       | Starts | E-Rows | A-Rows | A-Time      |
      |-----|------------------------------------------|----------------------------|--------|--------|--------|-------------|
      |   0 | INSERT STATEMENT                         |                            |      1 |        |      0 | 00:00:12.78 |
      |   1 | TEMP TABLE TRANSFORMATION                |                            |      1 |        |      0 | 00:00:12.78 |
      |   2 | LOAD AS SELECT (CURSOR DURATION MEMORY)  | SYS_TEMP_0FD9D6C1D_4F65443 |      1 |        |      0 | 00:00:12.77 |
      |*  3 | HASH JOIN RIGHT ANTI                     |                            |      1 |  8095K |    425 | 00:00:13.60 |
      |   4 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.69 |
      |*  5 | HASH JOIN RIGHT ANTI                     |                            |      1 |  8095K |    484 | 00:00:09.94 |
      |   6 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.59 |
      |*  7 | HASH JOIN RIGHT ANTI                     |                            |      1 |  8095K |   4196 | 00:00:05.59 |
      |   8 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.58 |
      |*  9 | HASH JOIN ANTI                           |                            |      1 |  8095K |   116K | 00:00:04.99 |
      |  10 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.59 |
      |  11 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.55 |
      |  12 | LOAD TABLE CONVENTIONAL                  | NODE_ROOTS                 |      1 |        |      0 | 00:00:00.02 |
      |  13 | HASH UNIQUE                              |                            |      1 |    16M |    850 | 00:00:00.01 |
      |  14 | UNION-ALL                                |                            |      1 |        |    850 | 00:00:00.01 |
      |  15 | VIEW                                     |                            |      1 |  8095K |    425 | 00:00:00.01 |
      |  16 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C1D_4F65443 |      1 |  8095K |    425 | 00:00:00.01 |
      |  17 | VIEW                                     |                            |      1 |  8095K |    425 | 00:00:00.01 |
      |  18 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C1D_4F65443 |      1 |  8095K |    425 | 00:00:00.01 |
     S9: Plan starts with a hash antijoin on full scans of links…
     S7,5,3: Then a sequence of hash right antijoins on result sets to full scans of links
       …where right means the build table/probe table choice is reversed from the default
       …making the build table the (smaller) result set
     Note that the A-Rows drops rapidly from 116K as the sequence progresses, down to 425 (S3)
  • 49. SQL for Isolated Links: SQL 3 - 4 Outer Joins
     Replace each NOT EXISTS with an outer antijoin
     This worked well for isolated nodes, where the plan used hash antijoin…
       Almost halved the time compared with NOT EXISTS
      INSERT INTO node_roots
      WITH isolated_links AS (
        SELECT lnk.node_id_fr, lnk.node_id_to
          FROM links lnk
          LEFT JOIN links lnk_1 ON (lnk_1.node_id_fr = lnk.node_id_fr AND lnk_1.ROWID != lnk.ROWID)
          LEFT JOIN links lnk_2 ON (lnk_2.node_id_fr = lnk.node_id_to AND lnk_2.ROWID != lnk.ROWID)
          LEFT JOIN links lnk_3 ON (lnk_3.node_id_to = lnk.node_id_fr AND lnk_3.ROWID != lnk.ROWID)
          LEFT JOIN links lnk_4 ON (lnk_4.node_id_to = lnk.node_id_to AND lnk_4.ROWID != lnk.ROWID)
         WHERE lnk_1.node_id_fr IS NULL
           AND lnk_2.node_id_fr IS NULL
           AND lnk_3.node_id_to IS NULL
           AND lnk_4.node_id_to IS NULL
      )
      SELECT node_id_fr, node_id_fr, 0 FROM isolated_links
       UNION
      SELECT node_id_to, node_id_fr, 1 FROM isolated_links
     Obtains the 425 isolated links in 1,259 seconds, much slower!
     Let’s look at the execution plan…
  • 50. SQL for Isolated Links: SQL 3 - 4 Outer Joins - Execution Plan
    Execution Plan (Extract)
      | Id  | Operation                                | Name                       | Starts | E-Rows | A-Rows | A-Time      |
      |-----|------------------------------------------|----------------------------|--------|--------|--------|-------------|
      |   0 | INSERT STATEMENT                         |                            |      1 |        |      0 | 00:17:46.54 |
      |   1 | TEMP TABLE TRANSFORMATION                |                            |      1 |        |      0 | 00:17:46.54 |
      |   2 | LOAD AS SELECT (CURSOR DURATION MEMORY)  | SYS_TEMP_0FD9D6C1E_4F65443 |      1 |        |      0 | 00:17:46.52 |
      |*  3 | FILTER                                   |                            |      1 |        |    425 | 00:17:46.11 |
      |*  4 | HASH JOIN RIGHT OUTER                    |                            |      1 |  8095K |   1302 | 00:17:46.21 |
      |   5 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.73 |
      |*  6 | FILTER                                   |                            |      1 |        |    472 | 00:15:41.39 |
      |*  7 | HASH JOIN RIGHT OUTER                    |                            |      1 |  8095K |  49224 | 00:12:50.53 |
      |   8 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.68 |
      |*  9 | FILTER                                   |                            |      1 |        |   3724 | 00:15:31.22 |
      |* 10 | HASH JOIN RIGHT OUTER                    |                            |      1 |  8095K |   267K | 00:15:16.91 |
      |  11 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.61 |
      |* 12 | FILTER                                   |                            |      1 |        |   8819 | 00:14:59.94 |
      |* 13 | HASH JOIN OUTER                          |                            |      1 |  8095K |  6232M | 00:18:35.29 |
      |  14 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.59 |
      |  15 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.55 |
      |  16 | LOAD TABLE CONVENTIONAL                  | NODE_ROOTS                 |      1 |        |      0 | 00:00:00.02 |
      |  17 | HASH UNIQUE                              |                            |      1 |    16M |    850 | 00:00:00.01 |
      |  18 | UNION-ALL                                |                            |      1 |        |    850 | 00:00:00.01 |
      |  19 | VIEW                                     |                            |      1 |  8095K |    425 | 00:00:00.01 |
      |  20 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C1E_4F65443 |      1 |  8095K |    425 | 00:00:00.01 |
      |  21 | VIEW                                     |                            |      1 |  8095K |    425 | 00:00:00.01 |
      |  22 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C1E_4F65443 |      1 |  8095K |    425 | 00:00:00.01 |
     S13, S12: The hash join anti step has been replaced by a hash join outer / filter pair of steps
     S7,5,3: And the sequence of hash join right anti steps has been replaced by…
       hash join right outer / filter pairs of steps
     The outer joins have not been recognised as antijoins, causing much more intermediate work
  • 51. SQL for Isolated Links: SQL 4 - Group Counting
     Re-define the logic for an isolated link as:
       Its from and to node both appear in exactly one link
     Avoids the expensive self-join of links in favour of a group counting query
      INSERT INTO node_roots
      WITH all_nodes AS (
        SELECT node_id_fr node_id, 'F' tp FROM links
         UNION ALL
        SELECT node_id_to, 'T' FROM links
      ), unique_nodes AS (
        SELECT node_id, Max(tp) tp
          FROM all_nodes
         GROUP BY node_id
        HAVING COUNT(*) = 1
      ), isolated_links AS (
        SELECT lnk.node_id_fr, lnk.node_id_to
          FROM links lnk
          JOIN unique_nodes frn ON frn.node_id = lnk.node_id_fr AND frn.tp = 'F'
          JOIN unique_nodes ton ON ton.node_id = lnk.node_id_to AND ton.tp = 'T'
      )
      SELECT node_id_fr, node_id_fr, 0 FROM isolated_links
       UNION ALL
      SELECT node_id_to, node_id_fr, 1 FROM isolated_links
     all_nodes:
       Gets all node instances with a type of F(rom) or T(o)
     unique_nodes:
       Selects from all_nodes the nodes having exactly one instance, along with its type
     isolated_links:
       Selects all links and inner-joins them to unique_nodes on both ends
     main section:
       Adds both nodes with from node as root
     Obtains the 425 isolated links in 1.5 seconds!
     Let’s look at the execution plan…
  • 52. SQL for Isolated Links: SQL 4 - Group Counting - Execution Plan
    Execution Plan (Extract)
      | Id  | Operation                                | Name                       | Starts | E-Rows | A-Rows | A-Time      |
      |-----|------------------------------------------|----------------------------|--------|--------|--------|-------------|
      |   0 | INSERT STATEMENT                         |                            |      1 |        |      0 | 00:00:01.83 |
      |   1 | TEMP TABLE TRANSFORMATION                |                            |      1 |        |      0 | 00:00:01.83 |
      |   2 | LOAD AS SELECT (CURSOR DURATION MEMORY)  | SYS_TEMP_0FD9D6C20_4F65443 |      1 |        |      0 | 00:00:01.82 |
      |*  3 | FILTER                                   |                            |      1 |        |   1797 | 00:00:01.91 |
      |   4 | HASH GROUP BY                            |                            |      1 |     26 |   132K | 00:00:01.82 |
      |   5 | VIEW                                     |                            |      1 |    16M |    16M | 00:00:00.36 |
      |   6 | UNION-ALL                                |                            |      1 |        |    16M | 00:00:00.34 |
      |   7 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.16 |
      |   8 | TABLE ACCESS FULL                        | LINKS                      |      1 |  8095K |  8095K | 00:00:00.15 |
      |   9 | LOAD AS SELECT (CURSOR DURATION MEMORY)  | SYS_TEMP_0FD9D6C21_4F65443 |      1 |        |      0 | 00:00:00.01 |
      |* 10 | HASH JOIN                                |                            |      1 |      1 |    425 | 00:00:00.01 |
      |* 11 | VIEW                                     |                            |      1 |     26 |    901 | 00:00:00.01 |
      |  12 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C20_4F65443 |      1 |     26 |   1797 | 00:00:00.01 |
      |  13 | NESTED LOOPS                             |                            |      1 |   1685 |    896 | 00:00:00.01 |
      |  14 | NESTED LOOPS                             |                            |      1 |   1690 |    896 | 00:00:00.01 |
      |* 15 | VIEW                                     |                            |      1 |     26 |    896 | 00:00:00.01 |
      |  16 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C20_4F65443 |      1 |     26 |   1797 | 00:00:00.01 |
      |* 17 | INDEX RANGE SCAN                         | LINKS_TO_N1                |    896 |     65 |    896 | 00:00:00.01 |
      |  18 | TABLE ACCESS BY INDEX ROWID              | LINKS                      |    896 |     65 |    896 | 00:00:00.01 |
      |  19 | LOAD TABLE CONVENTIONAL                  | NODE_ROOTS                 |      1 |        |      0 | 00:00:00.01 |
      |  20 | UNION-ALL                                |                            |      1 |        |    850 | 00:00:00.01 |
      |  21 | VIEW                                     |                            |      1 |      1 |    425 | 00:00:00.01 |
      |  22 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C21_4F65443 |      1 |      1 |    425 | 00:00:00.01 |
      |  23 | VIEW                                     |                            |      1 |      1 |    425 | 00:00:00.01 |
      |  24 | TABLE ACCESS FULL                        | SYS_TEMP_0FD9D6C21_4F65443 |      1 |      1 |    425 | 00:00:00.01 |
     The plan shows two LOAD AS SELECTs
     The first does a HASH GROUP BY, S4, on a UNION ALL of full scans on links; most of the time goes here
     The filter step, S3, shows only 1,797 rows, making the rest of the query very fast – early pruning!
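The group counting query of slide 51 redefines an isolated link rather than rewriting the NOT EXISTS mechanically, so it is worth checking that the two definitions agree. A minimal sketch on a toy link set follows, again using SQLite through Python as a stand-in for Oracle (an assumption of this example; only the query logic is being compared, not the plans). The data is invented and contains no self-loops.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (node_id_fr INTEGER NOT NULL, node_id_to INTEGER NOT NULL);
    -- Toy data (hypothetical): a chain 1-2-3, a two-way pair 6/7,
    -- and two isolated links (4,5) and (8,9).
    INSERT INTO links VALUES (1, 2), (2, 3), (4, 5), (6, 7), (7, 6), (8, 9);
""")

# SQL 2 logic: one NOT EXISTS per condition, excluding the driving row.
not_exists_sql = """
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
     WHERE NOT EXISTS (SELECT 1 FROM links l1
                        WHERE l1.node_id_fr = lnk.node_id_fr AND l1.rowid != lnk.rowid)
       AND NOT EXISTS (SELECT 1 FROM links l2
                        WHERE l2.node_id_to = lnk.node_id_to AND l2.rowid != lnk.rowid)
       AND NOT EXISTS (SELECT 1 FROM links l3
                        WHERE l3.node_id_fr = lnk.node_id_to AND l3.rowid != lnk.rowid)
       AND NOT EXISTS (SELECT 1 FROM links l4
                        WHERE l4.node_id_to = lnk.node_id_fr AND l4.rowid != lnk.rowid)
     ORDER BY 1, 2"""

# SQL 4 logic: both end nodes appear exactly once across all link ends.
group_count_sql = """
    WITH all_nodes AS (
        SELECT node_id_fr AS node_id, 'F' AS tp FROM links
         UNION ALL
        SELECT node_id_to, 'T' FROM links
    ), unique_nodes AS (
        SELECT node_id, MAX(tp) AS tp
          FROM all_nodes
         GROUP BY node_id
        HAVING COUNT(*) = 1
    )
    SELECT lnk.node_id_fr, lnk.node_id_to
      FROM links lnk
      JOIN unique_nodes frn ON frn.node_id = lnk.node_id_fr AND frn.tp = 'F'
      JOIN unique_nodes ton ON ton.node_id = lnk.node_id_to AND ton.tp = 'T'
     ORDER BY 1, 2"""

by_not_exists = conn.execute(not_exists_sql).fetchall()
by_group_count = conn.execute(group_count_sql).fetchall()
print(by_not_exists, by_group_count)
```

One caveat that falls out of the logic: a self-loop link (same from and to node) would count its node twice in all_nodes, so the group counting form would not report it as isolated even when no other link touches it. The toy data avoids that case.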
  • 53. Tuning 3 - SQL for Root Node Selector (4 slides)
    Code timing several methods for root node selection
  • 54. SQL for Root Node Selector: Method 0 - Select from Unused Nodes (Unordered)
      SELECT id
        INTO l_root_id
        FROM nodes
       WHERE id NOT IN (SELECT node_id FROM node_roots)
         AND ROWNUM = 1
     Code timing showed root node selection took 90% of the time on the Bacon/only_tv_v dataset (744,374 nodes and 22,503,060 links)
     Execution plan shows a nested loops antijoin from the nodes index to the root nodes index
     We’ll try two variants with different queries and ordering added, then try a different approach
    Execution Plan (Extract)
      | Id | Operation             | Name          | Starts | E-Rows | A-Rows | A-Time      | Buffers |
      |----|-----------------------|---------------|--------|--------|--------|-------------|---------|
      |  0 | SELECT STATEMENT      |               |      1 |        |      0 | 00:00:00.37 |   42508 |
      |* 1 | COUNT STOPKEY         |               |      1 |        |      0 | 00:00:00.37 |   42508 |
      |* 2 | FILTER                |               |      1 |        |      0 | 00:00:00.37 |   42508 |
      |  3 | NESTED LOOPS ANTI SNA |               |      1 |     20 |      0 | 00:00:00.36 |   40791 |
      |  4 | INDEX FAST FULL SCAN  | SYS_C0018310  |      1 |    520 |   744K | 00:00:00.09 |    1460 |
      |* 5 | INDEX UNIQUE SCAN     | NODE_ROOTS_N1 |   744K |   714K |   744K | 00:00:00.23 |   39331 |
      |* 6 | TABLE ACCESS FULL     | NODE_ROOTS    |      1 |      1 |      0 | 00:00:00.01 |    1717 |
    Elapsed Times
      Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
      ------------------  -------  ------  ----------------------  ------  ---------
                     303       41      58                     221      42        524
     The base method, with no ordering, took 303 seconds
  • 55. SQL for Root Node Selector: Method 1 - Select from Unused Nodes (Minimum Id)
    SQL
      SELECT Min(id)
        INTO l_root_id
        FROM nodes
       WHERE id NOT IN (SELECT node_id FROM node_roots)
     The first ordering query takes a Min(id) from nodes not in the solution table
    Execution Plan
      | Id | Operation               | Name         | Starts | E-Rows | A-Rows | A-Time      | Buffers | OMem | 1Mem  | Used-Mem |
      |----|-------------------------|--------------|--------|--------|--------|-------------|---------|------|-------|----------|
      |  0 | SELECT STATEMENT        |              |      1 |        |      1 | 00:00:00.50 |    3178 |      |       |          |
      |  1 | SORT AGGREGATE          |              |      1 |      1 |      1 | 00:00:00.50 |    3178 |      |       |          |
      |* 2 | HASH JOIN RIGHT ANTI NA |              |      1 |  28678 |      0 | 00:00:00.50 |    3178 |  37M | 6400K | 30M (0)  |
      |  3 | TABLE ACCESS FULL       | NODE_ROOTS   |      1 |   715K |   744K | 00:00:00.06 |    1717 |      |       |          |
      |  4 | INDEX FAST FULL SCAN    | SYS_C0018310 |      1 |   744K |   744K | 00:00:00.08 |    1461 |      |       |          |
      Predicate Information (identified by operation id):
        2 - access("ID"="NODE_ID")
    Elapsed Times
      Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
      ------------------  -------  ------  ----------------------  ------  ---------
                   2,046      275      92                     208       8      2,233
     The first ordering method took 2,046 seconds
     This is nearly 7 times slower than the base, unordered method
  • 56. SQL for Root Node Selector: Method 2 - Select from Unused Nodes (Ordered by Id, ROWNUM = 1)
    SQL
      SELECT id
        INTO l_root_id
        FROM (SELECT id
                FROM nodes
               WHERE id NOT IN (SELECT node_id FROM node_roots)
               ORDER BY 1)
       WHERE ROWNUM = 1
     The second ordering query uses a ROWNUM = 1 on an ordered subquery from nodes not in the solution table
    Execution Plan (Steps)
      | Id | Operation             | Name          | Starts | E-Rows | A-Rows | A-Time      | Buffers |
      |----|-----------------------|---------------|--------|--------|--------|-------------|---------|
      |  0 | SELECT STATEMENT      |               |      1 |        |      0 | 00:00:00.42 |   26499 |
      |* 1 | COUNT STOPKEY         |               |      1 |        |      0 | 00:00:00.42 |   26499 |
      |  2 | VIEW                  |               |      1 |      1 |      0 | 00:00:00.42 |   26499 |
      |* 3 | FILTER                |               |      1 |        |      0 | 00:00:00.42 |   26499 |
      |  4 | NESTED LOOPS ANTI SNA |               |      1 |     20 |      0 | 00:00:00.41 |   24782 |
      |  5 | INDEX FULL SCAN       | SYS_C0018310  |      1 |   744K |   744K | 00:00:00.10 |    1398 |
      |* 6 | INDEX UNIQUE SCAN     | NODE_ROOTS_N1 |   744K |   688K |   744K | 00:00:00.26 |   23384 |
      |* 7 | TABLE ACCESS FULL     | NODE_ROOTS    |      1 |      1 |      0 | 00:00:00.01 |    1717 |
    Execution Plan (Predicates)
      Predicate Information (identified by operation id):
        1 - filter(ROWNUM=1)
        3 - filter( IS NULL)
        6 - access("ID"="NODE_ID")
        7 - filter("NODE_ID" IS NULL)
    Elapsed Times
      Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
      ------------------  -------  ------  ----------------------  ------  ---------
                     289       39      60                     193      40        482
     The second ordering method took 289 seconds
     This is 7 times faster than the first ordering method and slightly faster than the base, unordered method
  • 57. SQL for Root Node Selector: Method 3 - Fetch from Cursor (Ordered by Id), Check Unused
    Cursor SQL
      CURSOR c_roots IS
        SELECT id FROM nodes ORDER BY 1;
      OPEN c_roots;
      FETCH c_roots INTO l_root_id
     The ordering query is opened once as a cursor, and fetched for each new subnetwork
     An existence check is made against the node_roots table; if the node is present we skip to the next fetch
    Cursor Execution Plan
      | Id | Operation        | Name         | Starts | E-Rows | A-Rows | A-Time      | Buffers |
      |----|------------------|--------------|--------|--------|--------|-------------|---------|
      |  0 | SELECT STATEMENT |              |      1 |        |      1 | 00:00:00.01 |       3 |
      |  1 | INDEX FULL SCAN  | SYS_C0018310 |      1 |   744K |      1 | 00:00:00.01 |       3 |
    Existence Check SQL
      SELECT 1
        INTO l_dummy
        FROM node_roots
       WHERE node_id = l_root_id
    Existence Check Execution Plan
      | Id | Operation         | Name          | Starts | E-Rows | A-Rows | A-Time      | Buffers |
      |----|-------------------|---------------|--------|--------|--------|-------------|---------|
      |  0 | SELECT STATEMENT  |               |      1 |        |      1 | 00:00:00.01 |       3 |
      |* 1 | INDEX UNIQUE SCAN | NODE_ROOTS_N1 |      1 |      1 |      1 | 00:00:00.01 |       3 |
      Predicate Information (identified by operation id):
        1 - access("NODE_ID"=:B1)
    Elapsed Times
      Root Selection (s)  ms/Call  %Total  Non-Root Selection (s)  %Total  Total (s)
      ------------------  -------  ------  ----------------------  ------  ---------
                      67       11      27                     185      73        252
     We get the root selection time by adding up multiple code timing lines for cursor and check SQL
     This third ordering method took 67 seconds
     > 4 times faster than next best
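Why the cursor approach of slide 57 does less work can be sketched without a database. Methods 0 to 2 start a fresh query over the nodes for every new root, whereas Method 3 walks the ordered ids once and skips those already assigned. The sketch below is a hypothetical Python model of the two loops, not the author's PL/SQL: the function names, the subnetwork_of mapping and the six-node dataset are all assumptions for illustration.

```python
def roots_by_rescan(node_ids, subnetwork_of):
    """Re-scan the ordered nodes from the start for each new root (Methods 0-2 in spirit)."""
    ordered = sorted(node_ids)
    used, roots, scanned = set(), [], 0
    while True:
        root = None
        for nid in ordered:
            scanned += 1
            if nid not in used:
                root = nid
                break
        if root is None:  # final scan finds nothing unused
            return roots, scanned
        roots.append(root)
        used |= subnetwork_of[root]  # stands in for the subnetwork insert


def roots_by_cursor(node_ids, subnetwork_of):
    """Walk the ordered nodes once, skipping nodes already assigned (Method 3 in spirit)."""
    used, roots, fetched = set(), [], 0
    for nid in sorted(node_ids):  # the cursor is opened once
        fetched += 1
        if nid in used:  # existence check: found, skip to the next fetch
            continue
        roots.append(nid)
        used |= subnetwork_of[nid]
    return roots, fetched


# Toy data (hypothetical): three subnetworks {1,2,3}, {4,5} and {6}.
groups = [{1, 2, 3}, {4, 5}, {6}]
subnetwork_of = {nid: group for group in groups for nid in group}
node_ids = list(subnetwork_of)

rescan_roots, scanned = roots_by_rescan(node_ids, subnetwork_of)
cursor_roots, fetched = roots_by_cursor(node_ids, subnetwork_of)
print(rescan_roots, scanned, cursor_roots, fetched)
```

Both loops pick the same roots, but the rescans visit more rows in total, and the gap widens with the number of subnetworks. The single pass corresponds to the FETCH c_roots and SELECT 1 INTO l_dummy timer lines in the code timing output.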
  • 58. Tuning Results (2 slides)
    Code timing results for one dataset and before and after results for Subnetwork Grouper for all
  • 59. Code Timing - Ins_Node_Roots - Results on Bacon/only_tv_v after Tuning
    Code Timing Output
      Timer Set: Ins_Node_Roots, Constructed at 30 Jul 2022 17:38:25, written at 17:42:37
      ===================================================================================
      Timer                                                 Elapsed         CPU       Calls       Ela/Call       CPU/Call
      -------------------------------------------------  ----------  ----------  ----------  -------------  -------------
      Insert isolated nodes 3: 8659                            1.24        1.22           1        1.23700        1.22000
      Insert isolated links 5: 7078                            5.59        5.30           1        5.59100        5.30000
      OPEN c_roots                                             0.19        0.20           1        0.19000        0.20000
      Count nodes                                              0.01        0.00           1        0.01400        0.00000
      FETCH c_roots (first)                                    0.00        0.00           1        0.00000        0.00000
      SELECT 1 INTO l_dummy: Not found                         0.70        0.84        7443        0.00009        0.00011
      Insert min_tree_links (root node 1, size: 680060)      142.60      137.63           1      142.59600      137.63000
      Insert node_roots (root node 1, size: 680060)            3.68        3.64           1        3.68100        3.64000
      FETCH c_roots (remaining)                               28.60       26.67      664224        0.00004        0.00004
      SELECT 1 INTO l_dummy: Found                            37.05       37.41      656782        0.00006        0.00006
      Insert min_tree_links (3 nodes)                          7.44        6.82        2091        0.00356        0.00326
      Insert node_roots (3 nodes)                              0.53        0.37        2091        0.00025        0.00018
      Insert min_tree_links (4-39 nodes)                      21.90       20.15        5317        0.00412        0.00379
      Insert node_roots (4-39 nodes)                           1.67        1.39        5317        0.00031        0.00026
      Insert min_tree_links (root node 332, size: 52)          0.01        0.00           1        0.00900        0.00000
      Insert node_roots (root node 332, size: 52)              0.00        0.00           1        0.00100        0.00000
      ...
      (Other)                                                  0.00        0.00           1        0.00100        0.00000
      -------------------------------------------------  ----------  ----------  ----------  -------------  -------------
      Total                                                  251.67      241.98     1343341        0.00019        0.00018
      -------------------------------------------------  ----------  ----------  ----------  -------------  -------------
      [Timer timed (per call in ms): Elapsed: 0.00935, CPU: 0.00935]
     The total time has come down from 1,714 seconds to 252 seconds, a reduction factor of 7
     The largest contribution is now from the timer Insert min_tree_links (root node 1, size: 680060)
     The results show the additional pre-insert steps, taking 1 and 6 seconds
     We also see the new cursor fetch step, and existence query, taking 29 and 37 seconds
  • 60. Ins_Node_Roots - Performance - Results before/after Tuning

    Dataset             #Nodes       #Links  #Subnetworks  #Maxlev  Base Ela(s)  Tuned Ela(s)
    ---------------  ---------  -----------  ------------  -------  -----------  ------------
    three_subnets           14           13             3        3         0.07           0.5
    foreign_keys           289          319            43        5          0.2           0.6
    brightkite          58,228      214,078           547       10            7             7
    bacon/small            161        3,342             1        5          0.1           0.5
    bacon/top250        12,466      583,993            15        6          1.9           4.2
    bacon/pre1950      134,131    8,095,294         2,432       13           85            61
    bacon/only_tv_v    744,374   22,503,060        12,198       11        1,714           252
    bacon/no_tv_v    2,386,567   87,866,033        55,276       10       16,108         2,081
    bacon/post1950   2,696,175  101,597,227        60,544       10       19,736         2,930
    bacon/full       2,800,309  109,262,592        62,557       10       20,631         3,756

    - The tuned procedure is between 5.5 and 7.7 times faster on the four largest datasets
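The speed-up range quoted on slide 60 can be reproduced from the table with a few lines of arithmetic. The sketch below is only a check on the published figures: the data are copied from the slide, and the structure and names are illustrative assumptions.

```python
# (dataset, #nodes, base elapsed seconds, tuned elapsed seconds), from slide 60.
results = [
    ("three_subnets", 14, 0.07, 0.5),
    ("foreign_keys", 289, 0.2, 0.6),
    ("brightkite", 58_228, 7, 7),
    ("bacon/small", 161, 0.1, 0.5),
    ("bacon/top250", 12_466, 1.9, 4.2),
    ("bacon/pre1950", 134_131, 85, 61),
    ("bacon/only_tv_v", 744_374, 1_714, 252),
    ("bacon/no_tv_v", 2_386_567, 16_108, 2_081),
    ("bacon/post1950", 2_696_175, 19_736, 2_930),
    ("bacon/full", 2_800_309, 20_631, 3_756),
]

# Speed-up = base time / tuned time; a value below 1 means the tuned run was slower.
speedups = {name: base / tuned for name, _nodes, base, tuned in results}

# Select the four largest datasets by node count.
four_largest = sorted(results, key=lambda row: row[1], reverse=True)[:4]
largest_speedups = [speedups[name] for name, *_ in four_largest]

for name, *_ in results:
    print(f"{name:<16} {speedups[name]:6.2f}x")
print(f"Four largest: {min(largest_speedups):.1f}x to {max(largest_speedups):.1f}x")
```

On the table's figures, the ratio falls below 1 for the smallest datasets, so the gain from tuning shows up only as the networks grow.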
  • 61. Conclusion
    SQL
    - Be aware of the built-in SQL algorithms at different levels
    - Understand the use of subquery sequencing in logical query design
    - Understand how queries can be transformed, and how performance may be affected
      - By the CBO, and by manual rewriting
      - Including logical or physical splitting of complex queries
    - Understand the use of hints to affect the choice of algorithms the CBO makes
    - Use execution plans to analyse SQL performance
    PL/SQL
    - Use PL/SQL algorithms when there isn’t an appropriate SQL built-in equivalent
      - But use SQL as fully as possible within these algorithms, in particular to process data in sets
    - Be familiar with the Oracle standard profilers, and the possibilities offered by custom code timing
    For more detail
    - See my blog and GitHub project…
  • 62. References
    1. Algorithm, Computer Hope, March 2021
    2. Declarative Language, Britannica.com, Undated
    3. SQL Tuning Guide, 21c
    4. Shortest Path Analysis of Large Networks by SQL and PL/SQL: Blog, Brendan Furey, August 2022
    5. SQL and PL/SQL for Shortest Path Problems: GitHub, Brendan Furey, August 2022
    6. Timer_Set - Oracle PL/SQL code timing module: GitHub, Brendan Furey, January 2019
    7. Friendship network of Brightkite users, Jure Leskovec, Stanford University, Undated
    8. Bacon Numbers Datasets, Oberlin College, December 2016
    9. SQL for Shortest Path Problems, Brendan Furey, April 2015
    10. SQL for Shortest Path Problems 2: A Branch and Bound Approach, Brendan Furey, May 2015
    11. PL/SQL Pipelined Function for Network Analysis, Brendan Furey, May 2015
    12. PL/SQL Profiling 1: Overview, Brendan Furey, June 2020

Editor's notes

  1. http://aprogrammerwrites.eu/
  2. https://en.m.wikipedia.org/wiki/Modularity http://www.ocrcomputing.org.uk/f452/solution_design/modular.html