Different algorithms to implement joins are:
Nested-loop join
Block nested-loop join
Indexed nested-loop join
Merge join
Hash join
The choice is based on a cost estimate.
Example
Number of records: customer 10,000, depositor 5,000
Number of blocks: customer 40, depositor 100
1. Nested-loop join
A nested-loop join is a naive algorithm that
joins two relations by using two nested loops. Join
operations are important to database management.
Algorithm:
Two relations R and S are joined as follows:
For each tuple r in R do
    For each tuple s in S do
        If r and s satisfy the join condition
        Then output the tuple <r,s>
In the worst case, this algorithm involves nr*bs + br block transfers and
nr + br seeks, where br and bs are the numbers of blocks in
relations R and S respectively, and nr is the number of tuples
in relation R.
The algorithm runs in O(|R|*|S|) I/Os, where |R| and |S| are the numbers of
tuples contained in R and S respectively. It can easily be
generalized to join any number of relations.
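The two nested loops and the worst-case cost formula can be sketched in Python. This is a minimal illustration that assumes relations are in-memory lists of tuples; the function names and the sample tuples are hypothetical and not taken from the slides.

```python
def nested_loop_join(R, S, condition):
    """Return every pair (r, s) that satisfies the join condition."""
    result = []
    for r in R:              # outer relation
        for s in S:          # inner relation, scanned once per outer tuple
            if condition(r, s):
                result.append((r, s))
    return result


def nested_loop_cost(n_r, b_r, b_s):
    """Worst-case (block transfers, seeks) with R as the outer relation."""
    return n_r * b_s + b_r, n_r + b_r


# Hypothetical sample data: (name, customer_id) and (customer_id, account).
customer = [("Ann", 1), ("Bob", 2)]
depositor = [(1, "A-101"), (2, "A-215"), (2, "A-305")]
pairs = nested_loop_join(customer, depositor, lambda r, s: r[1] == s[0])

# Figures from the example slide, with depositor as the outer relation:
# n_r = 5000 tuples, b_r = 100 blocks, b_s = 40 blocks.
transfers, seeks = nested_loop_cost(5000, 100, 40)
```

Swapping the arguments so that customer is the outer relation shows why the choice of outer relation matters for the estimate.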
The block nested-loop join algorithm is a generalization of
the simple nested-loop algorithm that takes advantage of
additional memory to reduce the number of times that the
inner relation S is scanned.
2. Block nested-loop join
Algorithm:
for each block BR of R do begin
    for each block BS of S do begin
        for each tuple tR in BR do
            for each tuple tS in BS do
                check whether the pair (tR, tS)
                satisfies the join condition;
                if it does, add tR tS to the result
end end end end
Like the nested-loop join, it requires no indexes and can be used with any kind of join
condition.
Worst case: the db buffer can hold only one block of each relation
=> br*bs + br disk accesses.
Best case: both relations fit into the db buffer
=> br + bs disk accesses.
If the smaller relation fits completely into the db buffer, use it as the inner
relation. This reduces the cost estimate to br + bs disk accesses.
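The block-wise loop structure can be sketched as follows. This is only an illustration under the assumption that a relation is a Python list of blocks and each block is a list of tuples; `to_blocks` and the other names are hypothetical helpers.

```python
def to_blocks(tuples, block_size):
    """Split a list of tuples into blocks of at most block_size tuples."""
    return [tuples[i:i + block_size] for i in range(0, len(tuples), block_size)]


def block_nested_loop_join(R_blocks, S_blocks, condition):
    """Join block by block: each S block is read once per R block."""
    result = []
    for block_r in R_blocks:
        for block_s in S_blocks:
            for t_r in block_r:
                for t_s in block_s:
                    if condition(t_r, t_s):
                        result.append(t_r + t_s)   # concatenate the tuples
    return result


def block_nested_loop_cost(b_r, b_s):
    """Disk accesses in the worst and best case from the slide."""
    return {"worst": b_r * b_s + b_r, "best": b_r + b_s}


# Hypothetical sample data, two tuples per block.
R = to_blocks([(1, "a"), (2, "b"), (3, "c")], 2)
S = to_blocks([(2, "x"), (3, "y"), (3, "z")], 2)
joined = block_nested_loop_join(R, S, lambda t_r, t_s: t_r[0] == t_s[0])

# Block counts from the example slide, customer (40 blocks) as outer.
cost = block_nested_loop_cost(40, 100)
```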
3. Index nested-loop join
• If an index is available on the inner relation's join attribute and the join is
an equi-join or natural join, more efficient index lookups can
replace scans.
• It is even possible (and reasonable) to construct an index just to compute
a join.
• For each tuple tr in the outer relation R, use the index to look up
tuples in S that satisfy the join condition with tr.
• Worst case: the db buffer has space for only one page of R and one
page of the index associated with S:
• br disk accesses to read R, and for each tuple in R, perform an
index lookup on S.
• Cost of the join: br + nr*c, where c is the cost of a single
selection on S using the join condition.
• If indexes are available on both R and S, use the one with the
fewer tuples as the outer relation.
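A short Python sketch of the idea follows. It assumes an in-memory dictionary as a stand-in for the index on S (a real system would typically use a B+-tree or hash index on disk), and the value of c used in the cost example is a made-up figure for illustration only.

```python
from collections import defaultdict


def build_index(S, key):
    """Build an index on S's join attribute (a dict stands in for a disk index)."""
    index = defaultdict(list)
    for s in S:
        index[key(s)].append(s)
    return index


def index_nested_loop_join(R, S_index, r_key):
    """For each outer tuple, look up matching inner tuples through the index."""
    result = []
    for r in R:
        for s in S_index.get(r_key(r), []):   # index lookup replaces a scan of S
            result.append((r, s))
    return result


def index_nested_loop_cost(b_r, n_r, c):
    """Cost formula from the slide: br + nr * c."""
    return b_r + n_r * c


# Hypothetical sample data: (customer_id, account) and (customer_id, name).
depositor = [(1, "A-101"), (2, "A-215"), (4, "A-999")]
customer = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
idx = build_index(customer, key=lambda s: s[0])
matches = index_nested_loop_join(depositor, idx, r_key=lambda r: r[0])

# depositor as outer relation (100 blocks, 5000 tuples); c = 4 is an assumed lookup cost.
estimate = index_nested_loop_cost(100, 5000, 4)
```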
4. Merge join
Basic idea: first sort both relations on the join attribute
(if not already sorted this way).
• Join steps are similar to the merge stage in the external
sort-merge algorithm (discussed later).
• Every pair with the same value on the join attribute must
be matched.
Values of join attributes:
Relation R: 1 2 2 3 4 5
Relation S: 1 2 3 3 5
If there are no repeated join attribute values, each tuple needs to be
read only once. As a result, each block is read only once.
Thus, the number of block accesses is br + bs (plus the cost
of sorting, if the relations are unsorted).
Worst case: all join attribute values are the same. Then the
number of block accesses is br*bs + br.
If one relation is sorted and the other has a secondary B+-tree
index on the join attribute, a hybrid merge join is
possible.
The sorted relation is merged with the leaf node entries of
the B+-tree. The result is sorted on the addresses (rids) of the
unsorted relation's tuples, and then the addresses can be
replaced by the actual tuples efficiently.
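The merge step, including the handling of repeated join values, can be sketched in Python. The sketch assumes both inputs are in-memory lists that are already sorted on the join attribute; the function and parameter names are illustrative, and the sample values are the ones listed on the merge join slide.

```python
def merge_join(R, S, key_r, key_s):
    """Join two relations that are already sorted on the join attribute."""
    result = []
    i = j = 0
    while i < len(R) and j < len(S):
        k_r, k_s = key_r(R[i]), key_s(S[j])
        if k_r < k_s:
            i += 1
        elif k_r > k_s:
            j += 1
        else:
            # Find the run of S tuples that share this join value.
            j_end = j
            while j_end < len(S) and key_s(S[j_end]) == k_r:
                j_end += 1
            # Pair every R tuple carrying this value with the whole S run.
            while i < len(R) and key_r(R[i]) == k_r:
                for s in S[j:j_end]:
                    result.append((R[i], s))
                i += 1
            j = j_end
    return result


# Join attribute values from the slide.
R = [1, 2, 2, 3, 4, 5]
S = [1, 2, 3, 3, 5]
identity = lambda t: t
pairs = merge_join(R, S, identity, identity)
```

The value 4 appears only in R and produces no output, while the repeated 2 in R and the repeated 3 in S each produce two pairs.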
5. Hash join
Only applicable in the case of an equi-join or natural join.
A hash function is used to partition the tuples of both
relations into sets that have the same hash value on
the join attribute.
Partitioning phase: 2 * (br + bs) block accesses (each relation is read once and written back once)
Matching phase: br + bs block accesses
(under the assumption that one partition of each
relation fits into the database buffer)
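A minimal Python sketch of the two phases follows. It assumes in-memory lists in place of disk partitions and uses Python's built-in `hash` as the partitioning function; the names and the partition count are illustrative assumptions. The cost helper reads the partitioning phase as 2 * (br + bs), that is, one read and one write of each relation.

```python
def partition(relation, key, n_partitions):
    """Partitioning phase: route each tuple by the hash of its join attribute."""
    parts = [[] for _ in range(n_partitions)]
    for t in relation:
        parts[hash(key(t)) % n_partitions].append(t)
    return parts


def hash_join(R, S, key_r, key_s, n_partitions=4):
    """Equi-join: only tuples in matching partitions Ri and Si can join."""
    result = []
    r_parts = partition(R, key_r, n_partitions)
    s_parts = partition(S, key_s, n_partitions)
    for r_part, s_part in zip(r_parts, s_parts):
        table = {}                                   # build an in-memory hash index on Si
        for s in s_part:
            table.setdefault(key_s(s), []).append(s)
        for r in r_part:                             # probe with the tuples of Ri
            for s in table.get(key_r(r), []):
                result.append((r, s))
    return result


def hash_join_cost(b_r, b_s):
    """Partitioning 2 * (br + bs) plus matching (br + bs) block accesses."""
    return 2 * (b_r + b_s) + (b_r + b_s)


# Hypothetical sample data.
customer = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
depositor = [(2, "A-215"), (3, "A-305"), (3, "A-306"), (9, "A-900")]
joined = sorted(hash_join(customer, depositor, lambda r: r[0], lambda s: s[0]))

# Block counts from the example slide.
total = hash_join_cost(40, 100)
```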
OTHER OPERATIONS
Duplicate elimination
Sorting: sort the relation, then remove all but one copy of the tuples
having identical value(s) on the projection
attribute(s).
Hashing: partition the relation using a hash function
on the projection attribute(s); then read each partition
into the buffer and create an in-memory hash index;
a tuple is inserted into the index only if it is not already
present.
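Both approaches to duplicate elimination can be sketched in a few lines of Python. The sketch assumes the relation is an in-memory list of hashable, comparable tuples; a Python `set` stands in for the in-memory hash index, and the names are illustrative.

```python
def distinct_by_sorting(relation):
    """Sort, then keep only the first tuple of each run of equal tuples."""
    result = []
    for t in sorted(relation):
        if not result or result[-1] != t:
            result.append(t)
    return result


def distinct_by_hashing(relation, n_partitions=4):
    """Partition by hash, then deduplicate each partition with an in-memory index."""
    parts = [[] for _ in range(n_partitions)]
    for t in relation:
        parts[hash(t) % n_partitions].append(t)
    result = []
    for part in parts:
        seen = set()              # in-memory hash index for this partition
        for t in part:
            if t not in seen:     # insert only if not already present
                seen.add(t)
                result.append(t)
    return result


# Hypothetical sample data with duplicates.
rows = [("Ann",), ("Bob",), ("Ann",), ("Cy",), ("Bob",)]
by_sort = distinct_by_sorting(rows)
by_hash = sorted(distinct_by_hashing(rows))
```

Equal tuples always hash to the same partition, so deduplicating each partition on its own is enough.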
Set operations:
Sorting or hashing
Hashing: partition both relations using the same
hash function; use an in-memory index for the partitions Ri
R ∪ S: if tuple in Ri or in Si, add tuple to result
R ∩ S: if tuple in Ri and in Si, . . .
R − S: if tuple in Ri and not in Si, . . .
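The hashing variant can be sketched in Python as below. It assumes in-memory lists of hashable tuples, uses Python sets as the in-memory index for each partition, and assumes that each operation adds the qualifying tuples to the result; all names are illustrative.

```python
def partition(relation, n_partitions):
    """Route each tuple to a partition by its hash value."""
    parts = [[] for _ in range(n_partitions)]
    for t in relation:
        parts[hash(t) % n_partitions].append(t)
    return parts


def set_operation(R, S, op, n_partitions=4):
    """Evaluate union, intersection or difference partition by partition."""
    result = []
    for r_part, s_part in zip(partition(R, n_partitions), partition(S, n_partitions)):
        r_index = set(r_part)     # in-memory index on partition Ri
        s_index = set(s_part)     # tuples of the matching partition Si
        if op == "union":
            result.extend(r_index | s_index)
        elif op == "intersection":
            result.extend(r_index & s_index)
        elif op == "difference":
            result.extend(r_index - s_index)
        else:
            raise ValueError(f"unknown set operation: {op}")
    return result


# Hypothetical sample data.
R = [(1,), (2,), (3,)]
S = [(2,), (3,), (4,)]
union = sorted(set_operation(R, S, "union"))
intersection = sorted(set_operation(R, S, "intersection"))
difference = sorted(set_operation(R, S, "difference"))
```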
Grouping and aggregation:
Compute groups via sorting or hashing.
Hashing: while the groups (partitions) are built,
compute partial aggregate values (for grouping
attribute A, V(A,R) tuples are needed to store the values, where V(A,R) is the number of distinct values of A in R).
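Hash-based aggregation with running partial values can be sketched as follows. The sketch assumes a SUM aggregate over an in-memory list of tuples, with a dictionary holding one partial value per group; the names and sample data are hypothetical.

```python
def hash_aggregate_sum(relation, group_key, value):
    """Keep one running total per group while scanning the relation once."""
    totals = {}
    for t in relation:
        k = group_key(t)
        totals[k] = totals.get(k, 0) + value(t)   # update the partial aggregate
    return totals


# Hypothetical sample data: (branch, balance).
account = [("Downtown", 500), ("Perryridge", 400), ("Downtown", 700), ("Redwood", 750)]
totals = hash_aggregate_sum(account, group_key=lambda t: t[0], value=lambda t: t[1])
```

The dictionary ends up with one entry per distinct value of the grouping attribute, which is the storage requirement the slide states.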
Evaluation of Expressions
Alternatives for evaluating an entire expression tree
are:
Materialization: generate the results of an expression
whose inputs are relations or are already computed,
materialize (store) them on disk, and repeat.
Pipelining: pass on tuples to parent operations even
as an operation is being executed.
Evaluation of Expressions
Strategy 1: materialization. Evaluate one operation
at a time, starting at the lowest level. Use
intermediate results materialized in temporary
relations to evaluate the next-level operation(s).
Example: in the expression tree below, compute and store
σ balance<2500 (account), then compute and store its join with customer, and finally
compute the projection on customer name.
[Figure: expression tree] Π customer name ( σ balance<2500 (account) ⋈ customer )
• Materialized evaluation is always applicable.
The cost of writing results to disk and reading them
back can be quite high.
Overall cost = sum of the costs of the individual
operations + cost of writing intermediate results to
disk.
Double buffering: use two output buffers for each
operation; when one is full, write it to disk while
the other is being filled. This reduces execution time.
Strategy 2: pipelining. Evaluate several
operations simultaneously, and pass the result
(tuple- or block-wise) on to the next operation.
In the example above, once a tuple from
account satisfying the selection condition has been
found, pass it on to the join. Similarly, don't store the
result of the (final) join, but pass tuples directly to the
projection.
Much cheaper than materialization, because
temporary relations are not generated and stored
on disk.
Pipelining is not always possible, e.g., for all
operations that include sorting (a blocking
operation).
Pipelining can be executed in either a demand-driven
or a producer-driven fashion.
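Demand-driven pipelining maps naturally onto Python generators: each operator pulls tuples from its child only when its parent asks for the next one. The sketch below is an illustration under assumed, hypothetical schemas (account and customer both carry a `customer_name` attribute) and is not the slides' own implementation. The join builds an in-memory hash table on its right input first, so only its left input is truly pipelined.

```python
def select(rows, predicate):
    """Selection: yield qualifying rows one at a time."""
    for row in rows:
        if predicate(row):
            yield row


def join(left_rows, right_rows, key):
    """Hash join: the right input is consumed up front, the left is streamed."""
    index = {}
    for r in right_rows:
        index.setdefault(r[key], []).append(r)
    for l in left_rows:
        for r in index.get(l[key], []):
            yield {**l, **r}


def project(rows, attrs):
    """Projection: keep only the requested attributes of each row."""
    for row in rows:
        yield {a: row[a] for a in attrs}


def pipelined_plan(account, customer):
    """Projection over a join over a selection, with no stored intermediate result."""
    low = select(account, lambda row: row["balance"] < 2500)
    joined = join(low, customer, "customer_name")
    return project(joined, ["customer_name"])


# Hypothetical sample data.
account = [
    {"customer_name": "Hayes", "balance": 400},
    {"customer_name": "Jones", "balance": 7000},
    {"customer_name": "Smith", "balance": 900},
]
customer = [
    {"customer_name": "Hayes", "city": "Harrison"},
    {"customer_name": "Smith", "city": "Rye"},
    {"customer_name": "Jones", "city": "Harrison"},
]
names = [row["customer_name"] for row in pipelined_plan(account, customer)]
```

Nothing runs until the list comprehension starts requesting rows, which is the demand-driven behaviour described above.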
DATABASE TUNING
Tuning:
The process of continuing to revise/adjust the physical
database design by monitoring resource utilization as well as
internal DBMS processing to reveal bottlenecks such as
contention for the same data or devices.
Goals:
To make applications run faster
To lower the response time of queries/transactions
To improve the overall throughput of transactions
Tuning Indexes
Reasons for tuning indexes
Certain queries may take too long to run for lack of
an index.
Certain indexes may not get utilized at all.
Certain indexes may be causing excessive overhead
because the index is on an attribute that undergoes
frequent changes.
Options for tuning indexes
Drop and/or build new indexes
Change a non-clustered index to a clustered index
(and vice versa)
Rebuild the index
Tuning the Database Design
Dynamically changing processing requirements
need to be addressed by making changes to the
conceptual schema if necessary, and by reflecting
those changes in the logical schema and
physical design.
Tuning Queries
Indications for tuning queries
A query issues too many disk accesses.
The query plan shows that relevant indexes
are not being used.
