Hashing
Gaurav Trivedi
EE 693
Algorithms and Data Structures
Contents
• Static Hashing
  • File Organization
  • Properties of the Hash Function
  • Bucket Overflow
  • Indices
• Dynamic Hashing
  • Underlying Data Structure
  • Querying and Updating
• Comparisons
  • Other types of hashing
  • Ordered Indexing vs. Hashing
Static Hashing
• Hashing provides a means of accessing data without the use of an index structure.
• Data is addressed on disk by computing a function on a search key instead.
Organization
• A bucket in a hash file is a unit of storage (typically a disk block) that can hold one or more records.
• The hash function, h, is a function from the set of all search keys, K, to the set of all bucket addresses, B.
• Insertion, deletion, and lookup take expected constant time, as long as buckets rarely overflow.
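A bucket-based organization like this, together with insertion, lookup, and deletion, could be sketched as follows. This is a minimal in-memory illustration, not the slides' own implementation: the bucket count `N_BUCKETS`, the byte-sum hash `h`, and the class name `StaticHashFile` are assumptions made for the example, and bucket capacity is ignored.

```python
N_BUCKETS = 8  # assumed fixed number of bucket addresses (the set B)


def h(key):
    """Illustrative hash function: map a search key to a bucket address."""
    return sum(key.encode("utf-8")) % N_BUCKETS


class StaticHashFile:
    """Hypothetical in-memory stand-in for a hash file on disk."""

    def __init__(self):
        # Each inner list stands in for one bucket (disk block).
        self.buckets = [[] for _ in range(N_BUCKETS)]

    def insert(self, key, record):
        # Place the record in the bucket whose address is h(key).
        self.buckets[h(key)].append((key, record))

    def lookup(self, key):
        # Search only the one bucket that h(key) points to.
        return [rec for k, rec in self.buckets[h(key)] if k == key]

    def delete(self, key):
        # Look up the bucket, drop matching records, report how many were removed.
        bucket = self.buckets[h(key)]
        before = len(bucket)
        bucket[:] = [(k, rec) for k, rec in bucket if k != key]
        return before - len(bucket)
```

Every operation touches exactly one bucket, which is where the constant-time behaviour comes from when buckets stay small.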
Querying and Updates
• To insert a record with search key K_i, compute the hash value h(K_i) and place the record in the bucket at the address returned.
• For lookup, compute the hash value as above and search the records in that bucket for the specific record.
• To delete, look up the record and remove it.
Properties of the Hash Function
• The distribution should be uniform.
  • An ideal hash function assigns the same number of records to each bucket.
• The distribution should be random.
  • Regardless of the actual search keys, each bucket has the same number of records on average.
  • Hash values should not depend on any ordering of the search keys.
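To make these properties concrete, the sketch below counts how many keys two candidate hash functions send to each bucket. Both functions and the helper `bucket_counts` are hypothetical illustrations; the keys are the branch names used in the example later in these slides. The first-letter hash depends on the alphabetical ordering of the keys, which is exactly what the last bullet warns against.

```python
from collections import Counter

BRANCHES = ["Brighton", "Downtown", "Mianus", "Perryridge", "Redwood", "Round Hill"]


def first_letter_hash(key, n_buckets):
    """Order-dependent hash: keys with the same initial always collide."""
    return (ord(key[0].lower()) - ord("a")) % n_buckets


def byte_sum_hash(key, n_buckets):
    """Hash that mixes every byte of the key instead of only the first."""
    return sum(key.encode("utf-8")) % n_buckets


def bucket_counts(keys, hash_fn, n_buckets):
    """Return the number of keys that land in each bucket 0..n_buckets-1."""
    counts = Counter(hash_fn(k, n_buckets) for k in keys)
    return [counts.get(b, 0) for b in range(n_buckets)]
```

Comparing `bucket_counts(BRANCHES, first_letter_hash, 8)` with `bucket_counts(BRANCHES, byte_sum_hash, 8)` on a realistic key set is a quick way to spot skew before committing to a hash function.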
Bucket Overflow
• How does bucket overflow occur?
  • There are not enough buckets to hold the data.
  • A few buckets have considerably more records than others. This is referred to as skew. Skew arises when:
    • Multiple records have the same hash value.
    • The hash function distributes keys non-uniformly.
Solutions
• Provide more buckets than are needed.
• Overflow chaining
  • If a bucket is full, link another bucket to it. Repeat as necessary.
  • The system must then check the overflow buckets during queries and updates. This is known as closed hashing.
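Overflow chaining as described above might look like the following sketch. The class name `Bucket`, its `capacity` parameter, and the `chain_length` helper are assumptions for illustration only.

```python
class Bucket:
    """Hypothetical fixed-capacity bucket with a chain of overflow buckets."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []      # (key, record) pairs stored in this bucket
        self.overflow = None   # next bucket in the overflow chain, if any

    def insert(self, key, record):
        # Walk the chain until a bucket with free space is found,
        # linking a new overflow bucket when the chain ends.
        bucket = self
        while len(bucket.records) >= bucket.capacity:
            if bucket.overflow is None:
                bucket.overflow = Bucket(bucket.capacity)
            bucket = bucket.overflow
        bucket.records.append((key, record))

    def lookup(self, key):
        # Queries must check the primary bucket and every overflow bucket.
        found = []
        bucket = self
        while bucket is not None:
            found.extend(rec for k, rec in bucket.records if k == key)
            bucket = bucket.overflow
        return found

    def chain_length(self):
        # Number of buckets in the chain, including the primary one.
        n, bucket = 0, self
        while bucket is not None:
            n += 1
            bucket = bucket.overflow
        return n
```

The cost of a lookup grows with the chain length, which is why long chains erode the constant-time behaviour of a static hash file.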
Alternatives
• Open hashing
  • The number of buckets is fixed.
  • Overflow is handled by using the next bucket in cyclic order that has space.
    • This is known as linear probing.
• Compute more hash functions.
Note: Closed hashing is preferred in database systems.
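Linear probing over a fixed set of buckets could be sketched like this. The class `LinearProbingFile`, the byte-sum home-bucket function, and the stopping rule in `lookup` are assumptions for the example; the stopping rule is only valid as long as records are never deleted.

```python
class LinearProbingFile:
    """Hypothetical fixed set of buckets with linear probing on overflow."""

    def __init__(self, n_buckets, capacity):
        self.n_buckets = n_buckets
        self.capacity = capacity
        self.buckets = [[] for _ in range(n_buckets)]

    def _home(self, key):
        # Illustrative hash function giving the key's home bucket.
        return sum(key.encode("utf-8")) % self.n_buckets

    def insert(self, key, record):
        # Try the home bucket, then each following bucket in cyclic order.
        home = self._home(key)
        for step in range(self.n_buckets):
            b = (home + step) % self.n_buckets
            if len(self.buckets[b]) < self.capacity:
                self.buckets[b].append((key, record))
                return b
        raise OverflowError("every bucket is full")

    def lookup(self, key):
        # Probe in the same cyclic order used by insert.
        home = self._home(key)
        found = []
        for step in range(self.n_buckets):
            bucket = self.buckets[(home + step) % self.n_buckets]
            found.extend(rec for k, rec in bucket if k == key)
            if len(bucket) < self.capacity:
                break  # free space here means no insert ever probed past it
        return found
```

Because records can end up far from their home bucket, deletion is awkward in this scheme, which is one reason chaining tends to be favoured in database systems.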
Indices
• A hash index organizes the search keys, with their associated pointers, into a hash file.
• Hash indices are never primary indices, even though they provide direct access.
Example of Hash Index
Dynamic Hashing
• More effective than static hashing when the database grows or shrinks.
• Extendable hashing splits and coalesces buckets as the database size changes.
  • i.e., buckets are added and deleted on demand.
The Hash Function
• Typically produces a large number of values, uniformly and randomly.
• Only part of the value is used, depending on the size of the database.
Data Structure
• Hash indices are typically a prefix of the entire hash value.
• More than one consecutive index can point to the same bucket.
  • Those indices share a hash prefix, which can be shorter than the length of the index.
General Extendable Hash Structure
In this structure, i_2 = i_3 = i, whereas i_1 = i - 1
Queries and Updates
• Lookup
  • Take the first i bits of the hash value.
  • Follow the corresponding entry in the bucket address table.
  • Look in the bucket.
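The three lookup steps can be sketched as below. This is only an illustration under stated assumptions: hash values are taken to be 32 bits wide, `zlib.crc32` stands in for the hash function, and the bucket address table is modelled as a Python list of `2**i` entries whose elements are lists of `(key, record)` pairs.

```python
import zlib

HASH_BITS = 32  # assumed width of the full hash value


def hash32(key):
    """Stand-in hash function returning an unsigned 32-bit value."""
    return zlib.crc32(key.encode("utf-8"))


def prefix(hash_value, i):
    """Return the first (most significant) i bits of a 32-bit hash value."""
    return hash_value >> (HASH_BITS - i)


def lookup(address_table, i, key):
    # Step 1: take the first i bits of the hash value.
    index = prefix(hash32(key), i)
    # Step 2: follow the corresponding entry in the bucket address table.
    bucket = address_table[index]
    # Step 3: look in the bucket.
    return [rec for k, rec in bucket if k == key]
```

Several table entries may refer to the same list object, which models consecutive indices pointing to one shared bucket.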
Queries and Updates (Cont’d)
• Insertion
  • Follow the lookup procedure.
  • If the bucket has space, add the record.
  • If not…
Insertion (Cont’d)
• Case 1: i = i_j
  • Use an additional bit of the hash value.
    • This doubles the size of the bucket address table.
    • Two entries in the table now point to the full bucket.
  • Allocate a new bucket, z.
    • Set i_j and i_z to the new value of i.
    • Point the second entry to the new bucket.
    • Rehash the records in the old bucket.
  • Repeat the insertion attempt.
Insertion (Cont’d)
• Case 2: i > i_j
  • Allocate a new bucket, z.
  • Add 1 to i_j, and set i_j and i_z to this new value.
  • Of the bucket address table entries that pointed to bucket j, leave the first half pointing to j and point the second half to z.
  • Rehash the records in bucket j.
  • Reattempt the insertion.
Insertion (Finally)
• If all the records in the bucket have the same search-key value, simply use overflow buckets as in static hashing.
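Putting the insertion cases together, one possible sketch of an extendable hash structure is shown below. It is an illustration under several assumptions rather than a definitive implementation: hash values are 32 bits wide, `zlib.crc32` is the default hash, the class and method names are invented for the example, a bucket whose records all share one hash value simply grows past its capacity instead of chaining a separate overflow bucket, and deletion and coalescing are left out.

```python
import zlib

HASH_BITS = 32  # assumed width of the full hash value


class Bucket:
    def __init__(self, depth, capacity):
        self.depth = depth        # i_j: prefix bits shared by all records here
        self.capacity = capacity
        self.records = []         # (key, record) pairs


class ExtendableHash:
    def __init__(self, capacity=2, hash_fn=None):
        # hash_fn must return an unsigned value below 2**HASH_BITS.
        self.hash_fn = hash_fn or (lambda k: zlib.crc32(k.encode("utf-8")))
        self.capacity = capacity
        self.i = 0                            # prefix bits used by the table
        self.table = [Bucket(0, capacity)]    # bucket address table, 2**i entries

    def _index(self, key):
        return self.hash_fn(key) >> (HASH_BITS - self.i)

    def lookup(self, key):
        return [r for k, r in self.table[self._index(key)].records if k == key]

    def insert(self, key, record):
        while True:
            bucket = self.table[self._index(key)]
            if len(bucket.records) < bucket.capacity:
                bucket.records.append((key, record))
                return
            hashes = {self.hash_fn(k) for k, _ in bucket.records}
            hashes.add(self.hash_fn(key))
            if len(hashes) == 1:
                # No split can separate identical hash values: treat as overflow.
                bucket.records.append((key, record))
                return
            if bucket.depth == self.i:
                # Case 1: i = i_j. Double the table so each old entry appears twice.
                self.table = [b for b in self.table for _ in (0, 1)]
                self.i += 1
            # Case 2 now applies (i > i_j): split the bucket and retry.
            self._split(bucket)

    def _split(self, bucket):
        bucket.depth += 1
        new = Bucket(bucket.depth, self.capacity)   # the new bucket z
        shift = self.i - bucket.depth
        for idx, b in enumerate(self.table):
            # Entries whose next prefix bit is 1 (the second half) move to z.
            if b is bucket and (idx >> shift) & 1:
                self.table[idx] = new
        # Rehash the records of the old bucket through the updated table.
        old, bucket.records = bucket.records, []
        for k, r in old:
            self.table[self._index(k)].records.append((k, r))
```

Passing a custom `hash_fn`, for example `lambda k: k << 28` over small integer keys, makes the prefix bits predictable and is a convenient way to step through both cases by hand.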
Use of Extendable Hash Structure: Example
Initial Hash structure, bucket size = 2
Example (Cont.)
Hash structure after insertion of one Brighton and two Downtown records
Example (Cont.)
Hash structure after insertion of Mianus record
Example (Cont.)
Hash structure after insertion of three Perryridge records
Example (Cont.)
Hash structure after insertion of Redwood and Round Hill records
Comparison to Other Hashing Methods
• Advantage: performance does not degrade as the database size increases.
  • Space is conserved by adding and removing buckets as necessary.
• Disadvantage: an additional level of indirection for operations.
  • Complex implementation.
Ordered Indexing vs. Hashing
• Hashing is less efficient if queries to the database include ranges as opposed to specific values.
• In cases where range queries are infrequent, hashing provides faster insertion, deletion, and lookup than ordered indexing.
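The difference on range queries can be illustrated with a small sketch. Both function names are hypothetical; the ordered index is modelled as a sorted Python list searched with the standard `bisect` module, and the hash file as a list of buckets holding keys.

```python
import bisect


def range_query_sorted(sorted_keys, lo, hi):
    """Ordered index: binary-search both ends, then read one contiguous run."""
    left = bisect.bisect_left(sorted_keys, lo)
    right = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[left:right]


def range_query_hashed(buckets, lo, hi):
    """Hash file: keys in the range are scattered, so every bucket is scanned."""
    return sorted(k for bucket in buckets for k in bucket if lo <= k <= hi)
```

Both functions return the same keys, but the hashed version has to touch every bucket because a hash function deliberately destroys the ordering of the search keys.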
