Fractal Tree Index
A Deep Dive Research Report
By Akhil Sreenath
CMPE 226
Table of Contents
Introduction
What is fractal Tree Index?
How Fractal Tree works?
Performance and analysis of Fractal Tree Index:
    Improve worst case insertion:
    Search Index Performance:
    Fragmentation:
        B-Tree:
        Fractal Tree:
    Schema changes:
    Performance in Hard Disk and SSD:
        Hard Disk:
        SSD:
How Fractal Tree Indexing works in MongoDB:
Conclusion:
References:
Introduction:
This paper deals with the Fractal Tree index, which can be used in MySQL and MongoDB. A
Fractal Tree index is a data structure that enables fast retrieval of data. It supports the
same operations as a B-tree but effectively replaces small, frequent writes with large, less
frequent writes, which results in better insertion and compression performance.
What is fractal Tree Index?
Tokutek has patented Fractal Tree technology. Tokutek's experts did a lot of research and
development on cache-oblivious algorithms before developing the Fractal Tree. It is a highly
write-optimized data structure that radically decreases I/O through astute buffering. A Fractal
Tree index stores data in sorted order and allows search and sequential access in the same way
as a B-tree, but insertions and deletions are much faster than in a B-tree. Each node has a
buffer, and insertions, deletions, and other changes are stored in these intermediate locations.
The main goal of the buffer is to schedule disk writes so that each write to disk performs a lot
of useful work. Fractal Tree indexes are highly optimized for writing and reading large blocks
of data.
Fractal Tree indexes are based on cache-oblivious algorithms. In the cache-oblivious model,
performance is measured by the number of blocks transferred between disk and cache, and the
algorithm itself does not need to know the cache size or the block size. So performance is
independent of the machine architecture and does not depend on the number of levels of cache.
Figure 1: Fractal Tree Index. All internal nodes have a message buffer. As buffers overflow,
their messages cascade down the tree.
How Fractal Tree works?
If there are N rows, then the fractal tree index has about $\log_2 N$ arrays. Each array is
either completely full or completely empty, and all the arrays are sorted [1].
For example, if $A_j$ is the j-th array, then it can hold a maximum of $2^{j-1}$ rows.
Ex: A1 = {}, A2 = {}, A3 = {3, 7, 10, 11}. In this case the maximum value of j is 4.
Consider the above sample fractal tree index. If we want to insert 15 into the tree, it goes
into the first array, since A1 is empty and can accommodate one element. If we want to add one
more element, it cannot be placed in any of the existing arrays: A1 is now full, and a single
element cannot go into a larger array such as the four-element A3, because every array must be
either completely full or completely empty. A temporary index is created to accommodate the new
element.
So the new element is added to the temporary index as a single-element array. Now we have 15
and 7 occupying the first array of each index. Those two single-element arrays are merged to
form a new two-element array holding both 7 and 15.
Next, two 2-arrays are merged to form a 4-array in the original index.
The third and fourth arrays are then completely filled and the rest of the arrays are empty.
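The insert-and-merge behaviour described above can be sketched in a few lines of Python. This is only a toy in-memory model under stated assumptions, not Tokutek's implementation: the class name `LeveledArrays` and its `levels` attribute are hypothetical, levels are numbered from 0 (so level j holds 2^j keys, where the text's A_j holds 2^(j-1)), and a carried list stands in for the temporary index.

```python
from heapq import merge


class LeveledArrays:
    """Toy model: level j is either empty or holds exactly 2**j sorted keys."""

    def __init__(self):
        self.levels = []  # levels[j] is a sorted list, empty or of length 2**j

    def insert(self, key):
        carry = [key]  # plays the role of the temporary one-element array
        j = 0
        while True:
            if j == len(self.levels):
                self.levels.append([])  # grow the structure by one level
            if not self.levels[j]:
                self.levels[j] = carry  # found an empty level: store and stop
                return
            # Level j is full: merge it with the carried array and move up.
            carry = list(merge(self.levels[j], carry))
            self.levels[j] = []
            j += 1


index = LeveledArrays()
for k in (3, 7, 10, 11):
    index.insert(k)
print(index.levels)  # [[], [], [3, 7, 10, 11]]
```

Inserting one more key fills the first level, and the key after that triggers a merge of two one-element arrays into a two-element array, which is the same pattern as incrementing a binary counter.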
Performance and analysis of Fractal Tree Index:
Time complexity in big O notation:
Operation   Average                            Worst case
Insert      $O(\log_B N / B^{\varepsilon})$    $O(\log_B N / B^{\varepsilon})$
Delete      $O(\log_B N / B^{\varepsilon})$    $O(\log_B N / B^{\varepsilon})$
B = size of a block of memory
N = size of the array
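A Fractal Tree index uses a smaller branching factor, such as $\sqrt{B}$ (less than B), so the
depth of the tree is $O(\log_{\sqrt{B}} N)$, which is still $O(\log_B N)$ because
$\log_{\sqrt{B}} N = 2 \log_B N$. Insertion performance of a Fractal Tree index is better than
that of traditional B-tree indexes.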
If we consider two arrays of size N, then the cost of merging the two arrays is $O(N/B)$ block
transfers.
• Merging two arrays is I/O efficient.
• The cost per element of a merge is $O(1/B)$, since $O(N)$ elements are merged using $O(N/B)$ block transfers.
• Each row is merged at most $O(\log_2 N)$ times.
• The average insertion cost is therefore $O\left(\frac{\log_2 N}{B}\right)$ block transfers.
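Written out as a short derivation, the averaging argument above looks as follows. The symbols are the ones already used in this section (N rows, B rows per block); constant factors are ignored throughout.

```latex
\begin{align*}
\text{cost of merging two size-}N\text{ arrays} &= O\!\left(\frac{N}{B}\right) \text{ block transfers} \\
\text{cost per element per merge} &= \frac{O(N/B)}{O(N)} = O\!\left(\frac{1}{B}\right) \\
\text{merges per element} &= O(\log_2 N) \\
\text{average insertion cost} &= O(\log_2 N) \cdot O\!\left(\frac{1}{B}\right)
  = O\!\left(\frac{\log_2 N}{B}\right)
\end{align*}
```

As a purely illustrative instance, assume N = 2^30 rows and B = 4096 rows per block (both values are assumptions, not figures from the sources): then log2 N / B = 30 / 4096, roughly 0.007 block transfers per insertion on average.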
Improve worst case insertion:
Although the average cost of merging is low, occasionally a single insertion causes a lot of
arrays to be merged. To handle this, a separate thread is maintained to merge the arrays, so
that an insertion into the fractal tree index returns quickly. The merging thread will not fall
behind as long as it merges Ω(log N) elements for every insertion.
Now let's consider the cost of insertion. An insertion takes $O(\log_B N / \sqrt{B})$ block
transfers, which is the $O(\log_B N / B^{\varepsilon})$ bound with $\varepsilon = 1/2$ and is
faster than a B-tree's $O(\log_B N)$ by a factor of $O(\sqrt{B})$.
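To get a feel for the size of this gap, the two asymptotic expressions can be evaluated side by side. The sketch below is only arithmetic on the big-O formulas with constant factors ignored; the function names and the sample values of N and B are assumptions chosen for illustration.

```python
import math


def btree_insert_transfers(n, b):
    """B-tree insertion cost model: log_B N block transfers."""
    return math.log(n, b)


def fractal_insert_transfers(n, b):
    """Fractal Tree insertion cost model: log_B N / sqrt(B) block transfers."""
    return math.log(n, b) / math.sqrt(b)


n, b = 2**30, 4096  # assumed sample values
ratio = btree_insert_transfers(n, b) / fractal_insert_transfers(n, b)
print(ratio)  # the gap between the two models equals sqrt(B)
```

Because both models share the log_B N term, the ratio depends only on B, which is why larger blocks favour the buffered structure.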
Search Index Performance:
To search for a particular row in a fractal tree index, perform a binary search in each of the
$\log N$ arrays; since each binary search costs $O(\log N)$, the time complexity is
$O(\log^2 N)$. This can be improved by keeping forward pointers from rows in one array to rows
in the next array. In the figure, 14 points to the next greater number in the following row,
i.e. 25, and 25 points to 26 in its next row.
This reduces the search time, as we already know the position from which to continue searching
in the next array. It reduces the search time complexity to $O(\log N)$.
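The basic search, before the forward-pointer optimization, can be sketched as one binary search per array. This is a minimal illustration assuming the arrays are kept as a list of sorted Python lists; the function name `search` and the sample data are hypothetical, and forward pointers are deliberately not implemented here.

```python
from bisect import bisect_left


def search(levels, key):
    """Binary-search every sorted array; return (level, position) or None."""
    for j, array in enumerate(levels):
        i = bisect_left(array, key)  # O(log N) per array
        if i < len(array) and array[i] == key:
            return j, i
    return None


levels = [[15], [], [3, 7, 10, 11]]
print(search(levels, 10))  # (2, 2)
print(search(levels, 8))   # None
```

Each array is searched from scratch, which is where the extra logarithmic factor comes from; forward pointers avoid that by narrowing the range to examine in the next array.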
Fragmentation:
Fragmentation reduces the performance of a system or database, because scanning through a chunk
of rows causes the disk head to move all around the hard drive to search for the next row or
element in the index.
B-Tree:
Both clustering and non-clustering B-trees suffer from fragmentation. If we insert data into a
non-clustering B-tree, the logical order of the rows is completely unrelated to their physical
placement on the disk. For a range query, the scanner has to go through all the chunks of data,
moving the disk head around for each row, which causes a lot of overhead. A non-clustering
B-tree index is not recommended for range queries.
Fractal Tree:
A Fractal Tree is not fragmented: neither primary nor secondary indexes are fragmented. Also,
there is no inherent tradeoff between fragmentation and insertion speed. Fractal Trees perform
much better on insertion than B-trees, and without fragmentation. So B-trees sit on a tradeoff
curve, but not on the best conceivable tradeoff curve [2].
Schema changes:
A schema change injects broadcast messages, which travel in all directions, visiting all the
buffers, and are eventually flushed down to all the leaves. If we want to add a column to the
table, the message can be broadcast from the root node. The next time a query is issued, it
learns about the column change from the buffers, as the schema change messages are present in
all the buffers. So the results of the query follow the new schema.
Performance is greatly improved, because queries succeed against the changed schema even
before the change is actually written to the leaf nodes.
In the figure below, the red dots are schema change broadcast messages, which are located in
all the buffers.
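The idea that a query can see the new schema before any leaf is rewritten can be illustrated with a small sketch. Everything here is a hypothetical simplification, not the TokuDB implementation: `Node`, `broadcast`, and `read_row` are made-up names, a message is just a `(column, default)` pair meaning "add this column", and the query collects pending messages from the buffers on its root-to-leaf path.

```python
class Node:
    def __init__(self, children=None, rows=None):
        self.children = children or []  # child nodes (empty for a leaf)
        self.rows = rows or {}          # key -> row dict (leaves only)
        self.buffer = []                # pending schema-change messages


def broadcast(root, message):
    """Inject a schema-change message at the root; leaves are not touched."""
    root.buffer.append(message)


def read_row(root, path, key):
    """Follow child indexes in `path` to a leaf, applying buffered messages."""
    pending = []
    node = root
    for child_index in path:
        pending.extend(node.buffer)
        node = node.children[child_index]
    pending.extend(node.buffer)
    row = dict(node.rows[key])           # copy: the stored row stays unchanged
    for column, default in pending:
        row.setdefault(column, default)  # present the row under the new schema
    return row


leaf = Node(rows={1: {"name": "a"}})
root = Node(children=[leaf])
broadcast(root, ("age", 0))
print(read_row(root, [0], 1))  # {'name': 'a', 'age': 0}
print(leaf.rows[1])            # {'name': 'a'}
```

The stored row is only rewritten later, when the message is eventually flushed down to the leaf, which is why the schema change itself appears to complete immediately.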
Performance in Hard Disk and SSD:
Hard Disk:
Performance on hard disks is improved because there is no fragmentation with Fractal Tree
indexing. Whenever a query is made, values are fetched quickly compared with other
external-memory indexes such as the B-tree.
SSD:
As SSDs are expensive, algorithms and data structures that support better compression are
preferred. A Fractal Tree index supports better compression, which significantly improves
storage efficiency on an SSD. A Fractal Tree index also performs bulk, less frequent writes,
which is very useful on an SSD: this reduces SSD wear and increases the life span of the SSD.
How Fractal Tree Indexing works in MongoDB:
• In MongoDB, all fields in the document are always available for indexing.
• When all the leaf nodes fit in RAM, no I/O is required.
• Messages are buffered in all the internal nodes.
• Nodes are larger than B-tree nodes (4 MB), which leads to a higher compression ratio.
• Whenever a deletion or insertion is made, there is no need to update the leaf node immediately.
• Better compression.
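Figure: When the buffer is full, messages are pushed down to the next level. Here 20 and 25 are inserted and 7 is deleted.
Before: root 21; internal nodes 11 and 31; leaves {1, 5, 7}, {12}, {24, 26}, {35}; buffered messages Insert(20), Insert(25), delete(7), Insert(15).
After: root 21; internal nodes 11 and 31; leaves {1, 5}, {12, 20}, {24, 25, 26}, {35}; buffered message Insert(15).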
When insertions and deletions are made in MongoDB, they are not applied directly to the leaf
node. Each one is initially stored in a buffer as a message, and when the buffer is full the
messages are pushed down to the next level.
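The buffering behaviour described above can be sketched as follows. This is a simplified model under stated assumptions, not how TokuMX is implemented: the class `BufferedRoot` is hypothetical, the two levels of internal nodes from the figure are flattened into a single node with pivots 11, 21, and 31, and the buffer capacity of three messages is chosen only so that the example flushes at the same point as the figure.

```python
from bisect import bisect_right, insort


class BufferedRoot:
    """One internal node with a message buffer above sorted leaf lists."""

    def __init__(self, pivots, leaves, capacity):
        self.pivots = pivots      # sorted keys separating the leaves
        self.leaves = leaves      # len(pivots) + 1 sorted lists
        self.capacity = capacity  # messages held before a flush
        self.buffer = []

    def _send(self, op, key):
        self.buffer.append((op, key))  # record the change as a message
        if len(self.buffer) >= self.capacity:
            self.flush()

    def insert(self, key):
        self._send("insert", key)

    def delete(self, key):
        self._send("delete", key)

    def flush(self):
        """Push every buffered message down to the leaf that owns its key."""
        for op, key in self.buffer:
            leaf = self.leaves[bisect_right(self.pivots, key)]
            if op == "insert":
                insort(leaf, key)
            elif key in leaf:
                leaf.remove(key)
        self.buffer = []


tree = BufferedRoot([11, 21, 31], [[1, 5, 7], [12], [24, 26], [35]], capacity=3)
tree.insert(20)
tree.insert(25)
tree.delete(7)   # third message fills the buffer and triggers the flush
tree.insert(15)  # stays in the buffer until the next flush
print(tree.leaves)  # [[1, 5], [12, 20], [24, 25, 26], [35]]
print(tree.buffer)  # [('insert', 15)]
```

A single flush applies three changes to the leaves at once, which is the sense in which each write to disk is made to do more useful work.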
Conclusion:
The Fractal Tree index is a write-optimized data structure that can be used in areas where
there are many insert, delete, or update operations on a table. It significantly improves SSD
storage performance through less frequent, bulky writes. The Fractal Tree index is well suited
for point queries and is great for range queries. Even schema changes are made very simple and
fast.
References:
[1] http://cdn.oreillystatic.com/en/assets/1/event/36/How%20TokuDB%20Fractal%20Tree%20Databases%20Work%20Presentation.pdf
[2] http://www.tokutek.com/2010/11/avoiding-fragmentation-with-fractal-trees/
[3] https://oracleus.activeevents.com/2013/connect/fileDownload/session/C7B372C894D62F395B0EB2C5E0B9AD04/CON4645_Narvaja-MySQL%20Connect%2020130921.pdf
[4] http://www.mongodb.com/presentations/mongodb-boston-2012/mongodb-and-fractal-tree-indexes
[5] http://www.odbms.org/wp-content/uploads/2013/11/OptimizingMongoDBWithFTI.pdf

Contenu connexe

Tendances

Phases of distributed query processing
Phases of distributed query processingPhases of distributed query processing
Phases of distributed query processingNevil Dsouza
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query ProcessingMythili Kannan
 
Data structure-questions
Data structure-questionsData structure-questions
Data structure-questionsShekhar Chander
 
Mining Approach for Updating Sequential Patterns
Mining Approach for Updating Sequential PatternsMining Approach for Updating Sequential Patterns
Mining Approach for Updating Sequential PatternsIOSR Journals
 
R-Trees and Geospatial Data Structures
R-Trees and Geospatial Data StructuresR-Trees and Geospatial Data Structures
R-Trees and Geospatial Data StructuresAmrinder Arora
 
8 query processing and optimization
8 query processing and optimization8 query processing and optimization
8 query processing and optimizationKumar
 
C Programming: Structure and Union
C Programming: Structure and UnionC Programming: Structure and Union
C Programming: Structure and UnionSelvaraj Seerangan
 
The life of a query (oracle edition)
The life of a query (oracle edition)The life of a query (oracle edition)
The life of a query (oracle edition)maclean liu
 
HIGH PERFORMANCE SPLIT RADIX FFT
HIGH PERFORMANCE SPLIT RADIX FFTHIGH PERFORMANCE SPLIT RADIX FFT
HIGH PERFORMANCE SPLIT RADIX FFTAM Publications
 
1. Data structures introduction
1. Data structures introduction1. Data structures introduction
1. Data structures introductionMandeep Singh
 
Data decomposition techniques
Data decomposition techniquesData decomposition techniques
Data decomposition techniquesMohamed Ramadan
 
Introduction of Data Structures and Algorithms by GOWRU BHARATH KUMAR
Introduction of Data Structures and Algorithms by GOWRU BHARATH KUMARIntroduction of Data Structures and Algorithms by GOWRU BHARATH KUMAR
Introduction of Data Structures and Algorithms by GOWRU BHARATH KUMARBHARATH KUMAR
 
Database , 8 Query Optimization
Database , 8 Query OptimizationDatabase , 8 Query Optimization
Database , 8 Query OptimizationAli Usman
 
Efficient & Lock-Free Modified Skip List in Concurrent Environment
Efficient & Lock-Free Modified Skip List in Concurrent EnvironmentEfficient & Lock-Free Modified Skip List in Concurrent Environment
Efficient & Lock-Free Modified Skip List in Concurrent EnvironmentEditor IJCATR
 
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZEMySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZENorvald Ryeng
 
CS 542 -- Query Optimization
CS 542 -- Query OptimizationCS 542 -- Query Optimization
CS 542 -- Query OptimizationJ Singh
 

Tendances (20)

Query compiler
Query compilerQuery compiler
Query compiler
 
Phases of distributed query processing
Phases of distributed query processingPhases of distributed query processing
Phases of distributed query processing
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query Processing
 
Linked List
Linked ListLinked List
Linked List
 
Data structure-questions
Data structure-questionsData structure-questions
Data structure-questions
 
Mining Approach for Updating Sequential Patterns
Mining Approach for Updating Sequential PatternsMining Approach for Updating Sequential Patterns
Mining Approach for Updating Sequential Patterns
 
R-Trees and Geospatial Data Structures
R-Trees and Geospatial Data StructuresR-Trees and Geospatial Data Structures
R-Trees and Geospatial Data Structures
 
8 query processing and optimization
8 query processing and optimization8 query processing and optimization
8 query processing and optimization
 
C Programming: Structure and Union
C Programming: Structure and UnionC Programming: Structure and Union
C Programming: Structure and Union
 
The life of a query (oracle edition)
The life of a query (oracle edition)The life of a query (oracle edition)
The life of a query (oracle edition)
 
Distributed DBMS - Unit 6 - Query Processing
Distributed DBMS - Unit 6 - Query ProcessingDistributed DBMS - Unit 6 - Query Processing
Distributed DBMS - Unit 6 - Query Processing
 
HIGH PERFORMANCE SPLIT RADIX FFT
HIGH PERFORMANCE SPLIT RADIX FFTHIGH PERFORMANCE SPLIT RADIX FFT
HIGH PERFORMANCE SPLIT RADIX FFT
 
1. Data structures introduction
1. Data structures introduction1. Data structures introduction
1. Data structures introduction
 
Data decomposition techniques
Data decomposition techniquesData decomposition techniques
Data decomposition techniques
 
Introduction of Data Structures and Algorithms by GOWRU BHARATH KUMAR
Introduction of Data Structures and Algorithms by GOWRU BHARATH KUMARIntroduction of Data Structures and Algorithms by GOWRU BHARATH KUMAR
Introduction of Data Structures and Algorithms by GOWRU BHARATH KUMAR
 
Database , 8 Query Optimization
Database , 8 Query OptimizationDatabase , 8 Query Optimization
Database , 8 Query Optimization
 
Homework solutionsch8
Homework solutionsch8Homework solutionsch8
Homework solutionsch8
 
Efficient & Lock-Free Modified Skip List in Concurrent Environment
Efficient & Lock-Free Modified Skip List in Concurrent EnvironmentEfficient & Lock-Free Modified Skip List in Concurrent Environment
Efficient & Lock-Free Modified Skip List in Concurrent Environment
 
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZEMySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
 
CS 542 -- Query Optimization
CS 542 -- Query OptimizationCS 542 -- Query Optimization
CS 542 -- Query Optimization
 

En vedette

Ahmed-Hamdy-CV2016
Ahmed-Hamdy-CV2016Ahmed-Hamdy-CV2016
Ahmed-Hamdy-CV2016Ahmed Hamdy
 
Predictive analytics in heavy industry
Predictive analytics in heavy industryPredictive analytics in heavy industry
Predictive analytics in heavy industryMichael Vermeer
 
Peng.komputer ppt
Peng.komputer pptPeng.komputer ppt
Peng.komputer pptsumarnirola
 
Análisis literario stephany calderon
Análisis literario stephany calderonAnálisis literario stephany calderon
Análisis literario stephany calderonStefis Calderon Ayala
 
DUCT-ARMOR-ASBESTOS-TEST-630141
DUCT-ARMOR-ASBESTOS-TEST-630141DUCT-ARMOR-ASBESTOS-TEST-630141
DUCT-ARMOR-ASBESTOS-TEST-630141Duct Doctors, LLC
 
MS Introduction to HR
MS Introduction to HRMS Introduction to HR
MS Introduction to HRmuralimba09
 
Ravi Chawla 11.08.2016
Ravi Chawla 11.08.2016Ravi Chawla 11.08.2016
Ravi Chawla 11.08.2016CA Ravi Chawla
 

En vedette (7)

Ahmed-Hamdy-CV2016
Ahmed-Hamdy-CV2016Ahmed-Hamdy-CV2016
Ahmed-Hamdy-CV2016
 
Predictive analytics in heavy industry
Predictive analytics in heavy industryPredictive analytics in heavy industry
Predictive analytics in heavy industry
 
Peng.komputer ppt
Peng.komputer pptPeng.komputer ppt
Peng.komputer ppt
 
Análisis literario stephany calderon
Análisis literario stephany calderonAnálisis literario stephany calderon
Análisis literario stephany calderon
 
DUCT-ARMOR-ASBESTOS-TEST-630141
DUCT-ARMOR-ASBESTOS-TEST-630141DUCT-ARMOR-ASBESTOS-TEST-630141
DUCT-ARMOR-ASBESTOS-TEST-630141
 
MS Introduction to HR
MS Introduction to HRMS Introduction to HR
MS Introduction to HR
 
Ravi Chawla 11.08.2016
Ravi Chawla 11.08.2016Ravi Chawla 11.08.2016
Ravi Chawla 11.08.2016
 

Similaire à FractalTreeIndex

An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU ArchitectureAn OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU ArchitectureWaqas Tariq
 
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIP
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIPMODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIP
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIPVLSICS Design
 
Technical aptitude questions_e_book1
Technical aptitude questions_e_book1Technical aptitude questions_e_book1
Technical aptitude questions_e_book1Sateesh Allu
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansFranck Pachot
 
Aes encryption engine for many core processor arrays for enhanced security
Aes encryption engine for many core processor arrays for enhanced securityAes encryption engine for many core processor arrays for enhanced security
Aes encryption engine for many core processor arrays for enhanced securityIAEME Publication
 
Low complexity low-latency architecture for matching
Low complexity low-latency architecture for matchingLow complexity low-latency architecture for matching
Low complexity low-latency architecture for matchingBhavya Venkatesh
 
Implementation of low power divider techniques using
Implementation of low power divider techniques usingImplementation of low power divider techniques using
Implementation of low power divider techniques usingeSAT Publishing House
 
Implementation of low power divider techniques using radix
Implementation of low power divider techniques using radixImplementation of low power divider techniques using radix
Implementation of low power divider techniques using radixeSAT Journals
 
Query Optimization - Brandon Latronica
Query Optimization - Brandon LatronicaQuery Optimization - Brandon Latronica
Query Optimization - Brandon Latronica"FENG "GEORGE"" YU
 
Design of high speed adders for efficient digital design blocks
Design of high speed adders for efficient digital design blocksDesign of high speed adders for efficient digital design blocks
Design of high speed adders for efficient digital design blocksBharath Chary
 
Effective Sparse Matrix Representation for the GPU Architectures
 Effective Sparse Matrix Representation for the GPU Architectures Effective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU ArchitecturesIJCSEA Journal
 
Effective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU ArchitecturesEffective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU ArchitecturesIJCSEA Journal
 
for sbi so Ds c c++ unix rdbms sql cn os
for sbi so   Ds c c++ unix rdbms sql cn osfor sbi so   Ds c c++ unix rdbms sql cn os
for sbi so Ds c c++ unix rdbms sql cn osalisha230390
 
DSP IEEE paper
DSP IEEE paperDSP IEEE paper
DSP IEEE paperprreiya
 
Parallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationParallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationIJERA Editor
 

Similaire à FractalTreeIndex (20)

Birch
BirchBirch
Birch
 
Survey on Prefix adders
Survey on Prefix addersSurvey on Prefix adders
Survey on Prefix adders
 
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU ArchitectureAn OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
 
post119s1-file2
post119s1-file2post119s1-file2
post119s1-file2
 
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIP
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIPMODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIP
MODELLING AND SIMULATION OF 128-BIT CROSSBAR SWITCH FOR NETWORK -ONCHIP
 
Technical aptitude questions_e_book1
Technical aptitude questions_e_book1Technical aptitude questions_e_book1
Technical aptitude questions_e_book1
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive Plans
 
Aes encryption engine for many core processor arrays for enhanced security
Aes encryption engine for many core processor arrays for enhanced securityAes encryption engine for many core processor arrays for enhanced security
Aes encryption engine for many core processor arrays for enhanced security
 
Low complexity low-latency architecture for matching
Low complexity low-latency architecture for matchingLow complexity low-latency architecture for matching
Low complexity low-latency architecture for matching
 
Implementation of low power divider techniques using
Implementation of low power divider techniques usingImplementation of low power divider techniques using
Implementation of low power divider techniques using
 
Implementation of low power divider techniques using radix
Implementation of low power divider techniques using radixImplementation of low power divider techniques using radix
Implementation of low power divider techniques using radix
 
Query Optimization - Brandon Latronica
Query Optimization - Brandon LatronicaQuery Optimization - Brandon Latronica
Query Optimization - Brandon Latronica
 
Design of high speed adders for efficient digital design blocks
Design of high speed adders for efficient digital design blocksDesign of high speed adders for efficient digital design blocks
Design of high speed adders for efficient digital design blocks
 
Effective Sparse Matrix Representation for the GPU Architectures
 Effective Sparse Matrix Representation for the GPU Architectures Effective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU Architectures
 
Effective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU ArchitecturesEffective Sparse Matrix Representation for the GPU Architectures
Effective Sparse Matrix Representation for the GPU Architectures
 
for sbi so Ds c c++ unix rdbms sql cn os
for sbi so   Ds c c++ unix rdbms sql cn osfor sbi so   Ds c c++ unix rdbms sql cn os
for sbi so Ds c c++ unix rdbms sql cn os
 
DSP IEEE paper
DSP IEEE paperDSP IEEE paper
DSP IEEE paper
 
Parallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationParallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix Multiplication
 
Ca unit v 27 9-2020
Ca unit v 27 9-2020Ca unit v 27 9-2020
Ca unit v 27 9-2020
 
Ad4103173176
Ad4103173176Ad4103173176
Ad4103173176
 

FractalTreeIndex

  • 1. Fractal Tree Index Akhil Sreenath 1 CMPE 226 Fractal Tree Index A deep dive Research Report By CMPE 226 Akhil Sreenath
  • 2. Fractal Tree Index Akhil Sreenath 2 CMPE 226 Table of Contents Introduction ..................................................................................................................................................1 What is fractal Tree Index?...........................................................................................................................1 How Fractal Tree works? ..............................................................................................................................2 Performance and analysis of Fractal Tree Index:..........................................................................................4 Improve worst case insertion: ..................................................................................................................5 Search Index Performance:.......................................................................................................................5 Fragmentation: .........................................................................................................................................6 B-Tree:...................................................................................................................................................6 Fractal Tree: ..........................................................................................................................................6 Schema changes:.......................................................................................................................................7 Performance in Hard Disk and SSD:..........................................................................................................8 Hard Disk:..............................................................................................................................................8 SSD: 
.......................................................................................................................................................8 How Fractal Tree Indexing works in MongoDB:............................................................................................8 Conclusion:..................................................................................................................................................10 References: .................................................................................................................................................10
  • 3. 1 CMPE 226 Introduction: This paper deals with Fractal Tree index that can be used in MySQL and MongoDB. Fractal Tree index is a data structure that will enable fast retrieval of the data. Fractal Tree files executes the same operations as B-tree and effectively replaces small and more frequent writes with large and less frequent writes which results in better insertion and compression performance. What is fractal Tree Index? Tokutek’s has patented Fractal Tree technology. Tokutek’s Experts made a lot of research and development on cache-oblivious algorithmic before developing Fractal Tree. It is a highly write- optimized algorithm that radically decreases I/O through astute buffering. Fractal Tree Index is a data structure that store data in sorted order and allows search and sequential access same as B-Tree but insertion and deletion are much faster than B-Tree. Each node has a buffer and insertion, deletion and other changes made are stored in these intermediate location. The main goal of the Buffer is to schedule a disk write, so that each write on the disk perform a lot of valuable work. Fractal Tree index are highly optimized for large writes and reads blocks of data. Fractal Tree index are actually based on Cache oblivious algorithm. In cache oblivious algorithm, performance is measured by number of block transferred by disk to cache and not on the cache or block size. So performance is independent of machine architecture and also doesn’t depend on number of layers of the cache.
  • 4. Fractal Tree Index Akhil Sreenath 2 CMPE 226 All internal nodes have message Buffer As Buffer overflows they cascade down the tree. Figure 1: Fractal Tree Index How Fractal Tree works? If there are N rows, then fractal index tree has log2N arrays. Each array is either completely full or empty and all the arrays are sorted[1]. For example if Aj is an array then it can hold maximum of 2 (J-1) rows, Ex: A1={}, A2 = {},A3={3,7,10,11} In this case maximum value of J is 4 Consider the above sample fractal tree index, If I want to insert 15 in the tree, we will insert in the first array as it can accommodate one element in its first array. If I want to add one more Message BufferLeaf node Leaf node Leaf node Message Buffer Message Buffer Leaf node Leaf node Leaf node Leaf node
  • 5. Fractal Tree Index Akhil Sreenath 3 CMPE 226 element, I cannot accommodate in any of the existing arrays, as all Arrays are full. I cannot accommodate the element in A3 as it has 4 empty spaces. Temporary index will be created to accommodate new element. So new element is added to the temporary index in a single array. Now we have 15 and 7 occupying first array of both index. Those two single arrays are merged to form the new array of two fields accommodating both element 7 and 15. Now two 2-arrays will be merged to form 4-Array in the original index. Third and Fourth array is completely filled and the rest of the arrays will be empty.
  • 6. Performance and analysis of Fractal Tree Index:
    Time complexity in big O notation:
              Average                           Worst case
    Insert    $O(\log_B N / B^{\varepsilon})$   $O(\log_B N / B^{\varepsilon})$
    Delete    $O(\log_B N / B^{\varepsilon})$   $O(\log_B N / B^{\varepsilon})$
    B = size of a block of memory
    N = size of the array
    A Fractal Tree index uses a smaller branching factor such as $\sqrt{B}$ (less than $B$), so the depth of the tree is $O(\log_{\sqrt{B}} N)$. The performance of a Fractal Tree index is better than that of traditional B-tree indexes. If we consider two arrays of size $N$, the cost of merging them is $O(N/B)$ block transfers.
    - Merging two arrays is I/O efficient.
    - The cost per element of a merge is $O(1/B)$, since $O(N)$ elements are merged.
    - The maximum number of times each row is merged is $O(\log_2 N)$.
    - The average insertion cost is therefore $O(\log N / B)$.
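The bullet points above chain together into a short amortized-cost argument, written out here as a sketch using the same symbols ($N$ rows, block size $B$). The `\text` command assumes the amsmath package.

```latex
\[
\frac{O(N/B) \text{ block transfers per merge}}{O(N) \text{ elements per merge}}
  = O\!\left(\frac{1}{B}\right) \text{ transfers per element per merge}
\]
\[
\text{average insertion cost}
  = O\!\left(\frac{1}{B}\right) \cdot O(\log_2 N)
  = O\!\left(\frac{\log N}{B}\right)
\]
```

The second line uses the fact that each row takes part in at most $O(\log_2 N)$ merges before it reaches the largest array.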
  • 7. Improve worst case insertion: Inserting an element can cause many arrays to be merged; each merge is cheap per element. To keep a single insertion from waiting on these merges, separate threads perform them, so an insertion returns quickly. The merging thread will not fall behind as long as $\Omega(\log N)$ merge work is done for every insertion. Now consider the cost of insertion: an insertion takes $O(\log_B N / \sqrt{B})$, which is faster than a B-tree by a factor of $O(\sqrt{B})$.
    Search Index Performance: To search for a particular row in a fractal tree index, perform a binary search in each of the $\log N$ arrays, so the time complexity is $O(\log^2 N)$. This can be improved by keeping forward pointers from rows in one array to rows in the next array. In the figure below, 14 points to the next greater number in the following row, i.e. 25, and 25 points to 26 in the row after that. This shortens the search because the position from which to continue in the next array is already known. It reduces the search time complexity to $O(\log N)$.
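The basic search, one binary search per array, can be sketched as follows. The function name `search` and the list-of-lists layout are assumptions made for illustration; the forward-pointer optimization is not shown. The data reuses the earlier example: 15 in the first array and {3, 7, 10, 11} in the third.

```python
from bisect import bisect_left


def search(levels, key):
    """Return (level, position) of key, or None if it is absent.

    levels[k] is either empty or a sorted list of 2**k keys.
    """
    for k, array in enumerate(levels):
        pos = bisect_left(array, key)  # binary search within one sorted array
        if pos < len(array) and array[pos] == key:
            return k, pos
    return None


levels = [[15], [], [3, 7, 10, 11]]
print(search(levels, 10))  # (2, 2)
print(search(levels, 4))   # None
```

With about $\log N$ arrays and a binary search in each, this is the $O(\log^2 N)$ search; forward pointers would let each binary search start from a known position instead of scanning the whole array range.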
  • 8. Fragmentation: Fragmentation reduces the performance of a system or database, because scanning through a chunk of rows causes the disk head to move all around the hard drive to find the next row or element in the index.
    B-Tree: Both clustering and non-clustering B-trees suffer from fragmentation. If we insert data into a non-clustering B-tree, the logical order of the rows is completely unrelated to their physical placement on disk. For a range query, the scanner has to go through all chunks of data, moving the disk head for each row, which causes a lot of overhead. A non-clustering B-tree index is therefore not recommended for range queries.
    Fractal Tree: A Fractal Tree is not fragmented. Neither primary nor secondary indexes are fragmented, and there is no inherent tradeoff between fragmentation and insertion speed. Fractal trees perform much better than B-trees on insertion, and they do so without fragmentation. B-trees sit on a tradeoff curve, but not on the best possible tradeoff curve [2].
  • 9. Schema changes: A schema change injects broadcast messages, which travel in all directions, visit every buffer, and are eventually flushed down to all the leaves. For example, to add a column to the table, a message can be broadcast from the root node. The next time a query is issued, it learns about the column change from the buffers, because schema change messages are present in every buffer on its path. The results of the query therefore follow the new schema. Performance improves considerably, because queries succeed against the changed schema even before the change is actually written to the leaf nodes. In the figure below, the red dots are schema change broadcast messages, which are located in all the buffers.
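The idea that a query sees a schema change before any leaf is rewritten can be sketched with a toy tree. Everything here is hypothetical and much simpler than a real storage engine: the names `Node`, `broadcast_add_column`, `flush` and `read_row` are invented for the example, rows are plain dictionaries, and the only schema change modelled is adding a column with a default value.

```python
class Node:
    def __init__(self, children=(), rows=None):
        self.children = list(children)                # empty for a leaf
        self.rows = rows if rows is not None else {}  # leaf only: key -> row dict
        self.buffer = []                              # pending (column, default) messages


def broadcast_add_column(root, column, default):
    """Record the schema change in the root buffer; no leaf is touched."""
    root.buffer.append((column, default))


def flush(node):
    """Push this node's broadcast messages one level down, to every child."""
    for message in node.buffer:
        for child in node.children:
            if child.children:
                child.buffer.append(message)          # internal node: keep buffering
            else:
                for row in child.rows.values():       # leaf: rewrite the stored rows
                    row.setdefault(*message)
    node.buffer.clear()


def read_row(path, key):
    """Read a row along a root-to-leaf path, applying buffered schema changes."""
    row = dict(path[-1].rows[key])                    # copy of the row stored in the leaf
    for node in path[:-1]:
        for column, default in node.buffer:
            row.setdefault(column, default)
    return row


leaf_a = Node(rows={1: {"name": "ann"}})
leaf_b = Node(rows={9: {"name": "bob"}})
root = Node(children=[leaf_a, leaf_b])

broadcast_add_column(root, "age", None)
print(read_row([root, leaf_a], 1))  # {'name': 'ann', 'age': None}
print(leaf_a.rows[1])               # {'name': 'ann'}  (leaf not rewritten yet)
flush(root)
print(leaf_b.rows[9])               # {'name': 'bob', 'age': None}
```

Unlike a keyed insert, which follows a single path, the broadcast message is copied to every child during a flush, which is why it eventually appears in all buffers.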
  • 10. Performance in Hard Disk and SSD:
    Hard Disk: Performance on a hard disk improves because there is no fragmentation with Fractal Tree indexing. When a query is made, values are fetched quickly compared with other external-memory indexes such as the B-tree.
    SSD: Because SSD storage is expensive, algorithms and data structures that support better compression are preferred. The Fractal Tree index supports better compression, which significantly improves the storage performance of an SSD. A Fractal Tree index performs bulk, less frequent writes, which is very useful for an SSD: it reduces wear and increases the life span of the drive.
    How Fractal Tree Indexing works in MongoDB:
    - In MongoDB, all fields in a document are always available for indexing.
    - When the leaf nodes fit in RAM, no I/O is required.
    - Messages are buffered in all the internal nodes.
    - Nodes are larger than B-tree nodes (4 MB), which leads to a higher compression ratio.
    - When a deletion or insertion is made, there is no need to update the leaf node immediately.
  • 11. - Better compression
    When the buffer is full, messages are pushed down to the next level.
    [Figure: Inserted 20, 25 and deleted 7. A tree with pivot keys 21, 11 and 31 and leaves {1, 5, 7}, {12}, {24, 26}, {35}, shown before and after the buffered messages Insert(20), Insert(25) and delete(7) are pushed down. Afterwards the leaves are {1, 5}, {12, 20}, {24, 25, 26}, {35}; Insert(15) is still buffered.]
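The push-down behaviour can be sketched with a toy two-level tree that reuses the key values from the figure. This is a simplified illustration, not MongoDB or Tokutek code: the class `BufferedRoot`, its `capacity` parameter and the single-level layout (one root buffer directly above four leaves) are assumptions made to keep the example short.

```python
from bisect import bisect_right, insort


class BufferedRoot:
    """Two-level toy tree: one root with a message buffer over sorted leaves."""

    def __init__(self, pivots, leaves, capacity):
        self.pivots = pivots      # leaf i holds keys below pivots[i]; the last leaf holds the rest
        self.leaves = leaves      # list of sorted lists
        self.capacity = capacity  # number of messages that fills the buffer
        self.buffer = []          # pending (operation, key) messages

    def send(self, op, key):
        """Buffer a message; flush only when the buffer becomes full."""
        self.buffer.append((op, key))
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        """Apply every buffered message to the leaf responsible for its key."""
        for op, key in self.buffer:
            leaf = self.leaves[bisect_right(self.pivots, key)]
            if op == "insert":
                insort(leaf, key)
            elif op == "delete" and key in leaf:
                leaf.remove(key)
        self.buffer.clear()


tree = BufferedRoot([11, 21, 31], [[1, 5, 7], [12], [24, 26], [35]], capacity=3)
tree.send("insert", 20)
tree.send("insert", 25)
tree.send("delete", 7)   # third message fills the buffer and triggers a flush
tree.send("insert", 15)  # stays buffered; no leaf is touched yet
print(tree.leaves)  # [[1, 5], [12, 20], [24, 25, 26], [35]]
print(tree.buffer)  # [('insert', 15)]
```

Three logical updates reach the leaves in a single flush, which is the batching effect that replaces many small writes with fewer large ones.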
  • 12. When insertions and deletions are made in MongoDB, they are not applied directly to the leaf node. Each one is first stored in a buffer as a message, and when the buffer is full it is pushed down to the next level.
    Conclusion: The Fractal Tree index is a write-optimized data structure that can be used wherever a table sees many insert, delete or update operations. It significantly improves SSD storage performance because of its less frequent, bulk writes. The Fractal Tree index is well suited to point queries and also performs very well on range queries. Even schema changes become simple and fast.
    References:
    [1] http://cdn.oreillystatic.com/en/assets/1/event/36/How%20TokuDB%20Fractal%20Tree%20Databases%20Work%20Presentation.pdf
    [2] http://www.tokutek.com/2010/11/avoiding-fragmentation-with-fractal-trees/
    [3] https://oracleus.activeevents.com/2013/connect/fileDownload/session/C7B372C894D62F395B0EB2C5E0B9AD04/CON4645_Narvaja-MySQL%20Connect%2020130921.pdf
    [4] http://www.mongodb.com/presentations/mongodb-boston-2012/mongodb-and-fractal-tree-indexes
    [5] http://www.odbms.org/wp-content/uploads/2013/11/OptimizingMongoDBWithFTI.pdf