SlideShare une entreprise Scribd logo
1  sur  17
Sub-Join: A Query Optimization Algorithm for Flash-based Database   Zhichao Liang, Da Zhou, Xiaofeng Meng 2009-10-11
Outline ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],Background Read Write Erase Time  25 200 1.5
Motivation ,[object Object],[object Object],HDD SSD Ratio Random Read  3.80 70.76 18.62 Sequential Read  37.45 73.08 1.95 Random Write 3.73 3.71 0.99 Sequential Write 37.69 68.52 1.82 HDD SSD Ratio Time 2.71 1.96 1.38
Motivation (cont.) ,[object Object],[object Object],[object Object],[object Object]
Related works ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Our method : Sub-Join ,[object Object],Sub-Table Construct Sub-Table Join Data Call Back
Our method : Sub-Join (cont.) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],A_ID A_Join
Our method : Sub-Join (cont.) ,[object Object],[object Object],[object Object],[object Object]
Our method : Sub-Join (cont.) ,[object Object],[object Object]
Our method : Sub-Join (cont.) ,[object Object],Original Table Sub Table Join Table Results Table B_ID B_Join B_Other 1 6 aaaaa… 2 3 aaaaa… 3 3 aaaaa… 4 7 aaaaa… 5 5 aaaaa… 6 8 aaaaa… 7 3 aaaaa… 8 1 aaaaa… A_ID A_Join A_Other 1 6 aaaaa… 2 6 aaaaa… 3 3 aaaaa… 4 10 aaaaa… 5 5 aaaaa… 6 3 aaaaa… A_Join A_ID 3 3 3 6 5 5 6 1 6 2 10 4 B_Join B_ID 1 8 3 2 3 3 3 7 5 5 6 1 7 4 8 6 A_ID B_ID Join 3 2 3 3 3 3 3 7 3 6 2 3 6 3 3 6 7 3 5 5 5 1 1 6 2 1 6 Join Other 3 aaaaa… 3 aaaaa… 3 aaaaa… 3 aaaaa… 3 aaaaa… 3 aaaaa… 5 aaaaa… 6 aaaaa… 6 aaaaa…
Performance Evaluation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Performance Evaluation (cont.) ,[object Object],Selectivity=0.01,Record=2KB Selectivity=0.2,Record=2KB
Performance Evaluation (cont.) ,[object Object],Cache=50MB,Record=2KB Cache=128KB,Record=2KB
Performance Evaluation (cont.) ,[object Object],Cache=1MB,Record=2KB Cache=1MB,Record=32KB
Summary ,[object Object],[object Object],[object Object],[object Object],[object Object]
Q&A Thank you!

Contenu connexe

Tendances (8)

OSCh10
OSCh10OSCh10
OSCh10
 
Memory management
Memory managementMemory management
Memory management
 
Ppt cache vs virtual memory without animation
Ppt cache vs virtual memory without animationPpt cache vs virtual memory without animation
Ppt cache vs virtual memory without animation
 
Accordion HBaseCon 2017
Accordion HBaseCon 2017Accordion HBaseCon 2017
Accordion HBaseCon 2017
 
Lecture 25
Lecture 25Lecture 25
Lecture 25
 
Paging and Segmentation in Operating System
Paging and Segmentation in Operating SystemPaging and Segmentation in Operating System
Paging and Segmentation in Operating System
 
Virtual memory managment
Virtual memory managmentVirtual memory managment
Virtual memory managment
 
Memory managment
Memory managmentMemory managment
Memory managment
 

En vedette

Database Introduction - Join Query
Database Introduction - Join QueryDatabase Introduction - Join Query
Database Introduction - Join Query
Dudy Ali
 
Normalization of database_tables_chapter_4
Normalization of database_tables_chapter_4Normalization of database_tables_chapter_4
Normalization of database_tables_chapter_4
Farhan Chishti
 
Distributed_Database_System
Distributed_Database_SystemDistributed_Database_System
Distributed_Database_System
Philip Zhong
 

En vedette (19)

Database Join
Database JoinDatabase Join
Database Join
 
SQL Join Basic
SQL Join BasicSQL Join Basic
SQL Join Basic
 
Relational Algebra,Types of join
Relational Algebra,Types of joinRelational Algebra,Types of join
Relational Algebra,Types of join
 
Database Introduction - Join Query
Database Introduction - Join QueryDatabase Introduction - Join Query
Database Introduction - Join Query
 
Scrum Model
Scrum ModelScrum Model
Scrum Model
 
joins in database
 joins in database joins in database
joins in database
 
Joins in databases
Joins in databases Joins in databases
Joins in databases
 
Database - Normalization
Database - NormalizationDatabase - Normalization
Database - Normalization
 
Everything about Database JOINS and Relationships
Everything about Database JOINS and RelationshipsEverything about Database JOINS and Relationships
Everything about Database JOINS and Relationships
 
SQL
SQLSQL
SQL
 
Database Normalization
Database NormalizationDatabase Normalization
Database Normalization
 
Normalization in Database
Normalization in DatabaseNormalization in Database
Normalization in Database
 
Normalization of database tables
Normalization of database tablesNormalization of database tables
Normalization of database tables
 
Types Of Join In Sql Server - Join With Example In Sql Server
Types Of Join In Sql Server - Join With Example In Sql ServerTypes Of Join In Sql Server - Join With Example In Sql Server
Types Of Join In Sql Server - Join With Example In Sql Server
 
Database Normalization
Database NormalizationDatabase Normalization
Database Normalization
 
A Join Operator for Property Graphs
A Join Operator for Property GraphsA Join Operator for Property Graphs
A Join Operator for Property Graphs
 
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NFDatabase Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
 
Normalization of database_tables_chapter_4
Normalization of database_tables_chapter_4Normalization of database_tables_chapter_4
Normalization of database_tables_chapter_4
 
Distributed_Database_System
Distributed_Database_SystemDistributed_Database_System
Distributed_Database_System
 

Similaire à Sub join a query optimization algorithm for flash-based database

A novel method to extend flash memory lifetime in flash based dbms
A novel method to extend flash memory lifetime in flash based dbmsA novel method to extend flash memory lifetime in flash based dbms
A novel method to extend flash memory lifetime in flash based dbms
Zhichao Liang
 
Strata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesStrata + Hadoop 2015 Slides
Strata + Hadoop 2015 Slides
Jun Liu
 
Thomas+Niewel+ +Oracletuning
Thomas+Niewel+ +OracletuningThomas+Niewel+ +Oracletuning
Thomas+Niewel+ +Oracletuning
afa reg
 
Designing Information Structures For Performance And Reliability
Designing Information Structures For Performance And ReliabilityDesigning Information Structures For Performance And Reliability
Designing Information Structures For Performance And Reliability
bryanrandol
 
Troubleshooting SQL Server
Troubleshooting SQL ServerTroubleshooting SQL Server
Troubleshooting SQL Server
Stephen Rose
 

Similaire à Sub join a query optimization algorithm for flash-based database (20)

A novel method to extend flash memory lifetime in flash based dbms
A novel method to extend flash memory lifetime in flash based dbmsA novel method to extend flash memory lifetime in flash based dbms
A novel method to extend flash memory lifetime in flash based dbms
 
IO Dubi Lebel
IO Dubi LebelIO Dubi Lebel
IO Dubi Lebel
 
Strata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesStrata + Hadoop 2015 Slides
Strata + Hadoop 2015 Slides
 
Thomas+Niewel+ +Oracletuning
Thomas+Niewel+ +OracletuningThomas+Niewel+ +Oracletuning
Thomas+Niewel+ +Oracletuning
 
Designing Information Structures For Performance And Reliability
Designing Information Structures For Performance And ReliabilityDesigning Information Structures For Performance And Reliability
Designing Information Structures For Performance And Reliability
 
Oracle ebs capacity_analysisusingstatisticalmethods
Oracle ebs capacity_analysisusingstatisticalmethodsOracle ebs capacity_analysisusingstatisticalmethods
Oracle ebs capacity_analysisusingstatisticalmethods
 
Troubleshooting SQL Server
Troubleshooting SQL ServerTroubleshooting SQL Server
Troubleshooting SQL Server
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictability
 
Java File I/O Performance Analysis - Part I - JCConf 2018
Java File I/O Performance Analysis - Part I - JCConf 2018Java File I/O Performance Analysis - Part I - JCConf 2018
Java File I/O Performance Analysis - Part I - JCConf 2018
 
Sql server performance tuning and optimization
Sql server performance tuning and optimizationSql server performance tuning and optimization
Sql server performance tuning and optimization
 
Why is my_oracle_e-biz_database_slow_a_million_dollar_question
Why is my_oracle_e-biz_database_slow_a_million_dollar_questionWhy is my_oracle_e-biz_database_slow_a_million_dollar_question
Why is my_oracle_e-biz_database_slow_a_million_dollar_question
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
 
MySQL Cluster Performance Tuning - 2013 MySQL User Conference
MySQL Cluster Performance Tuning - 2013 MySQL User ConferenceMySQL Cluster Performance Tuning - 2013 MySQL User Conference
MySQL Cluster Performance Tuning - 2013 MySQL User Conference
 
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
[db tech showcase Tokyo 2018] #dbts2018 #B17 『オラクル パフォーマンス チューニング - 神話、伝説と解決策』
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
 
MySQL & Expression Engine EEUK2013
MySQL & Expression Engine EEUK2013MySQL & Expression Engine EEUK2013
MySQL & Expression Engine EEUK2013
 
Caching Methodology & Strategies
Caching Methodology & StrategiesCaching Methodology & Strategies
Caching Methodology & Strategies
 
Caching methodology and strategies
Caching methodology and strategiesCaching methodology and strategies
Caching methodology and strategies
 
Sql server troubleshooting
Sql server troubleshootingSql server troubleshooting
Sql server troubleshooting
 
11g R2
11g R211g R2
11g R2
 

Plus de Zhichao Liang

Plus de Zhichao Liang (13)

微软Bot framework简介
微软Bot framework简介微软Bot framework简介
微软Bot framework简介
 
青云虚拟机部署私有Docker Registry
青云虚拟机部署私有Docker Registry青云虚拟机部署私有Docker Registry
青云虚拟机部署私有Docker Registry
 
开源Pass平台flynn功能简介
开源Pass平台flynn功能简介开源Pass平台flynn功能简介
开源Pass平台flynn功能简介
 
青云CoreOS虚拟机部署kubernetes
青云CoreOS虚拟机部署kubernetes 青云CoreOS虚拟机部署kubernetes
青云CoreOS虚拟机部署kubernetes
 
Introduction of own cloud
Introduction of own cloudIntroduction of own cloud
Introduction of own cloud
 
Power drill列存储底层设计
Power drill列存储底层设计Power drill列存储底层设计
Power drill列存储底层设计
 
C store底层存储设计
C store底层存储设计C store底层存储设计
C store底层存储设计
 
Storage Class Memory: Technology Overview & System Impacts
Storage Class Memory: Technology Overview & System ImpactsStorage Class Memory: Technology Overview & System Impacts
Storage Class Memory: Technology Overview & System Impacts
 
A simple introduction to redis
A simple introduction to redisA simple introduction to redis
A simple introduction to redis
 
Memcached简介
Memcached简介Memcached简介
Memcached简介
 
Some key value stores using log-structure
Some key value stores using log-structureSome key value stores using log-structure
Some key value stores using log-structure
 
Hush…tell you something novel about flash memory
Hush…tell you something novel about flash memoryHush…tell you something novel about flash memory
Hush…tell you something novel about flash memory
 
Survey of distributed storage system
Survey of distributed storage systemSurvey of distributed storage system
Survey of distributed storage system
 

Dernier

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Sub join a query optimization algorithm for flash-based database

  • 1. Sub-Join: A Query Optimization Algorithm for Flash-based Database Zhichao Liang, Da Zhou, Xiaofeng Meng 2009-10-11
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.

Notes de l'éditeur

  1. Mehul A. Shah 提出一种按列存储 (PAX Layout)[3] 的方式来提高连接的速度。传统的数据库主要采用行存储模式,即数据表根据元组按照行的方式进行存储。采用按列存储的方式,查询时只需要扫描进行连接操作的列,从而有效地减少了读入的数据量。虽然这个过程中需要比按行存储执行更多的随机读取操作,但是由于固态硬盘具有高速的随机读取速度,所以 PAX Layout 方法能够有效地提高在连接过程中对表的扫描速度。 Mehul A. Shah 同样提出一种 RARE-join (RAndom Read Efficient Join) 的连接算法。 RARE-join 首先对连接列构建索引,然后根据索引连接的结果去原表中读取数据,其中只与结果相关的列和行才被读取出来,从而有效地减少读取的数据。 RARE-join 充分利用了固态硬盘的随机读取性能提高了查询的性能。 Daniel Myers 提出采用 subset[4, 5] 的方法来提高查询中的外排序性能。传统的外排序算法将元组读入内存后进行排序,然后将排序后的所有元组的所有列全部写入存储设备。这个过程中非连接列的数据也会被多次写入到存储设备中。 Daniel Myers 针对固态硬盘的特性在外排序的中间结果写出上进行优化,只将连接列的数据写入到存储设备,从而有效地减少数据的写操作。在排序结束后,再从原始表中获取其它数据,完成最初的排序过程。在从原始表中获取其它数据时,需要进行大量的随机读,这刚好是固态硬盘的最大优点之一。 Daniel Myers 充分利用了固态硬盘的高速读性能和避免其低下的写性能,从而提高了外连接的效率。 在 Daniel Myers 的基础上, Yu Li 提出针对固态硬盘特点而优化的 DigestJoin 查询优化算法 [6] 。 DigestJoin 首先将两个进行查询连接的表进行抽取,获取连接列的 Digest 表,然后对 Digest 表进行嵌套循环连接。在得到连接结果后,采取基于查询图 (Join Graph) 的方法来读入连接查询中位于原始表的数据。基于查询图的方法能够有效地减少数据页的重复读,从而减少读的次数。此外, DigestJoin 采取页载入 (Page Loading) 方式来读入原始表的数据。页载入的方法在读取数据时同时将位于同一页的其它数据也读入内存。如果接下来刚好需要这些数据的话,就可以减少读入次数,从而有效地提高查询的效率。