Presenter: Mehdi Varse
Varse.mehdi@gmail.com
High performance databases
In the name of God
1
Outline
• Performance metrics
• Explain the issue
• Database tuning
• In-memory database
• Parallel database systems
• Distributed database systems
• New high-performance databases
• High-performance database requirements
2
The Problem
• At least 2.5 exabytes of data are produced every day
• Google processes 3.5 billion requests per day
• Wal-Mart registers one million customer transactions every hour
• Updates/posts:
  • Google: 34,000 searches per second
  • Yahoo: 3,200 searches per second
  • Facebook status updates: 700 per second
  • Twitter tweets: 600 per second
  • Buzz posts: 55 per second
3
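The figures on this slide use different time units. A minimal sketch that converts the per-hour and per-day numbers to per-second rates so they can be compared with the per-second ones; the helper name per_second is made up for illustration.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def per_second(count, period_seconds):
    """Convert a count observed over a period into an average per-second rate."""
    return count / period_seconds


# Figures taken from the slide above.
walmart_tps = per_second(1_000_000, SECONDS_PER_HOUR)      # transactions per second
google_rps = per_second(3_500_000_000, SECONDS_PER_DAY)    # requests per second

print(f"Wal-Mart: about {walmart_tps:.0f} transactions per second")
print(f"Google:   about {google_rps:.0f} requests per second")
```

These are averages only; peak load can be well above the average rate.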
Performance metrics to monitor in enterprise applications
• Business Transactions
• Query Performance
• User and Query Conflicts
• Capacity
• Configuration
• NoSQL Databases
4
Database Tuning
Database Tuning is the activity of making a database application run more quickly. “More quickly” usually means higher throughput, though it may mean lower response time for time-critical applications.
5
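Throughput and response time are different measurements, so a tuning exercise should record both. A minimal sketch, assuming a single-threaded caller; the function names summarize and measure are hypothetical, and the operation passed in stands for any database call.

```python
import time
from statistics import mean


def summarize(latencies_s, wall_time_s):
    """Derive throughput and response-time figures from raw timings."""
    return {
        "throughput_per_s": len(latencies_s) / wall_time_s,  # operations completed per second
        "mean_response_s": mean(latencies_s),                # typical wait for one operation
        "worst_response_s": max(latencies_s),                # matters for time-critical work
    }


def measure(operation, repetitions):
    """Run `operation` repeatedly and time each call as well as the whole run."""
    latencies = []
    start = time.perf_counter()
    for _ in range(repetitions):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    return summarize(latencies, time.perf_counter() - start)


if __name__ == "__main__":
    # Stand-in workload; replace with a real query in practice.
    print(measure(lambda: sum(range(10_000)), repetitions=100))
```

With concurrent clients the two numbers can move independently: throughput may rise while each individual request takes just as long, or longer.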
[Figure: the layers of a database system and the people who tune them. Layers, from hardware upward: Hardware (processors, disks, memory); Operating System; Concurrency Control and Recovery; Storage Subsystem and Indexes; Query Processor; Application. Roles: Application Programmer (e.g., business analyst, data architect); Sophisticated Application Programmer (e.g., SAP admin); DBA/Tuner.]
6
Memory tuning
• Main memory is one of the most important factors affecting database performance
7
Query Cache
• The query cache stores the results of SELECT queries
• It is most useful when the underlying data changes rarely
• Sample:
  • On a Linux Alpha 2×500MHz system with 2GB RAM and a 64MB query cache, searches for a single row in a single-row table are 238% faster with the query cache than without it
8
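The idea of a query cache can be sketched in a few lines. This is a toy illustration, not how any particular database implements it: the class CachedConnection is hypothetical, it keys results on the SQL text plus parameters, and it simply discards the whole cache on every write.

```python
import sqlite3


class CachedConnection:
    """Toy query cache: remembers SELECT results until the next write."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.hits = 0

    def select(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self.cache:              # identical query seen before: skip the database
            self.hits += 1
            return self.cache[key]
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows

    def write(self, sql, params=()):
        self.conn.execute(sql, params)
        self.conn.commit()
        self.cache.clear()                 # any change may invalidate cached results


db = CachedConnection(sqlite3.connect(":memory:"))
db.write("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
db.write("INSERT INTO t (name) VALUES (?)", ("alice",))

first = db.select("SELECT name FROM t WHERE id = ?", (1,))   # miss, reads the table
second = db.select("SELECT name FROM t WHERE id = ?", (1,))  # hit, served from the cache
db.write("UPDATE t SET name = ? WHERE id = ?", ("bob", 1))   # write empties the cache
third = db.select("SELECT name FROM t WHERE id = ?", (1,))   # miss again, sees new data
```

Because every write empties the cache, the benefit shrinks as the write rate grows, which is why the cache pays off mainly for data that changes rarely.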
Database caching
• Database caching is a technique built into the design of an application
• It is used to achieve high scalability and performance
• It improves scalability by moving query workload from the backend database to multiple inexpensive front-end systems
9
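How much load the front-end caches take off the backend can be shown with a small simulation. A sketch under simple assumptions: the classes Backend and FrontEnd are hypothetical, the data is read-only, and requests are spread round-robin.

```python
class Backend:
    """Stand-in for the backend database; counts the queries it has to serve."""

    def __init__(self, data):
        self.data = data
        self.queries_served = 0

    def query(self, key):
        self.queries_served += 1
        return self.data[key]


class FrontEnd:
    """Hypothetical front-end node with a local cache in front of the backend."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def get(self, key):
        if key not in self.cache:          # cache miss: ask the backend once
            self.cache[key] = self.backend.query(key)
        return self.cache[key]


backend = Backend({f"user:{i}": f"profile {i}" for i in range(5)})
front_ends = [FrontEnd(backend) for _ in range(4)]

# 100 read requests spread round-robin over 4 front ends and 5 keys.
for i in range(100):
    front_ends[i % 4].get(f"user:{i % 5}")

print(backend.queries_served)  # each front end fetches each key at most once
```

Here the backend serves 20 of the 100 requests (4 front ends times 5 keys); the rest are answered from the front-end caches. Real deployments also need an invalidation or expiry policy, which this sketch leaves out.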
Changing Database engine
• Changing the storage engine can give you support for the ACID properties
10
In-memory database
• An in-memory database system is a database management system that stores data entirely in main memory.
• Used in applications where response time is critical
• SQLite in memory (C API): rc = sqlite3_open(":memory:", &db);
• With non-volatile RAM, in-memory databases can run at full speed and still keep their data after a power failure; with ordinary RAM the contents are lost unless they are also written to persistent storage.
• Examples of in-memory databases:
  • Redis (VMware / Pivotal Software, 2009)
  • SQLite (in its ":memory:" mode)
11
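The same ":memory:" database can be opened from Python's standard sqlite3 module. A minimal sketch; the table and key names are made up for illustration. It also shows the volatility of a purely in-memory store: a second connection starts with an empty database.

```python
import sqlite3

# Open a private in-memory database (the Python counterpart of sqlite3_open(":memory:", &db)).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("greeting", "hello"))
value = conn.execute("SELECT v FROM kv WHERE k = ?", ("greeting",)).fetchone()[0]
conn.close()  # the database and its contents disappear here

# A new ":memory:" connection is a different, empty database.
fresh = sqlite3.connect(":memory:")
tables = fresh.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
fresh.close()

print(value, tables)
```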
parallel database systems
Goals of parallel database systems:
  • High performance
  • Scalability
  • Fault-tolerant database management
Three key components of a high-performance parallel DBMS:
  • Data partitioning strategies
  • Algorithms for parallel processing of the join operator
  • A framework that controls the placement of data
Examples: Oracle Parallel Server, IBM's DB2 Parallel Edition
12
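Two of the components above, data partitioning and a parallel join, can be sketched together. This is an illustration only: hash partitioning is one of several partitioning strategies, the function names are hypothetical, and the worker threads merely stand in for separate processors or nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def hash_partition(rows, key_index, n_parts):
    """Place each row in a partition chosen by hashing its join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key_index]) % n_parts].append(row)
    return parts


def local_hash_join(left, right):
    """Join two co-located partitions of (key, payload) rows with an in-memory hash table."""
    table = {}
    for key, payload in right:
        table.setdefault(key, []).append(payload)
    return [(key, lp, rp) for key, lp in left for rp in table.get(key, [])]


def parallel_join(left, right, n_parts=4):
    """Partition both inputs on the join key, then join matching partitions in parallel."""
    left_parts = hash_partition(left, 0, n_parts)
    right_parts = hash_partition(right, 0, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = pool.map(local_hash_join, left_parts, right_parts)
    return sorted(row for part in results for row in part)


orders = [(1, "book"), (2, "pen"), (1, "lamp"), (3, "desk")]
customers = [(1, "Ann"), (2, "Bob"), (4, "Cy")]
joined = parallel_join(orders, customers)
print(joined)
```

Because both inputs are partitioned with the same hash function, rows with equal keys always land in the same partition, so each worker can join its pair of partitions without talking to the others.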
parallel database systems
The hardware platform
13
Designing distributed database systems
• A distributed database may be stored on multiple computers located in the same physical location, or it may be dispersed over a network of interconnected computers.
• Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, a distributed database system consists of loosely coupled sites that share no physical components.
14
NoSQL Databases
• The name originally stood for “no SQL” or “not only SQL”
• Designed to address scalability and performance issues
• Typically support eventual consistency rather than ACID
• Divided into four categories:
I. Key-value stores such as Redis
II. Document databases such as MongoDB
III. Graph databases such as Neo4j
IV. Column-oriented databases such as Cassandra
15
High-Performance Database requirements
• Select one or more databases that suit your data types
• Provide a hardware platform (memory, disk, and CPU) that matches the selected database
• Use a high-speed network to connect the nodes if you want to use a distributed database
• Tune your database for optimal use of resources
• Optimize your queries
16
Review data stores used in Facebook
• MySQL:
  • Stores data such as wall posts, user information, and the timeline
  • This data is replicated between their various data centers.
• Memcached:
  • Facebook makes heavy use of Memcached, a memory caching system, to reduce read time
• Haystack:
  • For each uploaded photo, Facebook generates and stores four images of different sizes
  • Current growth rate is 220 million new photos per week
  • Implements an HTTP-based photo server that stores photos in a generic object store called Haystack
17
Review databases used in Facebook
• Cassandra:
  • The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance
  • Facebook uses it for its Inbox search.
18
Cassandra
high performance database
• Distributed
• High Performance
• Extremely Scalable
• Fault tolerant (no single point of failure)
19
Cassandra
Architecture Overview
• Cassandra was designed with the understanding that system and hardware failures can and do occur.
• Peer-to-peer, distributed system
• All nodes are the same
• Custom data replication to ensure fault tolerance
• Read/write-anywhere design
20
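A peer-to-peer design with replication can be illustrated with a small consistent-hashing ring. This is a simplified sketch, not Cassandra's actual implementation: the Ring class, the MD5-based token function, and the node names are all assumptions made for the example, and real clusters use their own partitioner and replication strategies.

```python
import hashlib
from bisect import bisect_right


def token(value):
    """Map a string to a position on the ring (illustrative hash choice)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class Ring:
    """All nodes are equal peers; each key is stored on several consecutive nodes."""

    def __init__(self, nodes, replication_factor=3):
        self.rf = replication_factor
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key):
        """Walk clockwise from the key's position and collect the next `rf` nodes."""
        start = bisect_right(self.tokens, token(key)) % len(self.ring)
        count = min(self.rf, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def without(self, node):
        """Return the ring as it looks after one node has failed."""
        return Ring([n for _, n in self.ring if n != node], self.rf)


ring = Ring(["n1", "n2", "n3", "n4", "n5"])
owners = ring.replicas("user:42")
survivors = ring.without(owners[0]).replicas("user:42")
print(owners, survivors)
```

When the first replica fails, the two remaining replicas still hold the key and a further node joins the replica set, so no single node is a point of failure for that key.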
Conclusion
• At small scale, relational databases perform better than NoSQL databases
• If you need to execute complex queries, a relational database is the best choice
• If you need a large-scale or distributed database, you can use a NoSQL database
21
Attachment
24
performance metrics
Business Transactions
Business transactions provide insight into real user behavior: they capture the real-time performance that real users experience as they interact with your application. Monitoring them involves capturing the response time of each business transaction.
25
performance metrics
Query Performance
The most obvious place to look for poor query performance is in the query itself. Problems can result from queries that take too long to identify the required data or to bring the data back. Look for the following issues in queries:
• Selecting more data than needed
• Inefficient joins between tables
• Too few or too many indexes
• Too much literal SQL causing parse contention
26
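Several of these issues can be checked directly against the query plan. A minimal sketch using SQLite; the orders table and the index name idx_orders_customer are made up for illustration. The query selects only the column it needs, uses a bound parameter instead of literal SQL, and the plan is inspected before and after adding an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL, note TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total, note) VALUES (?, ?, ?)",
    [(i % 100, i * 1.5, "n/a") for i in range(1000)],
)

# Select only the needed column, and bind the value rather than embedding it in the SQL text.
QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(sql, params):
    """Return SQLite's query plan description as a single string."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params))


plan_before = plan(QUERY, (7,))   # no usable index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(QUERY, (7,))    # the index is now used to find matching rows

totals = [row[0] for row in conn.execute(QUERY, (7,))]
print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, and other databases expose the same information through their own EXPLAIN facilities.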
performance metrics
User and Query Conflicts
Databases are designed to be multi-user, but the activities of multiple users can cause conflicts:
• Page/row locking due to slow queries
• Transactional locks and deadlocks
• Batch activities causing resource contention for online users
27
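A deadlock arises when two transactions each hold a lock the other needs. One common way to avoid it is to acquire locks in a fixed global order. A sketch of that idea, with in-process locks standing in for row locks; the Account class and transfer function are hypothetical.

```python
import threading


class Account:
    """Stand-in for a database row protected by its own lock."""

    def __init__(self, account_id, balance):
        self.id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move money between two accounts, always locking the lower id first."""
    first, second = sorted((src, dst), key=lambda account: account.id)
    with first.lock, second.lock:      # same order in every thread, so no lock cycle can form
        src.balance -= amount
        dst.balance += amount


a, b = Account(1, 100), Account(2, 100)

# Transfers in opposite directions would risk deadlock if each locked its source first.
threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=(b, a, 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)

print(a.balance, b.balance)
```

Database engines usually detect deadlocks and abort one of the transactions; consistent lock ordering in the application reduces how often that happens.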
performance metrics
Capacity
Not all database performance issues are database issues. Some problems result from running the database on inadequate hardware:
• Not enough CPUs, or CPU speed too slow
• Slow disk
• Full or misconfigured disks
• Not enough memory
• Slow network
28
performance metrics
Configuration
Every database has a large number of configuration settings. Default values may not be enough to give your database the performance it needs:
• Buffer cache too small
• No query caching
• I/O contention due to temporary table creation on disk
29
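As a concrete case, SQLite exposes its page cache size and temporary-storage location through PRAGMA statements. A minimal sketch; the 64 MiB value is an arbitrary example, not a recommendation, and other databases use their own setting names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

default_cache = conn.execute("PRAGMA cache_size").fetchone()[0]

# A negative cache_size is interpreted as a size in KiB: -65536 asks for about 64 MiB.
conn.execute("PRAGMA cache_size = -65536")

# Keep temporary tables and indexes in memory instead of on disk.
conn.execute("PRAGMA temp_store = MEMORY")

cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
temp_store = conn.execute("PRAGMA temp_store").fetchone()[0]   # 2 means MEMORY

print(default_cache, cache_size, temp_store)
```

Any such change should be measured against the workload before and after, since a larger cache only helps if memory is actually available.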
performance metrics
NoSQL Databases
NoSQL has much appeal because of its ability to handle large amounts of data very rapidly. However, some disadvantages should be assessed when weighing whether NoSQL is right for your use case:
• Finicky transactions
• Complex databases
• Consistent JOINs
• Flexibility in schema design
• Resource intensive
30
