Showdown: DB2 vs. Oracle Database for OLTP
Conor O’Mahony
Email: [email_address]
Twitter: conor_omahony
Blogs: db2news.wordpress.com and database-diary.com
Agenda
Technology for OLTP Performance
Efficient I/O
Large Memory and Efficient Memory Usage
User Scalability
Agenda
Transactional Performance
Longevity in TPC-C Performance (results as of April 21, 2008)
Apples-to-Apples Comparison: 16% faster. Results current as of Feb 24, 2008; check http://www.tpc.org for the latest results.
Longevity in SAP 3-Tier SD Performance (results as of Jan 8, 2008)
SAP SD 3-tier
2-tier SAP SD Benchmarks (results as of April 8, 2008)
Agenda
Oracle RAC - Single Instance Wants to Read a Page (diagram: Instance 1 and Instance 2 buffer caches communicating through GCS processes, steps 1 through 6, to locate page 501)
What Happens in DB2 pureScale to Read a Page (diagram: a member's db2agent and buffer pool communicating with the group buffer pool in the PowerHA pureScale CF, steps 1 through 4, to locate page 501)
The Advantage of DB2 Read and Register with RDMA: the member's db2agent performs a direct remote memory write of its request to the CF ("I want page 501. Put it into slot 42 of my buffer pool."), and the CF thread replies with a direct remote memory write of the response ("I don't have it, get it from disk"). Much more scalable, and does not require locality of data.
Transparent Application Scalability
Scalability for OLTP Applications:
2, 4 and 8 members: over 95% scalability
16 members: over 95% scalability
32 members: over 95% scalability
64 members: 95% scalability
88 members: 90% scalability
112 members: 89% scalability
128 members: 84% scalability
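As a rough arithmetic check, the scalability percentages above imply an effective throughput multiplier of roughly members × scalability (a simplification that treats "over 95%" as exactly 0.95):

```python
# Effective throughput multiplier implied by the quoted scalability
# figures: N members at scalability s deliver roughly N * s times the
# throughput of a single member.
results = {2: 0.95, 4: 0.95, 8: 0.95, 16: 0.95, 32: 0.95,
           64: 0.95, 88: 0.90, 112: 0.89, 128: 0.84}
for members, scalability in sorted(results.items()):
    effective = members * scalability
    print(f"{members:3d} members -> ~{effective:.0f}x single-member throughput")
```

So even at 128 members, where scalability has dropped to 84%, the cluster still delivers on the order of 108 times a single member's throughput in this simple model.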
Agenda
Online Recovery (diagram: DB2 members with their logs, duplexed CFs, and shared data; after a database member failure, only data with in-flight updates is locked during recovery, and the percentage of data available returns to 100% within ~seconds)
Steps Involved in DB2 pureScale Member Failure
Failure Detection for Failed Member
Member Failure Summary (diagram: clients connected through a single database view to DB2 members with their logs and shared data; primary and secondary CFs each track updated pages and global locks; one member is terminated with kill -9)
Steps Involved in a RAC Node Failure: unlike DB2 pureScale, Oracle RAC does not centralize its lock manager or data cache.
With RAC – Access to the GRD and Disks Is Frozen: when Instance 1 fails, I/O requests are frozen and no lock updates are made to the GRD; no more I/O can occur until the pages that need recovery are locked.
With RAC – Pages that Need Recovery Are Locked: the recovery instance reads the redo log of the failed node and locks the pages that need recovery; the log must be read and those pages locked before the freeze is lifted.
DB2 pureScale – No Freeze at All: there is no I/O freeze, because the CF, as central lock manager, always knows which changes are in flight, and therefore which rows on which pages had in-flight updates at the time of failure.
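The recovery contrast drawn on these slides can be sketched as follows. In the RAC-style model nothing is accessible until the failed node's redo log has been scanned, while in the pureScale-style model only the pages the CF already knows had in-flight updates are locked. This is a simplified illustration of the slides' argument, not either product's actual algorithm:

```python
# Sketch of the recovery difference.  100 pages; 3 had uncommitted
# (in-flight) updates on the failed member at the time of failure.
all_pages = set(range(100))
in_flight_at_failure = {3, 17, 42}

def rac_accessible(redo_log_scanned):
    # RAC-style: I/O is frozen until the recovery instance has read the
    # failed node's redo log and locked the pages needing recovery.
    if not redo_log_scanned:
        return set()                        # freeze: nothing accessible
    return all_pages - in_flight_at_failure

def purescale_accessible():
    # pureScale-style: the CF tracks in-flight updates continuously, so
    # surviving members keep accessing everything else immediately.
    return all_pages - in_flight_at_failure

print(len(rac_accessible(False)))    # frozen during the log scan
print(len(purescale_accessible()))   # available at once
```

The end state is the same in both models (97 of 100 pages accessible); the difference is the window during which the RAC-style cluster serves nothing at all.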
Agenda
Sample of Feedback…
Thank You!
Conor O’Mahony
Email: [email_address]
Twitter: conor_omahony
Blog: database-diary.com

Editor's Notes

  1. What are the most important features in an RDBMS for OLTP transactions? To deliver very high throughput for transactional systems, the RDBMS must be able to perform I/O operations efficiently without holding up the transaction, must use memory efficiently, and must handle large numbers of concurrent users effectively. These three critical areas enable a database server to deliver very high levels of performance, and we explore each of them in detail on the following three slides.
  2. In very high volume transactional systems, the logger can quickly become the bottleneck. DB2 (and other RDBMSs) can do most of the work a transaction requires completely in memory: updates occur in the buffer pool, and authorizations and access plans are all cached in memory. However, one thing cannot happen in memory. Whenever a transaction performs a COMMIT, the information that tells the RDBMS how to redo that transaction (i.e., the log information) must be flushed to disk. If the committed transaction were not recorded in the log files on disk, it would be possible to lose committed transactions. Therefore, all database servers write records to the log files whenever a user commits a transaction. To achieve very high concurrency and throughput, it is essential that the logger be as efficient as possible, since these I/Os can quickly become the bottleneck (disk access is significantly slower than memory access). There is a very strong proof point that DB2 has a much more efficient logger than its competitors. TPC-C is an industry-standard transaction processing benchmark in which all major database vendors participate; each vendor runs its own benchmarks to try to demonstrate it has the best RDBMS, and DB2 comes out on top more often than any other vendor (but more on that later). One interesting requirement of TPC-C is that you must also publish how much log space you consume during the benchmark run. By comparing the log space consumed (knowing that this standard benchmark requires every vendor to run the exact same transactions over and over again), we can compare the efficiency of the three database vendors' loggers. The most current TPC-C results (as of March 18, 2008) are shown on this chart. You can see that for each standard TPC-C transaction, DB2 produced 2.4 KB of log.
Oracle’s top result, with 10g R2 (there was no 11g top result as of 3/18/2008), consumes twice as much log space, meaning the DB2 logger is twice as efficient and can therefore deliver higher levels of throughput. Oracle also ran a TPC-C benchmark with RAC and consumed 20x more log space than DB2; ask Oracle why RAC consumes so much log space for the same transactions! Microsoft's SQL Server 2005 result was even worse than Oracle's, consuming more than 2.5x the log space of DB2. These benchmarks are the most highly tuned database systems available, tuned by each vendor's own benchmark experts. This reduction in logging is one of the reasons why DB2 delivers better OLTP performance than Oracle and Microsoft.
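The commit-time bottleneck described in this note can be sketched as a minimal write-ahead log. The names here (`WalLogger`, `log_update`, `commit`) are illustrative only, not DB2's actual interfaces:

```python
import os

class WalLogger:
    """Minimal write-ahead-log sketch: updates stay in memory, but
    COMMIT must force the log records to stable storage."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.pending = []          # log records not yet on disk
        self.bytes_logged = 0

    def log_update(self, txn_id, record):
        # In-memory only: buffer-pool changes require no I/O yet.
        self.pending.append(f"{txn_id}:{record}\n".encode())

    def commit(self, txn_id):
        # The one step that cannot stay in memory: write + fsync.
        # Fewer log bytes per transaction means fewer bytes through
        # this bottleneck, hence higher commit throughput.
        data = b"".join(self.pending) + f"{txn_id}:COMMIT\n".encode()
        os.write(self.fd, data)
        os.fsync(self.fd)          # durable only after this returns
        self.bytes_logged += len(data)
        self.pending.clear()

logger = WalLogger("/tmp/demo.wal")
logger.log_update(1, "UPDATE stock SET qty=9 WHERE id=501")
logger.commit(1)
print(logger.bytes_logged)   # total log bytes forced to disk
```

In this model, a logger that emits 2.4 KB per transaction pushes roughly half as many bytes through the `fsync` path as one emitting around 5 KB for the same work, which is the efficiency argument the note makes.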
  3. Efficient use of memory is also critical for high-volume transaction processing. Given a limited amount of physical memory, you want your database to utilize it to the fullest in order to improve the throughput of your system. DB2 has two unique advantages over Oracle in this area. The first is that DB2 allows for multiple buffer pools. In Oracle you can have only one buffer pool per page size (i.e., one 4 KB pool, one 8 KB pool, one 16 KB pool and one 32 KB pool). This can severely limit your ability to utilize the memory on the server to tune the system for optimal performance. DB2, however, allows as many buffer pools of any page size as you like. For example, you can have 4 buffer pools of 4 KB pages and another 5 buffer pools of 8 KB pages, choosing the buffer pool configuration that best suits your transaction processing needs. As an example, on a server with 2 TB of real memory in a TPC-C benchmark, DB2 allocated several buffer pools of different page sizes whose total size in aggregate was 1.9 TB. With the new threaded engine in DB2 9.5, there is even more advantage over Oracle. By using threads rather than processes for user connections, the amount of memory consumed per connection is significantly lower. This allows more user connections for a given amount of memory and leaves more memory available to other areas of DB2 (like the buffer pools). This better memory utilization again results in higher throughput and better performance. Later in this presentation we will talk about the Self-Tuning Memory Manager (STMM), which shows that DB2 not only exploits memory better for higher performance, but does so with less administration required.
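A sketch of the multiple-buffer-pool idea, assuming a simple LRU replacement policy; the class and pool names are hypothetical illustrations, not DB2 configuration syntax:

```python
from collections import OrderedDict

class BufferPool:
    """One pool of fixed-size pages with simple LRU eviction."""
    def __init__(self, page_size_kb, num_pages):
        self.page_size_kb = page_size_kb
        self.capacity = num_pages
        self.pages = OrderedDict()            # page_id -> page bytes

    def get(self, page_id, read_from_disk):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # hit: refresh LRU position
            return self.pages[page_id]
        page = read_from_disk(page_id)        # miss: fetch and cache
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)    # evict least recently used
        self.pages[page_id] = page
        return page

# DB2-style configuration: several pools may share the same page size,
# so a hot table can be isolated from large scans that would otherwise
# flush its pages.  Oracle's one-pool-per-page-size rule disallows the
# second 4 KB pool below.
pools = {
    "orders_4k":  BufferPool(4, 1000),
    "history_4k": BufferPool(4, 100),    # second 4 KB pool
    "index_8k":   BufferPool(8, 500),
}
page = pools["orders_4k"].get(501, lambda pid: b"\x00" * 4096)
```

The design point is isolation: a scan hammering `history_4k` can only evict its own 100 pages, leaving the hot `orders_4k` pool untouched.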
  4. The final area that is critical to high transaction performance is the ability to support large numbers of concurrent users. Both DB2 and Oracle have the ability to do connection concentration to reduce memory requirements on the server. However, only DB2 has the threaded engine mentioned on the previous slide. This enables DB2 to scale higher than Oracle on the same server with the same amount of memory and therefore deliver higher throughput.
  5. There are several transaction processing benchmarks that demonstrate DB2's performance leadership over Oracle. The first is TPC-C, an industry-standard transaction processing benchmark. SAP Sales and Distribution (SD) is also a widely used performance benchmark, simulating real-world SAP transactions. The third transaction processing benchmark, SPECjAppServer, measures the performance of a web-based Java application on the database system. We will discuss each of these benchmarks on the following slides.
  6. Benchmarks are often a leapfrog game where, on any given day, one database vendor can be in front of the rest if it runs on some newly announced hardware or the latest software versions. This chart represents days of leadership for TPC-C from Jan 1, 2003 through April 21, 2008, measuring how long each vendor has held the top spot. Over this five-year period, DB2 has been in the leadership position almost 2x longer than Oracle, and in fact has led longer than all other database vendors combined.
  7. It is not very often that you get an apples-to-apples comparison where two database vendors run their benchmarks on exactly the same hardware. This result is slightly dated (DB2 V8 against Oracle 10g); however, it shows that on exactly the same hardware, DB2 delivered 16% better performance than Oracle. In fact, you would need 10 CPUs of Oracle to match the performance of 8 CPUs of DB2 on this class of server.
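The "10 CPUs of Oracle to match 8 CPUs of DB2" figure follows from the 16% gap, under the simplifying assumption that throughput scales linearly with CPU count (real systems only approximate this):

```python
import math

db2_cpus = 8
advantage = 1.16   # DB2 throughput / Oracle throughput on identical hardware

# Oracle CPUs needed for equal throughput, assuming linear CPU scaling:
# 8 * 1.16 = 9.28, and you cannot buy 0.28 of a CPU, so round up.
oracle_cpus = math.ceil(db2_cpus * advantage)
print(oracle_cpus)   # 10
```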
  8. In fact, Oracle has rarely been able to challenge DB2 on the SAP SD 3-tier benchmark over the past 5 years. This chart represents days of leadership for SAP SD 3-tier since Jan 1, 2003. As you can see, over the last 5 years DB2 has held the lead 8 times longer than Oracle (the only other competitor to lead in this timeframe).
  9. This result shows the top SAP SD 3-tier benchmark results as of March 18, 2008. SAP Sales and Distribution 3-tier represents a configuration where the database software runs on its own server hardware with several SAP application servers in the middle tier. This is the configuration on which most enterprise customers would run their SAP workloads, and DB2 has demonstrated clear performance leadership in this area.
  10. On the SAP SD 2-tier benchmark, DB2 leads Oracle by 18% while using half the number of processor cores. On April 8, 2008, DB2 9.5 running on a 64-core IBM Power 595 with AIX 6.1 delivered 35,400 SD users. Oracle's top result is 30,000 SD users with 10g running on a 128-core HP Integrity Superdome with HP-UX.
  11. A server process that wants to access a data page, for example page 501, will first check to see if that page is in its local buffer pool (step 1). If this page is not found, the server process will send an inter-process communication (IPC) request to a GCS process in order to ask the master node for that data page (step 2). This results in the server process yielding the CPU and the CPU performing a context switch to re-establish the GCS process on the CPU to process the interrupt; high levels of context switching can be very costly. The GCS process then sends an IP request to the master node for the data block being requested (step 3). Because IP calls are processed in the operating system kernel, the GCS process has to copy the requested information into kernel memory and then execute expensive IP stack calls to push the request to the remote node. Even if an InfiniBand network is being used, Oracle still uses IP over InfiniBand or, in some cases, Reliable Datagram Sockets (RDS). Use of a socket protocol, even over InfiniBand, is costly due to processor interrupts, IP stack traversal, and so on. Next, the remote master GCS process will receive an interrupt and will be scheduled on the CPU to process the request. It will check to see if any other member has the page in its buffer cache. In this example, no member has the page, so the GCS process will send an IP message back to the requester telling it to read the page from disk (step 4). The GCS process on the requesting node will be interrupted again to process the incoming IP request, and will in turn send an IPC interrupt (step 5) to the server process to inform it that no other node in the cluster has the page. The server process will then read the page from disk into its own buffer cache (step 6).
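The six steps above can be sketched as a toy event trace that counts the expensive transitions (context switches, kernel IP calls) on a buffer-cache miss. This is an illustration of the message flow as described in this note, not Oracle's actual code path:

```python
# Toy trace of the six-step RAC read on a buffer-cache miss.
trace = []

def rac_read(page_id, local_cache, master_has_page=False):
    if page_id in local_cache:                                # step 1: hit
        trace.append("local-hit")
        return local_cache[page_id]
    trace.append("IPC to local GCS (context switch)")         # step 2
    trace.append("IP request to master GCS (kernel copy)")    # step 3
    if not master_has_page:
        trace.append("IP reply: read from disk (kernel copy)")   # step 4
        trace.append("IPC interrupt to server process")          # step 5
        trace.append("disk read into local buffer cache")        # step 6
    page = b"page-%d" % page_id
    local_cache[page_id] = page
    return page

cache = {}
rac_read(501, cache)
print(len(trace))   # 5 costly hops on a miss
```

Even in this toy model, a single cold read crosses two IPC boundaries and two kernel IP calls before the disk read begins, which is the overhead the note attributes to the socket-based protocol.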
  12. This slide illustrates the advantage of DB2 pureScale for very efficient access to data. A comparison to Oracle RAC follows in the next section. The steps listed above show how DB2 pureScale communicates with the CF to declare its intent to access a data page. Steps 2 and 3 are the critical success factors for DB2 pureScale efficiency: when there is a need to communicate with the centralized CF, that communication uses RDMA. Essentially, the process on Member 1 writes its request directly into the memory of the CF. This is done without going through the IP socket stack, without context switching, and in many cases without having to yield the CPU (the round-trip communication time between the two servers can be as little as 15 microseconds).
  15. To dive deeper into the “secret sauce,” let’s look at exactly how the member communicates with the CF. If an agent on Member 1 wants to read a page, it writes its request directly into the memory of the CF, telling the CF exactly which page it wants and even which slot in Member 1’s buffer pool the page should go into. If the CF does not have the page, it writes a message directly into the memory of Member 1 to indicate that it doesn’t have the page. If the CF does have the page, it writes the data page directly into memory on Member 1 without any context switching or IP stack calls.
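A minimal sketch of that exchange, with plain Python objects standing in for pinned memory regions. Real RDMA uses hardware verbs rather than method calls, and all names here are invented for illustration:

```python
class Member:
    """Stand-in for a pureScale member with a fixed-size local buffer pool."""
    def __init__(self, n_slots):
        self.buffer_pool = [None] * n_slots

class CF:
    """Stand-in for the coupling facility with its group buffer pool."""
    def __init__(self, cached_pages):
        self.group_buffer_pool = dict(cached_pages)
        self.request_area = None  # memory the member writes into via RDMA

def request_page(member, cf, page_id, slot):
    # The member RDMA-writes its request (page id plus the destination
    # slot in its own buffer pool) straight into CF memory: no socket,
    # no kernel IP stack, no interrupt-driven context switch.
    cf.request_area = (page_id, slot)
    page = cf.group_buffer_pool.get(page_id)
    if page is None:
        # The CF RDMA-writes a "not found" reply into the member's memory.
        return "not in CF: member reads from disk"
    # The CF RDMA-writes the page directly into the member's chosen slot.
    member.buffer_pool[slot] = page
    return "page written into slot by CF"
```

The key design point the note makes is that the requester names the destination slot up front, so the CF can complete the transfer with a single direct memory write.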
  16. As previously mentioned, the critical success factor for scalability in an active-active cluster is ensuring that when a transaction requests a piece of data, it can get that data with the lowest possible latency. With DB2 pureScale, by centralizing data that is of interest to more than one member in the cluster, and by accessing that data using RDMA in an interrupt-free processing environment, you can see near-linear scalability even out to dozens of nodes. More importantly, you do not need to design your application to be cluster aware; there is no need to route transactions that access the same data pages to a single node. In practice, this is not the case with Oracle RAC. Many stories on the internet, and several published books on Oracle RAC, tell customers to avoid hot pages being passed between nodes by using one of the methods described in the last 4 bullets of the above slide. These methods require costly DBA and application developer intervention, as well as potential application rework as the size of the cluster changes.
  17. To demonstrate the scalability of DB2 pureScale, the lab set up a configuration comprising 128 members (note that for server consolidation environments it is possible to put multiple members on one SMP server). A workload was created with a read-to-write ratio of roughly 90:10. To prove the scalability of the architecture, the application has no cluster awareness: it updates or selects a random row, so every row in the database is touched by all members in the cluster (this was done to show that locality of data is not as essential for scaling as it is with other shared-disk architectures). The results of this 128-member test show near-linear scaling. Up to 64 members, scalability (relative to the 1-member result) remains above 95%, and at 128 members it was 84%. Note that this is a validation of the architecture and includes some capabilities under development that will not be in the December GA code.
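The workload and the efficiency metric described above can be sketched as follows. The 90:10 split and the efficiency formula match the figures quoted; the function names and parameters are invented for the sketch, not IBM’s actual test harness:

```python
import random

def run_workload(table_size, n_ops, seed=0):
    """Cluster-unaware workload: uniformly random rows, ~90% reads."""
    rng = random.Random(seed)
    reads = writes = 0
    for _ in range(n_ops):
        row = rng.randrange(table_size)  # no locality, no routing by key
        if rng.random() < 0.90:
            reads += 1   # SELECT of a random row
        else:
            writes += 1  # UPDATE of a random row
    return reads, writes

def scaling_efficiency(throughput_n, n_members, throughput_1):
    """Measured throughput relative to perfect linear scaling."""
    return throughput_n / (n_members * throughput_1)
```

Under this metric, the quoted 84% at 128 members means the cluster delivered about 107.5x the single-member throughput instead of the ideal 128x.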
  18. The second key feature of DB2 pureScale is the high availability it provides. Again, the secret to its success is the centralized locking and caching. When one member fails, all other members in the cluster can continue to process transactions. The only data that is unavailable is the set of pages that were being updated in flight when the member failed. And if those pages are hot, they will be in CF memory, which means the recovery of pages needed by other members will be very fast.
  19. At a high level, three things occur during an instance failure: (1) failure detection; (2) pulling pages that need to be fixed directly from CF memory; (3) fixing the pages. Each of these steps has been optimized in DB2 pureScale with the goal of getting these pages fixed and accessible again in under 20 seconds (all the while, the rest of the data in the database remains completely available).
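The three steps can be sketched as a toy recovery routine. The structure follows the list above; the names and the in-flight page set are invented for illustration:

```python
def member_recovery(inflight_pages, cf_cache):
    """Toy walkthrough of the three recovery steps for a failed member."""
    timeline = ["1: detect the failure"]
    recovered = {}
    for page in inflight_pages:
        # 2: hot pages come straight from CF memory; cold ones from disk.
        source = "CF memory" if page in cf_cache else "disk"
        recovered[page] = source
        timeline.append(f"2: fetch page {page} from {source}")
    # 3: redo/undo the in-flight changes; only these pages were blocked,
    # so the rest of the database stayed available throughout.
    timeline.append("3: redo/undo the in-flight changes")
    return recovered, timeline
```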
  20. Failure detection was a large part of the investment that went into DB2 pureScale. Software failure in a DB2 pureScale environment is architected to be caught in a fraction of a second, with recovery processing driven within that same second. Hardware failure is a more difficult challenge, but thanks to some innovative techniques, DB2 pureScale has built-in algorithms that can detect node failures in as little as 3 seconds without false failovers. When we talk about having the rows available again within 20 seconds of a failure, we mean from the time the failure occurred, not the time it was detected. Other vendors may exclude detection time to report better numbers, but to an end user this time is critical, so we include it.
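Heartbeat timeouts are the generic technique behind this kind of detection. The sketch below illustrates the idea only; DB2’s actual detection algorithms are not described in this deck, and the 3-second timeout is simply the figure quoted above:

```python
def failed_members(heartbeats, now, timeout=3.0):
    """Members whose last heartbeat is older than `timeout` seconds.

    `heartbeats` maps member name -> timestamp of its last heartbeat.
    A real implementation must also guard against false failovers
    (e.g. transient network blips), which this sketch ignores.
    """
    return [m for m, last in heartbeats.items() if now - last > timeout]
```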
  21. Here is a detailed walkthrough of what happens when a node fails (run this in slide show mode to see the steps). Note that we call this process “online failover” because transactions on other nodes are not impacted in any way (which is different from Oracle RAC, as you will see in later slides). The data that needs to be fixed will primarily be in memory on the CF, so recovery works at memory speeds. In the event of a hardware failure, we take the additional step of automatically fencing off storage access from the failed member to prevent split-brain issues.
  22. In Oracle, a similar set of steps recovers from a failed instance; however, the middle two are where things differ greatly from DB2 pureScale: (1) node failure detection; (2) global lock remastering; (3) locking pages that need recovery; (4) fixing those pages. The biggest difference is that DB2 pureScale uses centralized locking, so there is no need to remaster global locks. pureScale also does not need to find the pages to lock; it is already aware of the pages that need to be fixed.
  23. In Oracle RAC, each data page (called a data block in Oracle) is mastered by one of the instances in the cluster. Oracle employs a distributed locking mechanism, so each instance in the cluster is responsible for managing and granting lock requests for the pages it masters. In the event of a node failure, the pages mastered by the failed node become momentarily orphaned while RAC goes through a lock redistribution process to assign new ownership of these orphaned pages to the surviving nodes in the cluster. This is called Global Resource Directory (GRD) reconfiguration, and while it is occurring, any request to read a page, as well as any request to lock a page, is momentarily frozen. Applications can continue to process on the surviving nodes; however, during this time they cannot perform any I/O operations or request any new locks. As a result, many applications experience a freeze, as shown on this slide.
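The remastering step can be illustrated with a toy hash-based page-master assignment. Oracle’s actual GRD algorithm is not shown here, only the general idea that the failed instance’s pages are orphaned until every one has been reassigned to a survivor:

```python
def master_of(page_id, instances):
    # Toy master assignment: hash the page id across the live instances.
    return instances[page_id % len(instances)]

def remaster(pages, instances, failed):
    """Reassign the failed instance's pages to the surviving instances."""
    survivors = [i for i in instances if i != failed]
    # Pages mastered by the failed instance are orphaned; until each has
    # a new master, lock requests and I/O for those pages are frozen.
    orphaned = [p for p in pages if master_of(p, instances) == failed]
    return {p: master_of(p, survivors) for p in orphaned}
```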
  24. The second step in the Oracle RAC node recovery process is to lock all the data pages that need recovery. This must be done before the GRD freeze described earlier is released; if an instance were allowed to read a page from disk before the appropriate page locks were acquired, the update from the failed instance could be lost. The recovery instance performs a first-pass read of the redo log file from the failed instance and locks any pages that need recovery, as shown in Figure 2. This may require a significant number of random I/O operations, as the log file, and potentially the pages that need recovery, may not be in the memory of any surviving node. Only after all these I/O operations are performed by the recovery instance and the appropriate pages are locked is the GRD freeze lifted and the stalled applications allowed to continue. Depending on how much work the failed node was doing at the time of the failure, this process can take from tens of seconds up to as much as a minute. The GRD freeze, and the fact that no I/O operations can be performed or new lock requests granted during this period, is documented in several published books on Oracle RAC.
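The first-pass scan reduces to collecting the set of pages touched by the failed instance’s redo records. The record layout below is invented for the sketch:

```python
def first_pass_scan(redo_log):
    """Pages the failed instance modified, each of which must be locked
    before the GRD freeze can be lifted. Reading the log (and possibly
    the pages themselves) may mean many random I/O operations."""
    return {record["page"] for record in redo_log}
```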
  25. In comparison, DB2 pureScale environments require no global freeze in the cluster. The CF is aware at all times of which pages would need recovery should any member fail. If a member fails, all other members in the cluster can continue to run transactions and perform I/O operations. Only requests to access pages that need recovery are blocked while the recovery process cleans up after the failed member, as shown on this slide (and that process is likely to happen from memory).
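The contrast can be sketched as a simple request filter: only pages in the failed member’s in-flight set block, and everything else is served immediately. Names are illustrative:

```python
def serve_during_recovery(requests, pages_in_recovery):
    """Serve every page request except those touching pages under recovery.

    Because the CF already tracks the in-flight set, there is no
    cluster-wide freeze: the blocked list shrinks to empty as the
    recovery process fixes each page.
    """
    served, blocked = [], []
    for page in requests:
        (blocked if page in pages_in_recovery else served).append(page)
    return served, blocked
```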