October 25–29, 2009 • Mandalay Bay • Las Vegas, Nevada
0
DB2 Best Practices for Optimal Performance
Sunil Kamath
Senior Technical Staff Member
IBM Toronto Labs
sunil.kamath@ca.ibm.com
Agenda
Basics
– Sizing workloads
– Best Practices for Physical Design
Benchmarks
DB2 9.7 Performance Improvements
– Scan Sharing
– XML in DPF
– Statement Concentrator
– Currently Committed
– LOB Inlining
– Compression
– Index Compression
– Temp Table Compression
– XML Compression
– Range Partitioning with local indexes
Summary
Performance “Truisms”
There is always a bottleneck!
Remember the 5 fundamental bottleneck areas:
1. Application
2. CPU
3. Memory
4. Disk
5. Network
Balance is key!
2
Sizing a Configuration
Ideally one should understand:
– The application
– Load process requirements
– Number of concurrent users/jobs
– Largest tables' sizes
– Typical query scenarios
– Size of answer sets being generated
– Response time objectives for loads and queries
– Availability requirements
– …
3
Sizing “Rules of Thumb”
Platform choice
CPU
Memory
Disk
– Space
– Spindles
4
Platform Selection
DB2 is highly optimized for all major platforms
– AIX, Linux, Windows, Solaris, HP-UX
– 64 bit is strongly recommended
Much more than a performance question
– Integration with other systems
– Skills / Ease of Use
– $$$
Often more than 1 “good” choice
5
Selecting DB2 with and without Data Partitioning (InfoSphere
Warehouse)
Differences becoming smaller
– Function and manageability gaps
Data Partitioning is less common for
– OLTP,ERP,CRM
Data Partitioning is most common for
– Data Warehousing
6
Memory! How Much Do I Need?
Highly dependent on many factors
– Depends on number of users (connections)
– Depends on the query workload
– Depends on whether or not other software is sharing the machines
being measured
Advisable to allocate roughly 5% of the active data size for bufferpool sizing
New systems use 64-bit processors
– If using 32-bit Windows/Linux/DB2, just use 4GB.
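As a rough illustration of the 5% guideline (hypothetical numbers, not a sizing recommendation): a system with about 500 GB of active data would start with roughly 25 GB of buffer pool memory and be tuned from there against the actual workload.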
7
Disk! How Many GB Do I Need?
More than you think!
Don’t forget about
– Working storage
– Tempspace
– Indexes, MQT’s etc.
But big drives tend to give lots of space
– 146/300GB drives now standard
Raw data x 4 (unmirrored)*
Raw data x 5 (RAID5)*
Raw data x 8 (RAID10)*
* Assumes no compression
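For example (hypothetical figures): 1 TB of raw, uncompressed data would call for roughly 5 TB of disk on RAID5, or about 8 TB on RAID10, once working storage, tempspace and indexes are accounted for.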
Disk! How Many Spindles Do I Need?
Need to define a balanced system
– Don't want too few large disks
• Causes I/O bottleneck
Different kinds of requirements
– IOPS
• Latency
– MB/sec
• Throughput
Don’t share disks for table/indexes with logs
Don’t know how many disks in the SAN?
– Make friends with storage Admin!
9
Basic Rules of Thumb (RoT)
Meant to be approximate guidelines:
– 150-200 GB active data per core
– 50 concurrent connections per core
– 8 GB RAM per core
– 1500-2000 IOPS per core
The above guidelines work for most virtualization environments as well
These RoT are NOT meant to be a replacement or alternative
to real workload sizing
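As a hedged illustration only: roughly 1.6 TB of active data and 400 concurrent connections would, by these rules of thumb, point to something on the order of 8 cores, 64 GB of RAM and storage capable of about 12,000-16,000 IOPS; treat this as a starting point to refine with real workload sizing, not a substitute for it.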
10
Additional Considerations for Virtualized environments
Performance overhead with Hypervisor
– Varies with type of hypervisor and environment
Effect of overcommitting CPU at the "system" level
Effect of overcommitting memory at the "system" level
Effects of sharing the same disks across multiple workloads
11
Building Your Database
12
Physical Database Design
Create 1 database for each DB2 instance
Issue “create database” with
– Unicode codeset
• Default starting with DB2 9.5
– Automatic Storage
• Storage paths for tables/indexes etc
• DBPATH for log etc.
– Suitable pagesize
Example
– CREATE DB <DBNAME> AUTOMATIC STORAGE YES
ON /fs1/mdmdb, /fs2/mdmdb, /fs3/mdmdb, /fs4/mdmdb
DBPATH ON /fs0/mdmdb
USING CODESET UTF-8 TERRITORY <TERRITORY>
COLLATE USING UCA400_NO PAGESIZE 8K;
Suggestion: Make everything explicit to facilitate understanding
13
Selecting a Page Size
Use a single page size if possible
– For example, 8K or 16K
With LARGE tablespaces there is ample capacity for growth
OLTP
– Smaller page sizes may be better (e.g. 8K)
Warehouse
– Larger pages sizes often beneficial (e.g. 16K)
XML
– Use 32K page size
Choosing an appropriate pagesize should depend on the access pattern of rows (sequential vs. random)
With DB2 9.7, the tablespace limits have increased by 4x; For example, with 4K
page size, the max tablespace size is now 8 TB
14
Tablespace Design
Use automatic storage
– Significant enhancements in DB2 9.7
Use Large tablespaces
– Default since DB2 9.5
Disable file system caching via DDL as appropriate
Ensure temp tablespaces exist
– 1 for each page size, ideally just 1
Keep number of tablespaces reasonably small
– 1 for look up tables in single node nodegroup
– 1 for each fact table (largest tables)
– 1 for all others
Create separate tablespaces for indexes, LOBs
Large tablespaces further help exploit
table/index/temp compression
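For illustration only, a minimal DDL sketch of these guidelines in an automatic storage database (tablespace names and the 8K page size are hypothetical; verify the file system caching choice against your storage):
  CREATE LARGE TABLESPACE ts_fact PAGESIZE 8K NO FILE SYSTEM CACHING;
  CREATE LARGE TABLESPACE ts_index PAGESIZE 8K NO FILE SYSTEM CACHING;
  CREATE SYSTEM TEMPORARY TABLESPACE ts_temp PAGESIZE 8K;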
15
Choosing DMS vs. SMS
Goal:
– Performance of RAW
– Simplicity/usability of SMS
DMS FILE is the preferred choice
– Performance is near DMS RAW
• Especially when bypassing filesystem caching
– Ease of use/management is similar to SMS
• Can gradually extend the size
– Flexible
• Can add/drop containers
• Can separate data/index/long objects into their own table space
– Potential to transition to Automatic Storage
Automatic storage is built on top of DMS FILE
– But it automates container specification / management
16
Choosing DMS FILE vs. Automatic Storage
Goal:
– To maximize simplicity/usability
Automatic Storage is the preferred choice with DB2 9.5
– Strategic direction
• Receives bulk of development investment
– Key enabler/prerequisite for future availability/scalability
enhancements
– Performance is equivalent to DMS FILE
– Ease of use/management is superior
• No need to specify any containers
• Makes it easy to have many table spaces
– Flexible
• Can add/drop storage paths
17
Consider Schema optimizations
Decide on how to structure your data
– Consider distributing your data across nodes
• Using DPF hash-partitioning
– Consider partitioning your data by ranges
• Using table range partitioning
– Consider organizing your data
• Using MDC (multi dimensional clustering)
Auxiliary data structures
– Do the right indexes exist ?
• Clustered, clustering, include columns for unique index
– Would Materialized query tables (MQT) help?
You can feed dynamic snapshot into design advisor
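As an illustration of these options, a hedged sketch only (table and column names are invented, not from this deck):
  CREATE TABLE sales (
    cust_id    INTEGER NOT NULL,
    store_id   INTEGER NOT NULL,
    sale_month INTEGER NOT NULL,
    amount     DECIMAL(12,2)
  )
  DISTRIBUTE BY HASH (cust_id)                    -- DPF hash partitioning across database partitions
  ORGANIZE BY DIMENSIONS (store_id, sale_month);  -- MDC clustering on two dimensions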
18
Table Design
OK to have multiple tables in a tablespace
Once defined, use ALTER table to select options
– APPEND MODE - use for tables where inserts are at end of table (ALTER
TABLE ... APPEND ON)
• This also enables concurrent append points for high concurrent INSERT activity
– LOCKSIZE - use to select table level locking (ALTER TABLE ... LOCKSIZE
TABLE)
– PCTFREE - use to reserve space during load/reorg (ALTER TABLE
...PCTFREE 10)
Add pk/fk constraints after index creation
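A minimal sketch of these ALTER TABLE options (table names are hypothetical):
  ALTER TABLE orders APPEND ON;              -- new rows go at the end of the table
  ALTER TABLE country_codes LOCKSIZE TABLE;  -- table-level locking for a small lookup table
  ALTER TABLE orders PCTFREE 10;             -- reserve 10% free space during load/reorg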
19
Table Design - Compression
Compress base table data at row level
– Build a static dictionary, one per table
On-disk and in-memory image is smaller
Need to uncompress data before processing
Classic tradeoff: more CPU for less disk I/O
– Great for I/O-bound systems that have spare CPU cycles
Large, rarely referenced tables are ideal
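For example, a hedged sketch of enabling row compression on an existing table and building its static dictionary (table name hypothetical):
  ALTER TABLE order_history COMPRESS YES;
  REORG TABLE order_history RESETDICTIONARY;  -- rebuilds the table and creates the compression dictionary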
20
Index Design
In general, every table should have at least 1 index
– Ideally a unique index / primary key index
Choose appropriate options
– PCTFREE - should be 0 for read-only table
– PAGE SPLIT HIGH/LOW – for ascending inserts especially
– CLUSTER - define a clustering index
– INCLUDE columns - extra cols in unique index for index-only access
– COLLECT STATISTICS while creating an index
With DB2 9.7 indexes can be compressed too!
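A short sketch combining several of these options (names are hypothetical):
  CREATE UNIQUE INDEX ix_orders_pk ON orders (order_id)
    INCLUDE (order_date, status)     -- extra columns for index-only access
    PCTFREE 0                        -- suitable for a read-mostly table
    COLLECT SAMPLED DETAILED STATISTICS;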
21
Benchmarks
DB2 is the performance leader
TPoX
22
World Record Performance With TPC-C
[Chart: tpmC results, higher is better]
DB2 8.2 on 64-way POWER5 (64x 1.9GHz POWER5, 2 TB RAM, 6,400 disks): 3,210,540 tpmC
DB2 9.1 on 64-way POWER5+ (64x 2.3GHz POWER5+, 2 TB RAM, 6,400 disks): 4,033,378 tpmC
DB2 9.5 on 64-way POWER6 (64x 5GHz POWER6, 4 TB RAM, 10,900 disks): 6,085,166 tpmC
TPC Benchmark, TPC-C, tpmC, are trademarks of the Transaction Processing Performance Council.
• DB2 8.2 on IBM System p5 595 (64 core POWER5 1.9GHz): 3,210,540 tpmC @ $5.07/tpmC, available: May 14, 2005
• DB2 9.1 on IBM System p5 595 (64 core POWER5+ 2.3GHz): 4,033,378 tpmC @ $2.97/tpmC, available: January 22, 2007
• DB2 9.5 on IBM POWER 595 (64 core POWER6 5.0GHz): 6,085,166 tpmC @ $2.81/tpmC, available: December 10, 2008
Results current as of June 24, 2009. Check http://www.tpc.org for latest results
World Record TPC-C Performance on x64 with RedHat Linux
[Chart: tpmC results, higher is better]
DB2 9.5 on IBM x3950 M2 (Intel Xeon 7460, RHEL 5.2): 1,200,632 tpmC
SQL Server 2005 on HP DL580 G5 (Intel Xeon 7350, Windows 2003): 841,809 tpmC
TPC Benchmark, TPC-C, tpmC, are trademarks of the Transaction Processing Performance Council.
• DB2 9.5 on IBM System x3950 M2 (8 processor, 48 core Intel Xeon 7460 2.66GHz): 1,200,632 tpmC @ $1.99/tpmC, available: December 10, 2008
• SQL Server 2005 on HP DL580 G5 (8 processor, 32 core Intel Xeon 7350 2.93GHz): 841,809 tpmC @ $3.46/tpmC, available: April 1, 2008
Results current as of June 24, 2009. Check http://www.tpc.org for latest results
World record 10 TB TPC-H result on IBM Balanced Warehouse E7100
IBM System p6 570 & DB2 9.5 create top 10TB TPC-H performance
[Chart: QphH@10000GB, higher is better]
IBM p6 570 / DB2 9.5: 343,551 QphH
HP Integrity Superdome-DC Itanium / Oracle 11g: 208,457 QphH
Sun Fire 25K / Oracle 10g: 108,099 QphH
• Significant proof-point for the IBM Balanced Warehouse E7100
• DB2 Warehouse 9.5 takes DB2 performance on AIX to new levels
• 65% faster than Oracle 11g best result
• Loaded 10TB data @ 6 TB / hour (incl. data load, index creation, runstats)
TPC Benchmark, TPC-H, QphH, are trademarks of the Transaction Processing Performance Council.
• DB2 Warehouse 9.5 on IBM System p6 570 (128 core p6 4.7GHz): 343,551 QphH@10000GB, 32.89 USD per QphH@10000GB, available: April 15, 2008
• Oracle 10g Enterprise Ed R2 w/ Partitioning on HP Integrity Superdome-DC Itanium 2 (128 core Intel Dual-Core Itanium 2 9140 1.6 GHz): 208,457 QphH@10000GB, 27.97 USD per QphH@10000GB, available: September 10, 2008
• Oracle 10g Enterprise Ed R2 w/ Partitioning on Sun Fire E25K (144 core Sun UltraSPARC IV+ 1500 MHz): 108,099 QphH@10000GB, 53.80 USD per QphH@10000GB, available: January 23, 2006
Results current as of June 24, 2009. Check http://www.tpc.org for latest results
World record SAP 3-tier SD Benchmark
This benchmark represents a 3-tier SAP R/3 environment in which the database resides on its own server, where database performance is the critical factor
DB2 outperforms Oracle by 68% and SQL Server by 80%
– DB2 running on 32-way p5 595
– Oracle and SQL Server 2000 running on 64-way HP
[Chart: Top SAP SD 3-tier results by DBMS vendor, SD users, higher is better]
DB2 8.2 on 32-way p5 595: 168,300 SD users
Oracle 10g on 64-way HP Integrity: 100,000 SD users
SQL Server on 64-way HP Integrity: 93,000 SD users
Results current as of June 24, 2009. Check http://www.sap.com/benchmark for latest results
More SAP performance than any 8-socket server
Result comparable to a 32-socket 128-core Sun M9000
15,600 SAP SD 2-Tier Users on the IBM Power 750 Express with DB2 9.7 on AIX 6.1
[Chart: SAP SD 2-tier users by server. 4 sockets: 32-core Sun T5440, 24-core Opteron, 32-core Power 750 Express; 8 sockets: 48-core Opteron; 32 sockets: 128-core Sun M9000]
Results current as of March 03, 2010. Check http://www.sap.com/benchmark for latest results
Best SAP SD 2-Tier performance with SAP ERP 6 EHP4
20% more performance, 1/4 the number of cores vs. Sun M9000
37,000 SAP users on SAP SD 2-Tier with the IBM Power 780 and DB2
All results are with SAP ERP 6 EHP4
[Chart: SAP SD 2-tier users by server. 4 sockets: Sun T5440 (SPARC, 4p/32c/256t), IBM x3850 X5 (Nehalem-EX, 4p/32c/64t), Power 750; 8 sockets: Sun X4640 (Opteron, 8p/48c/48t), Fujitsu 1800E (Nehalem-EX, 8p/64c/128t), Power 780 (8p/64c/256t); 32 sockets: Sun M9000 (SPARC, 32p/128c/256t); 64 sockets: Sun M9000 (SPARC, 64p/256c/512t)]
Power 780 with DB2: #1 overall; Power 750 with DB2: #1 4-socket; System x3850 X5 with DB2: #1 4-socket Windows
IBM Power System 780, 8p / 64c / 256t, POWER7 3.8 GHz, 1024 GB memory, 37,000 SD users, dialog resp.: 0.98s, line items/hour: 4,043,670, dialog steps/hour: 12,131,000, SAPS: 202,180, DB time (dialog/update): 0.013s / 0.031s, CPU utilization: 99%, OS: AIX 6.1, DB2 9.7, cert# 2010013.
Sun M9000, 64p / 256c / 512t, SPARC64 VII 2.88 GHz, 1156 GB memory, 32,000 SD users, Solaris 10, Oracle 10g, cert# 2009046.
Results current as of April 07, 2010. Check http://www.sap.com/benchmark for latest results
First to Publish SPECjEnterprise2010 Benchmark
Multi-tier end-to-end performance benchmark for Java EE 5
Single node result: 1014.40 EjOPS
8-node cluster result: 7903.16 EjOPS
– Approx. 38,500 tx/sec, 135,000 SQL/sec
– WAS 7 on 8x HS22 Blades (Intel Xeon X5570, 2-socket/8-core)
– DB2 9.7 FP1 on x3850 M2 (Intel Xeon X7460, 4-socket/24-core), SLES 10 SP2
Result published on January 7, 2010
Results as of January 7, 2010
More Efficient performance than Ever
3,000 Infor Baan ERP 2-Tier Users on the IBM Power 750 Express using DB2 9.7
• More performance, with less space and far less energy consumption than ever
Infor ERP LN Benchmark results on P6 / P7:
                           P6        P7
System                     p 570     p 750
Processor Speed            5 GHz     3.55 GHz
No. of chips or sockets    8         2
Cores / chip               2         8
Total number of cores      16        16
Total Memory               256 GB    256 GB
AIX version                6.1       6.1
DB2 Version                9.7 GA    9.7 GA
# Infor Baan Users         2800      3000
# users / core             175       187.5
# users / chip             350       1500
Performance Improvements
DB2 9.7 has tremendous new capabilities that can
substantially improve performance
When you think about the new features …
– “It depends”
– We don’t know everything (yet)
– Your mileage will vary
– Please provide feedback!
31
DB2 Threaded Architecture
[Diagram: DB2 process/thread model. A single, multi-threaded process (db2sysc) runs per instance. Instance-level listeners (db2tcpcm, db2ipccm) accept connections over TCP/IP (remote clients) or shared memory & semaphores (local clients). Per-application coordinator agents (db2agent) and active subagents (db2agntp) are drawn from an idle agent pool. Database-level threads include prefetchers (db2pfchr), page cleaners (db2pclnr), the log writer and log reader (db2loggw, db2loggr) and the deadlock detector (db2dlock), working against the buffer pool(s), the log buffer, the data disks and the log disks.]
Performance Advantages of the Threaded Architecture
Context switching between threads is generally faster than between
processes
– No need to switch address space
– Less cache “pollution”
Operating system threads require less context than processes
– Share address space, context information (such as uid, file handle table,
etc)
– Memory savings
Significantly fewer system file descriptors used
– All threads in a process can share the same file descriptors
– No need to have each agent maintain its own file descriptor table
33
From the existing DB2 9 Deep Compression …
Reduce storage costs
Improve performance
Easy to implement
[Chart: compression advantage of DB2 9 vs. other databases: 1.5 times, 2.0 times, 3.3 times and 8.7 times better]
"With DB2 9, we're seeing compression rates up to 83% on the Data Warehouse. The projected cost savings are more than $2 million initially with ongoing savings of $500,000 a year." - Michael Henson
"We achieved a 43 per cent saving in total storage requirements when using DB2 with Deep Compression for its SAP NetWeaver BI application, when compared with the former Oracle database. The total size of the database shrank from 8TB to 4.5TB, and response times were improved by 15 per cent. Some batch applications and change runs were reduced by a factor of ten when using IBM DB2." - Markus Dellermann
Index Compression
What is Index Compression?
The ability to decrease the storage requirements of indexes through compression.
By default, if the table is
compressed the indexes created
for the table will also be
compressed.
– including the XML indexes
Index compression can be
explicitly enabled/disabled when
creating or altering an index.
Why do we need Index Compression?
Index compression reduces disk cost
and TCO (total cost of ownership)
Index compression can improve
runtime performance of queries that
are I/O bound.
When does Index Compression work
best?
– Indexes for tables declared in large RID DMS tablespaces (the default since DB2 9).
– Indexes that have low key
cardinality & high cluster ratio.
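By way of illustration, a hedged sketch of controlling index compression explicitly (index and table names are hypothetical):
  CREATE INDEX ix_hist_date ON order_history (order_date) COMPRESS YES;
  ALTER INDEX ix_hist_status COMPRESS YES;  -- existing index; takes effect after a REORG INDEXES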
35
Index Compression
How does Index Compression Work?
• DB2 will consider multiple compression algorithms to attain maximum index space savings through index compression.
[Diagram: pre-DB2 9.7 index page. Page header, a fixed slot directory with its maximum size reserved, and index keys (e.g. "AAAB, 1, CCC") each followed by a full RID list]
Index Compression
Variable Slot Directory
• In 9.7, the slot directory is dynamically adjusted in order to fit as many keys into an index page as possible.
[Diagram: DB2 9.7 index page. Page header, a variable slot directory, and the same index keys and RID lists; the space formerly reserved for the fixed, maximum-size slot directory is saved]
Index Compression
RID List Compression
• Instead of saving the full version of a RID, we can save some space by storing the delta between two RIDs.
• RID list compression is enabled when there are 3 or more RIDs in an index page.
[Diagram: DB2 9.7 index page. Each key stores the first RID in full followed by compressed RID deltas (e.g. "3011, 14, 1, 1, 2, 4, 2, 1, 1" in place of a full RID list), saving space on top of the variable slot directory]
Index Compression
Prefix Compression
• Instead of saving all key values, we can save some space by storing a common prefix and suffix records.
• During index creation or insertion, DB2 will compare the new key with adjacent index keys and find the longest common prefixes between them.
[Diagram: DB2 9.7 index page. Keys sharing the common prefixes "AAAB, 1, CC" and "BBBZ, 1, ZZ" are stored once, with short suffix records ("C", "D", "Z", "CCAAAE") and compressed RID lists]
Index Compression
[Chart: Simple index compression tests, elapsed time in seconds (lower is better) for simple select, insert, update and delete, with and without index compression]
[Chart: Machine utilization (user / system / idle / iowait) for the same select, insert, update and delete tests, base vs. index compression, on the complex query warehouse database tested]
Estimated index compression savings across seven warehouse databases (percentage of indexes compressed): 16%, 20%, 24%, 31%, 50%, 55% and 57%; average 36%.
Results in a Nutshell
• Index compression uses idle CPU cycles and idle cycles spent waiting for I/O to compress & decompress index data.
• When we are not CPU bound, we are able to achieve better performance in all inserts, deletes and updates: they run roughly 16-19% faster with index compression, while selects run as fast as before.
Temp Table Compression
What is Temp Table Compression?
The ability to decrease storage
requirements by compressing temp
table data
Temp tables created as a result of
the following operations are
compressed by default:
– Temps from Sorts
– Created Global Temp Tables
– Declared Global Temp Tables
– Table queues (TQ)
Why do we need Temp Table
Compression on relational
databases?
Temp table spaces can account
for up to 1/3 of the overall
tablespace storage in some
database environments.
Temp compression reduces disk
cost and TCO (total cost of
ownership)
41
Temp Table Compression
How does Temp Table Compression Work?
– It extends the existing row-level compression mechanism that currently applies to permanent tables, into temp tables.
String of data across a row:
Canada|Ontario|Toronto|Matthew
Canada|Ontario|Toronto|Mark
USA|Illinois|Chicago|Luke
USA|Illinois|Chicago|John
Create dictionary from sample data (Lempel-Ziv algorithm):
0x12f0 – CanadaOntarioToronto …
0xe57a – Matthew …
0xff0a – Mark …
0x15ab – USAIllinoisChicago …
0xdb0a – Luke …
0x544d – John …
Saved data (compressed):
0x12f0,0xe57a
0x12f0,0xff0a
0x15ab,0xdb0a
0x15ab,0x544d
Temp Table Compression
[Chart: Query workload CPU analysis (user / sys / idle / iowait), baseline vs. temp compression; temp compression converts idle and iowait cycles into effective CPU usage]
Space savings for complex warehouse queries with temp compression (lower is better):
Without temp compression: 78.3 GB stored
With temp compression: 50.2 GB stored (saves 35% space)
Elapsed time for complex warehouse queries with temp compression (lower is better):
Without temp compression: 183.98 minutes
With temp compression: 175.56 minutes (5% faster)
Results in a Nutshell
For affected temp compression enabled complex queries, an average of 35% temp tablespace space savings was observed. For the 100GB warehouse database setup, this sums up to over 28GB of saved temp space.
XML Data Compression
What is XML Data Compression?
The ability to decrease the storage
requirements of XML data through
compression.
XML Compression extends row
compression support to the XML
documents.
If row compression is enabled for the table, the XML data will also be compressed. If row compression is not enabled, the XML data will not be compressed either.
Why do we need XML Data
Compression?
Compressing XML data can improve
storage efficiency and runtime
performance of queries that are I/O
bound.
XML compression reduces disk cost and
TCO (total cost of ownership) for
databases with XML data
44
XML Data Compression
How does XML Data Compression Work?
– Small XML documents (< 32KB) can be inlined with any relational data in the row and the entire row is compressed.
  • Available since DB2 9.5
– Larger XML documents that reside in a data area separate from relational data can also be compressed. By default, DB2 places XML data in the XDA to handle documents up to 2GB in size.
– XML compression relies on a separate dictionary than the one used for row compression.
[Diagram: an uncompressed row holds relational data, inlined XML data (< 32KB) and XML data of 32KB-2GB stored in the XDA; in the compressed form, the row with its inlined XML is compressed using dictionary #1 and the separate XDA documents using dictionary #2]
XML Data Compression
XML compression savings across seven XML customer databases (percentage compressed): 43%, 61%, 63%, 63%, 74%, 77% and 77%; average 67%.
Results in a Nutshell
Significantly improved query performance for I/O-bound workloads.
Achieved 30% faster maintenance operations such as RUNSTATS, index creation, and import.
Average compression savings of ⅔ across 7 different XML customer databases and about ¾ space savings for 3 of those 7 databases.
Average elapsed time for SQL/XML and XQuery queries over an XML and relational data database using XDA compression (lower is better):
Without XML compression: 31.1 sec
With XML compression: 19.7 sec (37% faster)
Range Partitioning with Local Indexes
47
What does Range Partitioning
with Local Indexes mean?
– A partitioned index is an index
which is divided up across
multiple storage objects, one per
data partition, and is partitioned in
the same manner as the table
data
– Local Indexes can be created
using the PARTITIONED
keyword when creating an index
on a partitioned table (Note:
MDC block indexes are
partitioned by default)
Why do we need Range
Partitioning with local Indexes?
– Improved ATTACH and DETACH
partition operations
– More efficient access plans
– More efficient REORGs.
When does Range Partitioning with
Local Indexes work best?
– When frequent roll-in and roll-out of data is performed
– When one tablespace is defined per
range.
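A minimal sketch of a range-partitioned table with a partitioned (local) index (names and ranges are hypothetical):
  CREATE TABLE sales (
    sale_date DATE NOT NULL,
    amount    DECIMAL(12,2)
  )
  PARTITION BY RANGE (sale_date)
    (STARTING '2009-01-01' ENDING '2009-12-31' EVERY 3 MONTHS);
  CREATE INDEX ix_sales_date ON sales (sale_date) PARTITIONED;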
Index size comparison, leaf page count (lower is better):
Global index on RP table: 18,409 leaf pages
Local index on RP table: 13,476 leaf pages (25% space savings)
Results in a Nutshell
Partition maintenance with ATTACH:
– 20x speedup compared to DB2 9.5 global indexes because of reduced index maintenance.
– 3000x less log space used than with DB2 9.5 global indexes.
Asynchronous index maintenance on DETACH is eliminated.
Local indexes occupy fewer disk pages than 9.5 global indexes.
– 25% space savings is typical.
– 12% query speedup over global indexes for index queries – fewer page reads.
Range Partitioning with Local Indexes
[Chart: Total time and log space required to ATTACH 1.2 million rows (lower is better), comparing V9.5 global indexes, V9.7 local indexes built during ATTACH, V9.7 local indexes built before ATTACH, and a no-indexes baseline; log space drops from 651.84 MB with V9.5 global indexes to a fraction of a megabyte (0.03 to 0.21 MB) in the other cases]
Scan Sharing
What is Scan Sharing?
It is the ability of one scan to exploit the work done by another scan. This feature targets heavy scans such as table scans or MDC block index scans of large tables.
Scan Sharing is enabled by default
on DB2 9.7
Why do we need Scan Sharing?
Improved concurrency
Faster query response times
Increased throughput
When does Scan Sharing work
best?
Scan Sharing works best on
workloads that involve several
clients running similar queries
(simple or complex), which involve
the same heavy scanning
mechanism (table scans or MDC
block index scans).
49
Scan Sharing
How does Scan Sharing work?
– When applying scan sharing, scans
may start somewhere other than the
usual beginning, to take advantage of
pages that are already in the buffer
pool from scans that are already
running.
– When a sharing scan reaches the end
of file, it will start over at the beginning
and finish when it reaches the point
that it started.
– Eligibility for scan sharing and for
wrapping are determined
automatically in the SQL compiler.
– In DB2 9.7, scan sharing is supported
for table scans and block index
scans.
[Diagram: unshared vs. shared scans. Without sharing, scans A and B each read all pages independently, re-reading pages and causing extra I/O; with a shared scan, A and B read the same pages together as the scan progresses.]
Scan Sharing
Block Index Scan Test: Q1 and Q6 Interleaved
[Chart: Q1 (CPU intensive) and Q6 (I/O intensive) started staggered every 10 seconds over a 0-600 second timeline, shown with and without scan sharing; lower is better]
• MDC Block Index Scan Sharing shows 47% average query improvement gain.
• The fastest query shows up to 56% runtime gain with scan sharing.
Scan Sharing Tests on Table Scan (average of running 100 instances of Q1, in seconds; lower is better):
No scan sharing: 1,284.6
Scan sharing: 90.3 (runs 14x faster!)
• 100 concurrent table scans now run 14 times faster with scan sharing!
Scan Sharing
Complex queries per hour throughput for a 10GB warehouse database, 16 parallel streams (higher is better):
Scan sharing OFF: 381.92
Scan sharing ON: 636.43 (67% throughput improvement)
Results in a Nutshell
When running 16 concurrent streams of complex queries in parallel, a 67% increase in throughput is attained when using scan sharing.
Scan sharing works fully on UR and CS isolation and, by design, has limited applicability on RR and RS isolation levels.
XML Scalability on Infosphere Warehouse (a.k.a DPF)
What does it mean?
Tables containing XML
column definitions can now
be stored and distributed on
any partition.
XML data processing is
optimized based on their
partitions.
Why do we need XML in database partitioned environments?
As customers adopt the XML datatype in their warehouses, XML data
needs to scale just as relational data
XML data also achieves the same benefit from performance
improvements attained from the parallelization in DPF environments.
53
XML Scalability on Infosphere Warehouse (a.k.a DPF)
[Chart: Simple query elapsed-time speedup from 4 to 8 partitions (ratio of 4-partition to 8-partition elapsed time) for count with index, count without index, grouped aggregation, update, collocated join and non-collocated join, each in "rel", "xml" and "xmlrel" variants]
[Chart: Complex query elapsed-time speedup from 4 to 8 partitions for queries 1-10, "rel", "xml" and "xmlrel" variants]
Results in a Nutshell
The results show the elapsed time performance speedup of complex queries from a 4 partition setup to an 8 partition setup. Queries tested have a similar star-schema balance for relational and XML.
Each query was run in 2 or 3 equivalent variants:
– Completely relational ("rel")
– Completely XML ("xml")
– XML extraction/predicates with relational joins ("xmlrel") (join queries only)
Queries/updates/deletes scale as well as relational ones.
Average XML query speedup is 96% of relational.
Statement Concentrator
Why do we need the statement
concentrator?
This feature is aimed at OLTP workloads
where simple statements are repeatedly
generated with different literal values. In
these workloads, the cost of recompiling
the statements many times adds a
significant overhead.
Statement concentrator avoids this
compilation overhead by allowing the
compiled statement to be reused,
regardless of the values of the literals.
What is the statement
concentrator?
It is a technology that allows dynamic SQL statements that are identical, except for the values of their literals, to share the same access plan.
The statement concentrator
is disabled by default, and
can be enabled either
through the database
configuration parameter
(STMT_CONC) or from the
prepare attribute
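For example, a hedged sketch of enabling it database-wide (database name is hypothetical):
  UPDATE DB CFG FOR mydb USING STMT_CONC LITERALS;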
55
Statement Concentrator
Effect of the Statement Concentrator on prepare times for 20,000 statements using 20 users (lower is better):
Concentrator off: 436 sec
Concentrator on: 23 sec (19x reduction in prepare time!)
Effect of the Statement Concentrator for an OLTP workload (throughput, higher is better):
Concentrator off: 133
Concentrator on: 180 (35% throughput improvement!)
Results in a Nutshell
The statement concentrator allows prepare time to run up to 25x faster for a single user and 19x faster for 20 users.
The statement concentrator improved throughput by 35% in a typical OLTP workload using 25 users.
Currently Committed
What is Currently Committed?
Currently Committed semantics have been introduced in DB2 9.7 to improve concurrency: under Cursor Stability (CS) isolation, readers are no longer blocked waiting for writers to release row locks.
The readers are given the last committed version of data, that is, the version prior to the start of a write operation.
Currently Committed is
controlled with the
CUR_COMMIT database
configuration parameter
Why do we need the Currently
Committed feature?
Customers running high
throughput database applications
cannot tolerate waiting on locks
during transaction processing and
require non-blocking behavior for
read transactions.
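For example, a hedged sketch of the database configuration switch (database name is hypothetical):
  UPDATE DB CFG FOR mydb USING CUR_COMMIT ON;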
57
Currently Committed
Throughput of OLTP workload using Currently Committed (transactions per second, higher is better):
Currently Committed disabled: 981.25
Currently Committed enabled: 1,260.89 (allows 28% more throughput)
CPU analysis on Currently Committed (user / system / idle / iowait):
CC disabled: 45.0% user, 12.9% system, 33.5% idle, 8.7% iowait
CC enabled: 58.9% user, 17.2% system, 5.0% idle, 19.0% iowait (more effective CPU usage)
Results in a Nutshell
By enabling currently committed, we use CPU that was previously idle (18%), leading to an increase of over 28% in throughput.
With currently committed enabled, we see reduced LOCK WAIT time by nearly 20%.
We observe expected increases in LSN GAP cleaners and increased logging.
LOB Inlining
Why do we need the LOB Inlining
feature?
Performance will increase for queries
that access inlined LOB data as no
additional I/O is required to fetch the
LOB data.
LOBs are prime candidates for compression given their size and the type of data they represent. By inlining LOBs, this data becomes eligible for compression, allowing further space and I/O savings from this feature.
What is LOB Inlining?
LOB inlining allows customers to store LOB data within a formatted data row in a data page instead of creating a separate LOB object. Once the LOB data is inlined into the base table row, it is then eligible to be compressed.
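A minimal sketch of inlining a LOB column so that it becomes eligible for row compression (names and the inline length are hypothetical):
  CREATE TABLE documents (
    doc_id INTEGER NOT NULL PRIMARY KEY,
    note   CLOB(1M) INLINE LENGTH 1000
  ) COMPRESS YES;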
59
LOB Inlining
[Chart: Inlined LOB vs. non-inlined LOB; percentage improvement for insert, select and update performance at 8K, 16K and 32K LOB sizes (higher is better); insert and select improve by roughly 55-75%, update by roughly 7-30%]
Results in a Nutshell
INSERT and SELECT operations are the ones with more benefit. The smaller the LOB, the bigger the benefit of the inlining.
For UPDATE operations, the larger the LOB, the better the improvements.
We can expect that inlined LOBs will have the same performance as a VARCHAR(N+4).
Summary of Key DB2 9.7 Performance Features
Compression for indexes, temp tablespaces and XML data results in space savings and better performance
Range Partitioning with local indexes results in space savings and better
performance including increased concurrency for certain operations like
REORG and set integrity. It also makes roll-in and roll-out of data more
efficient.
Scan Sharing improves workloads that have multiple heavy scans in the
same table.
XML Scalability allows customers to exploit the same benefits in data
warehouses as they exist for relational data
Statement Concentrator improves the performance of queries that use literals, reducing their prepare times
Currently Committed increases throughput and reduces the contention on
locks
LOB Inlining allows this type of data to be eligible for compression
61
A glimpse at the Future
Expect more leadership benchmark results on POWER7 and Nehalem EX
Preparing for new workloads
– Combined OLTP and Analytics
Preparing for new operating environments
– Virtualization
– Cloud
– Power-aware
Preparing for new hardware
– SSD storage
– POWER7
– Nehalem EX
62
Conclusion
DB2 is the performance benchmark leader
New features in DB2 9.7 that further boost performance
– For BOTH the OLTP and Data warehouse areas
Performance is a critical and integral part of DB2!
– Maintaining excellent performance
• On current hardware
• Over the course of DB2 maintenance
– Preparing for future hardware/OS technology
63
Appendix – Mandatory SAP publication data
Required SAP Information
For more information regarding these results and SAP benchmarks, visit www.sap.com/benchmark.
These benchmarks fully comply with the SAP Benchmark Council regulations and have been audited and certified by SAP AG
SAP 3-tier SD Benchmark:
168,300 SD benchmark users. SAP R/3 4.7. 3-tier with database server: IBM eServer p5 Model 595, 32-way SMP, POWER5 1.9 GHz, 32 KB(D) + 64 KB(I)
L1 cache per processor, 1.92 MB L2 cache and 36 MB L3 cache per 2 processors. DB2 v8.2.2, AIX 5.3 (cert # 2005021)
100,000 SD benchmark users. SAP R/3 4.7. 3-tier with database server: HP Integrity Model SD64A, 64-way SMP, Intel Itanium 2 1.6 GHz, 32 KB L1 cache,
256 KB L2 cache, 9 MB L3 cache. Oracle 10g, HP-UX11i (cert # 2004068)
93,000 SD benchmark users. SAP R/3 4.7. 3-tier with database server: HP Integrity Superdome 64P Server, 64-way SMP, Intel Itanium 2 1.6 GHz, 32 KB L1
cache, 256 KB L2 cache, 9 MB L3 cache . SQL Server 2005, Windows 2003 (cert # 2005045)
SAP 3-tier BW Benchmark:
311,004 query navigation steps/hour throughput. SAP BW 3.5. Cluster of 32 servers, each with IBM x346 Model 884041U, 1 processor / 1 core / 2 threads, Intel XEON 3.6 GHz, L1 Execution Trace Cache, 2 MB L2 cache, 2 GB main memory. DB2 8.2.3, SLES 9. (cert # 2005043)
SAP TRBK Benchmark:
15,519,000. Day processing no. of postings to bank accounts/hour. SAP Deposit Management 4.0. IBM System p570, 4 core, POWER6, 64GB RAM. DB2 9
on AIX 5.3. (cert # 2007050)
10,012,000 Day processing no. of postings to bank accounts/hour. SAP Account Management 3.0. Sun Fire E6900, 16 core, UltraSPARC IV, 56GB RAM, Oracle 10g on Solaris 10. (cert # 2006018)
8,279,000 Day processing no. of postings to bank accounts/hour/ SAP Account Management 3.0. HP rx8620, 16 core, HP mx2 DC,64 GB RAM, SQL Server
on Windows Server (cert # 2005052)
SD 2-tier SD Benchmark:
39,100 SD benchmark users, SAP ECC 6.0. Sun SPARC Enterprise Server M9000, 64 processors / 256 cores / 512 threads, SPARC64 VII, 2.52 GHz, 64
KB(D) + 64 KB(I) L1 cache per core, 6 MB L2 cache per processor, 1024 GB main memory, Oracle 10g on Solaris 10. (cert # 2008-042-1)
35,400 SD benchmark users, SAP ECC 6.0. IBM Power 595, 32 processors / 64 cores / 128 threads, POWER6 5.0 GHz, 128 KB L1 cache and 4 MB L2
cache per core, 32 MB L3 cache per processor, 512 GB main memory. DB2 9.5, AIX 6.1. (Cert# 2008019).
30,000 SD benchmark users. SAP ECC 6.0. HP Integrity SD64B , 64 processors/128 cores/256 threads, Dual-Core Intel Itanium 2 9050 1.6 GHz, 32 KB(I) +
32 KB(D) L1 cache, 2 MB(I) + 512 KB(D) L2 cache, 24 MB L3 cache, 512 GB main memory. Oracle 10g on HP-UX 11iV3. (cert # 2006089)
23,456 SD benchmark users. SAP ECC 5.0. Central server: IBM System p5 Model 595, 64-way SMP, POWER5+ 2.3GHz, 32 KB(D) + 64 KB(I) L1 cache per
processor, 1.92 MB L2 cache and 36 MB L3 cache per 2 processors. DB2 9, AIX 5.3 (cert # 2006045)
20,000 SD benchmark users. SAP ECC 4.7. IBM eServer p5 Model 595, 64-way SMP, POWER5, 1.9 GHz, 32 KB(D) + 64 KB(I) L1 cache per processor, 1.92
MB L2 cache and 36 MB L3 cache per 2 processors, 512 GB main memory. (cert # 2004062)
These benchmarks fully comply with SAP Benchmark Council's issued benchmark regulations and have been audited and certified by SAP. For more
information, see http://www.sap.com/benchmark
64

Contenu connexe

Tendances

Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
Christalin Nelson
 

Tendances (20)

Best Practices For Optimizing DB2 Performance Final
Best Practices For Optimizing DB2 Performance FinalBest Practices For Optimizing DB2 Performance Final
Best Practices For Optimizing DB2 Performance Final
 
Db2 recovery IDUG EMEA 2013
Db2 recovery IDUG EMEA 2013Db2 recovery IDUG EMEA 2013
Db2 recovery IDUG EMEA 2013
 
DB2 TABLESPACES
DB2 TABLESPACESDB2 TABLESPACES
DB2 TABLESPACES
 
DB2 Interview Questions - Part 1
DB2 Interview Questions - Part 1DB2 Interview Questions - Part 1
DB2 Interview Questions - Part 1
 
DB2 LUW Auditing
DB2 LUW AuditingDB2 LUW Auditing
DB2 LUW Auditing
 
Db2 and storage management (mullins)
Db2 and storage management (mullins)Db2 and storage management (mullins)
Db2 and storage management (mullins)
 
A DBA’s guide to using TSA
A DBA’s guide to using TSAA DBA’s guide to using TSA
A DBA’s guide to using TSA
 
Z4R: Intro to Storage and DFSMS for z/OS
Z4R: Intro to Storage and DFSMS for z/OSZ4R: Intro to Storage and DFSMS for z/OS
Z4R: Intro to Storage and DFSMS for z/OS
 
DB2 10 & 11 for z/OS System Performance Monitoring and Optimisation
DB2 10 & 11 for z/OS System Performance Monitoring and OptimisationDB2 10 & 11 for z/OS System Performance Monitoring and Optimisation
DB2 10 & 11 for z/OS System Performance Monitoring and Optimisation
 
DB2 Security Model
DB2 Security ModelDB2 Security Model
DB2 Security Model
 
Best practices for DB2 for z/OS log based recovery
Best practices for DB2 for z/OS log based recoveryBest practices for DB2 for z/OS log based recovery
Best practices for DB2 for z/OS log based recovery
 
Table Partitioning in SQL Server: A Magic Solution for Better Performance? (P...
Table Partitioning in SQL Server: A Magic Solution for Better Performance? (P...Table Partitioning in SQL Server: A Magic Solution for Better Performance? (P...
Table Partitioning in SQL Server: A Magic Solution for Better Performance? (P...
 
Introduction of ISPF
Introduction of ISPFIntroduction of ISPF
Introduction of ISPF
 
Db2
Db2Db2
Db2
 
TSO Productivity
TSO ProductivityTSO Productivity
TSO Productivity
 
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
 
DB2 on Mainframe
DB2 on MainframeDB2 on Mainframe
DB2 on Mainframe
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
DB2UDB_the_Basics
DB2UDB_the_BasicsDB2UDB_the_Basics
DB2UDB_the_Basics
 
IBM DB2 LUW UDB DBA Training by www.etraining.guru
IBM DB2 LUW UDB DBA Training by www.etraining.guruIBM DB2 LUW UDB DBA Training by www.etraining.guru
IBM DB2 LUW UDB DBA Training by www.etraining.guru
 

En vedette

Ibm db2 interview questions and answers
Ibm db2 interview questions and answersIbm db2 interview questions and answers
Ibm db2 interview questions and answers
Sweta Singh
 

En vedette (12)

Novinky v PostgreSQL 9.4 a JSONB
Novinky v PostgreSQL 9.4 a JSONBNovinky v PostgreSQL 9.4 a JSONB
Novinky v PostgreSQL 9.4 a JSONB
 
Practical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM iPractical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM i
 
Online Training in IBM DB2 LUW/UDB DBA in Hyderabad
Online Training in IBM DB2 LUW/UDB DBA in HyderabadOnline Training in IBM DB2 LUW/UDB DBA in Hyderabad
Online Training in IBM DB2 LUW/UDB DBA in Hyderabad
 
online training for IBM DB2 LUW UDB DBA
online training for IBM DB2 LUW UDB DBAonline training for IBM DB2 LUW UDB DBA
online training for IBM DB2 LUW UDB DBA
 
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
PostgreSQL na EXT4, XFS, BTRFS a ZFS / FOSDEM PgDay 2016
 
Ibm db2
Ibm db2Ibm db2
Ibm db2
 
PostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSPostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFS
 
Ibm db2
Ibm db2Ibm db2
Ibm db2
 
PostgreSQL performance improvements in 9.5 and 9.6
PostgreSQL performance improvements in 9.5 and 9.6PostgreSQL performance improvements in 9.5 and 9.6
PostgreSQL performance improvements in 9.5 and 9.6
 
PostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability ImprovementsPostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability Improvements
 
Couchbase Performance Benchmarking
Couchbase Performance BenchmarkingCouchbase Performance Benchmarking
Couchbase Performance Benchmarking
 
Ibm db2 interview questions and answers
Ibm db2 interview questions and answersIbm db2 interview questions and answers
Ibm db2 interview questions and answers
 

Similaire à Presentation db2 best practices for optimal performance

Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, Whiptail
Internet World
 
We4IT lcty 2013 - infra-man - domino run faster
We4IT lcty 2013 - infra-man - domino run faster We4IT lcty 2013 - infra-man - domino run faster
We4IT lcty 2013 - infra-man - domino run faster
We4IT Group
 

Similaire à Presentation db2 best practices for optimal performance (20)

Presentation db2 best practices for optimal performance
Presentation   db2 best practices for optimal performancePresentation   db2 best practices for optimal performance
Presentation db2 best practices for optimal performance
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
 
Gluster for Geeks: Performance Tuning Tips & Tricks
Gluster for Geeks: Performance Tuning Tips & TricksGluster for Geeks: Performance Tuning Tips & Tricks
Gluster for Geeks: Performance Tuning Tips & Tricks
 
Dba tuning
Dba tuningDba tuning
Dba tuning
 
Linux Huge Pages
Linux Huge PagesLinux Huge Pages
Linux Huge Pages
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 TechnologyAdd Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
 
505 kobal exadata
505 kobal exadata505 kobal exadata
505 kobal exadata
 
Storage in hadoop
Storage in hadoopStorage in hadoop
Storage in hadoop
 
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
 
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
 
Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016
 
Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, Whiptail
 
Tuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy WorkloadTuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy Workload
 
We4IT lcty 2013 - infra-man - domino run faster
We4IT lcty 2013 - infra-man - domino run faster We4IT lcty 2013 - infra-man - domino run faster
We4IT lcty 2013 - infra-man - domino run faster
 
45 ways to speed up firebird database
45 ways to speed up firebird database45 ways to speed up firebird database
45 ways to speed up firebird database
 

Plus de solarisyougood

Plus de solarisyougood (20)

Emc vipr srm workshop
Emc vipr srm workshopEmc vipr srm workshop
Emc vipr srm workshop
 
Emc recoverpoint technical
Emc recoverpoint technicalEmc recoverpoint technical
Emc recoverpoint technical
 
Emc vmax3 technical deep workshop
Emc vmax3 technical deep workshopEmc vmax3 technical deep workshop
Emc vmax3 technical deep workshop
 
EMC Atmos for service providers
EMC Atmos for service providersEMC Atmos for service providers
EMC Atmos for service providers
 
Cisco prime network 4.1 technical overview
Cisco prime network 4.1 technical overviewCisco prime network 4.1 technical overview
Cisco prime network 4.1 technical overview
 
Designing your xen desktop 7.5 environment with training guide
Designing your xen desktop 7.5 environment with training guideDesigning your xen desktop 7.5 environment with training guide
Designing your xen desktop 7.5 environment with training guide
 
Ibm aix technical deep dive workshop advanced administration and problem dete...
Ibm aix technical deep dive workshop advanced administration and problem dete...Ibm aix technical deep dive workshop advanced administration and problem dete...
Ibm aix technical deep dive workshop advanced administration and problem dete...
 
Ibm power ha v7 technical deep dive workshop
Ibm power ha v7 technical deep dive workshopIbm power ha v7 technical deep dive workshop
Ibm power ha v7 technical deep dive workshop
 
Power8 hardware technical deep dive workshop
Power8 hardware technical deep dive workshopPower8 hardware technical deep dive workshop
Power8 hardware technical deep dive workshop
 
Power systems virtualization with power kvm
Power systems virtualization with power kvmPower systems virtualization with power kvm
Power systems virtualization with power kvm
 
Power vc for powervm deep dive tips &amp; tricks
Power vc for powervm deep dive tips &amp; tricksPower vc for powervm deep dive tips &amp; tricks
Power vc for powervm deep dive tips &amp; tricks
 
Emc data domain technical deep dive workshop
Emc data domain  technical deep dive workshopEmc data domain  technical deep dive workshop
Emc data domain technical deep dive workshop
 
Ibm flash system v9000 technical deep dive workshop
Ibm flash system v9000 technical deep dive workshopIbm flash system v9000 technical deep dive workshop
Ibm flash system v9000 technical deep dive workshop
 
Emc vnx2 technical deep dive workshop
Emc vnx2 technical deep dive workshopEmc vnx2 technical deep dive workshop
Emc vnx2 technical deep dive workshop
 
Emc isilon technical deep dive workshop
Emc isilon technical deep dive workshopEmc isilon technical deep dive workshop
Emc isilon technical deep dive workshop
 
Emc ecs 2 technical deep dive workshop
Emc ecs 2 technical deep dive workshopEmc ecs 2 technical deep dive workshop
Emc ecs 2 technical deep dive workshop
 
Emc vplex deep dive
Emc vplex deep diveEmc vplex deep dive
Emc vplex deep dive
 
Cisco mds 9148 s training workshop
Cisco mds 9148 s training workshopCisco mds 9148 s training workshop
Cisco mds 9148 s training workshop
 
Cisco cloud computing deploying openstack
Cisco cloud computing deploying openstackCisco cloud computing deploying openstack
Cisco cloud computing deploying openstack
 
Se training storage grid webscale technical overview
Se training   storage grid webscale technical overviewSe training   storage grid webscale technical overview
Se training storage grid webscale technical overview
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Dernier (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
  • 15. Physical Database Design Create 1 database for each DB2 instance Issue “create database” with – Unicode codeset • Default starting with DB2 9.5 – Automatic Storage • Storage paths for tables/indexes etc • DBPATH for log etc. – Suitable pagesize Example – CREATE DB <DBNAME> AUTOMATIC STORAGE YES ON /fs1/mdmdb, /fs2/mdmdb, /fs3/mdmdb, /fs4/mdmdb DBPATH on /fs0/mdmdb USING CODESET UTF-8 TERRITORY <TERRITORY> COLLATE USING UCA400_NO PAGESIZE 8K; Suggestion: Make everything explicit to facilitate understanding 13
  • 16. Selecting a Page Size Use a single page size if possible – For example, 8K or 16K With LARGE tablespaces there is ample capacity for growth OLTP – Smaller page sizes may be better (e.g. 8K) Warehouse – Larger page sizes are often beneficial (e.g. 16K) XML – Use 32K page size Choosing an appropriate page size should depend on the access pattern of rows (sequential vs. random) With DB2 9.7, the tablespace limits have increased by 4x; for example, with a 4K page size, the max tablespace size is now 8 TB 14
  • 17. Tablespace Design Use automatic storage – Significant enhancements in DB2 9.7 Use Large tablespaces – Default since DB2 9.5 Disable file system caching via DDL as appropriate Ensure temp tablespaces exist – 1 for each page size, ideally just 1 Keep number of tablespaces reasonably small – 1 for lookup tables in a single-node nodegroup – 1 for each fact table (largest tables) – 1 for all others Create separate tablespaces for indexes, LOBs Large tablespaces further help exploit table/index/temp compression (see the example below) 15
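As an illustration of these recommendations, here is a minimal sketch using invented tablespace names (ts_data8k, ts_ix8k, ts_temp8k are not from the slides) on an automatic storage database created with an 8K default page size:

  CREATE LARGE TABLESPACE ts_data8k PAGESIZE 8K
    MANAGED BY AUTOMATIC STORAGE
    NO FILE SYSTEM CACHING;                -- disable file system caching via DDL
  CREATE LARGE TABLESPACE ts_ix8k PAGESIZE 8K
    MANAGED BY AUTOMATIC STORAGE;          -- separate tablespace for indexes
  CREATE SYSTEM TEMPORARY TABLESPACE ts_temp8k PAGESIZE 8K;  -- one temp tablespace per page size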
  • 18. Choosing DMS vs. SMS Goal: – Performance of RAW – Simplicity/usability of SMS DMS FILE is the preferred choice – Performance is near DMS RAW • Especially when bypassing filesystem caching – Ease of use/management is similar to SMS • Can gradually extend the size – Flexible • Can add/drop containers • Can separate data/index/long objects into their own table space – Potential to transition to Automatic Storage Automatic storage is built on top of DMS FILE – But it automates container specification / management 16
  • 19. Choosing DMS FILE vs. Automatic Storage Goal: – To maximize simplicity/usability Automatic Storage is the preferred choice with DB2 9.5 – Strategic direction • Receives bulk of development investment – Key enabler/prerequisite for future availability/scalability enhancements – Performance is equivalent to DMS FILE – Ease of use/management is superior • No need to specify any containers • Makes it easy to have many table spaces – Flexible • Can add/drop storage paths 17
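A minimal sketch of the flexibility point above, continuing the storage-path naming of the earlier CREATE DB example (the added paths are illustrative):

  -- add capacity by adding storage paths to the automatic storage database
  ALTER DATABASE ADD STORAGE ON '/fs5/mdmdb', '/fs6/mdmdb';
  -- with automatic storage, new table spaces need no container definitions at all
  CREATE TABLESPACE ts_sales;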
  • 20. Consider Schema optimizations Decide on how to structure your data – Consider distributing your data across nodes • Using DPF hash-partitioning – Consider partitioning your data by ranges • Using table range partitioning – Consider organizing your data • Using MDC (multi dimensional clustering) Auxiliary data structures – Do the right indexes exist ? • Clustered, clustering, include columns for unique index – Would Materialized query tables (MQT) help? You can feed a dynamic SQL snapshot into the Design Advisor (see the sketch below) 18
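A sketch of how these options combine on a single DDL statement, using an invented sales table (all names and ranges are illustrative, not from the slides):

  CREATE TABLE sales (
    sale_date  DATE           NOT NULL,
    store_id   INTEGER        NOT NULL,
    region     INTEGER        NOT NULL,
    amount     DECIMAL(12,2) )
    DISTRIBUTE BY HASH (store_id)            -- DPF hash partitioning across database partitions
    PARTITION BY RANGE (sale_date)           -- table (range) partitioning
      (STARTING '2009-01-01' ENDING '2009-12-31' EVERY 1 MONTH)
    ORGANIZE BY DIMENSIONS (region);         -- MDC clustering

To feed captured dynamic SQL into the Design Advisor, a command along the lines of db2advis -d <dbname> -g -m MICP asks it to recommend MQTs, indexes, MDC and partitioning based on the dynamic statement cache.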
  • 21. Table Design OK to have multiple tables in a tablespace Once defined, use ALTER table to select options – APPEND MODE - use for tables where inserts are at end of table (ALTER TABLE ... APPEND ON) • This also enables concurrent append points for high concurrent INSERT activity – LOCKSIZE - use to select table level locking (ALTER TABLE ... LOCKSIZE TABLE) – PCTFREE - use to reserve space during load/reorg (ALTER TABLE ...PCTFREE 10) Add pk/fk constraints after index creation 19
  • 22. Table Design - Compression Compress base table data at row level – Build a static dictionary, one per table On-disk and in-memory image is smaller Need to uncompress data before processing Classic tradeoff: more CPU for less disk I/O – Great for IO-bound systems that have spare CPU cycles Large, rarely referenced tables are ideal 20
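A sketch of turning this on for an existing table (the table name is illustrative); a REORG with RESETDICTIONARY builds the static, table-level dictionary and compresses the existing rows:

  ALTER TABLE sales_history COMPRESS YES;
  REORG TABLE sales_history RESETDICTIONARY;   -- build the compression dictionary (one per table)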
  • 23. Index Design In general, every table should have at least 1 index – Ideally a unique index / primary key index Choose appropriate options – PCTFREE - should be 0 for read-only table – PAGE SPLIT HIGH/LOW – for ascending inserts especially – CLUSTER - define a clustering index – INCLUDE columns - extra cols in unique index for index-only access – COLLECT STATISTICS while creating an index With DB2 9.7 indexes can be compressed too! 21
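A sketch that pulls several of these options into one statement (names are illustrative, and the combination is for illustration only):

  CREATE UNIQUE INDEX ix_orders_pk ON orders (order_id)
    INCLUDE (order_date)        -- extra column for index-only access
    CLUSTER                     -- clustering index
    PCTFREE 0                   -- appropriate for a read-only / read-mostly table
    COLLECT STATISTICS;         -- gather statistics while the index is built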
  • 24. Benchmarks DB2 is the performance leader TPoX 22
  • 25. World Record Performance With TPC-C
– DB2 8.2 on IBM System p5 595 (64-core POWER5 1.9 GHz, 2 TB RAM, 6,400 disks): 3,210,540 tpmC @ $5.07/tpmC, available May 14, 2005
– DB2 9.1 on IBM System p5 595 (64-core POWER5+ 2.3 GHz, 2 TB RAM, 6,400 disks): 4,033,378 tpmC @ $2.97/tpmC, available January 22, 2007
– DB2 9.5 on IBM POWER 595 (64-core POWER6 5.0 GHz, 4 TB RAM, 10,900 disks): 6,085,166 tpmC @ $2.81/tpmC, available December 10, 2008
• Higher is better. TPC Benchmark, TPC-C, tpmC are trademarks of the Transaction Processing Performance Council. Results current as of June 24, 2009; check http://www.tpc.org for latest results 23
  • 26. World Record TPC-C Performance on x64 with Red Hat Linux
– DB2 9.5 on IBM System x3950 M2 (8 processors / 48 cores, Intel Xeon 7460 2.66 GHz, RHEL 5.2): 1,200,632 tpmC @ $1.99/tpmC, available December 10, 2008
– SQL Server 2005 on HP DL580G5 (8 processors / 32 cores, Intel Xeon 7350 2.93 GHz, Windows 2003): 841,809 tpmC @ $3.46/tpmC, available April 1, 2008
• Higher is better. TPC Benchmark, TPC-C, tpmC are trademarks of the Transaction Processing Performance Council. Results current as of June 24, 2009; check http://www.tpc.org for latest results 24
  • 27. World record 10 TB TPC-H result on IBM Balanced Warehouse E7100
IBM System p6 570 and DB2 9.5 create the top 10 TB TPC-H performance
– Significant proof point for the IBM Balanced Warehouse E7100
– DB2 Warehouse 9.5 takes DB2 performance on AIX to new levels
– 65% faster than the best Oracle 11g result
– Loaded 10 TB of data @ 6 TB/hour (incl. data load, index creation, runstats)
Results (higher is better):
– DB2 Warehouse 9.5 on IBM System p6 570 (128-core POWER6 4.7 GHz): 343,551 QphH@10000GB, 32.89 USD per QphH@10000GB, available April 15, 2008
– Oracle 10g Enterprise Ed. R2 w/ Partitioning on HP Integrity Superdome-DC (128-core Dual-Core Intel Itanium 2 9140 1.6 GHz): 208,457 QphH@10000GB, 27.97 USD per QphH@10000GB, available September 10, 2008
– Oracle 10g Enterprise Ed. R2 w/ Partitioning on Sun Fire E25K (144-core UltraSPARC IV+ 1.5 GHz): 108,099 QphH@10000GB, 53.80 USD per QphH@10000GB, available January 23, 2006
TPC Benchmark, TPC-H, QphH are trademarks of the Transaction Processing Performance Council. Results current as of June 24, 2009; check http://www.tpc.org for latest results 25
  • 28. World record SAP 3-tier SD Benchmark
This benchmark represents a 3-tier SAP R/3 environment in which the database resides on its own server, where database performance is the critical factor
DB2 outperforms Oracle by 68% and SQL Server by 80%
– DB2 8.2 on a 32-way p5 595: 168,300 SD users
– Oracle 10g on a 64-way HP Integrity: 100,000 SD users
– SQL Server on a 64-way HP Integrity: 93,000 SD users
• Higher is better. Results current as of June 24, 2009; check http://www.sap.com/benchmark for latest results 26
  • 29. More SAP performance than any 8-socket server
15,600 SAP SD 2-Tier users on the IBM Power 750 Express (32-core) with DB2 9.7 on AIX 6.1
– More SAP performance than any 8-socket server
– Result comparable to a 32-socket, 128-core Sun M9000
[Chart compares 4-socket servers (32-core Sun T5440, 24-core Opteron, 32-core Power 750), 8-socket servers (48-core Opteron) and the 32-socket, 128-core Sun M9000]
Results current as of March 03, 2010; check http://www.sap.com/benchmark for latest results 27
  • 30. Best SAP SD 2-Tier performance with SAP ERP 6 EHP4
20% more performance with 1/4 the number of cores vs. the Sun M9000
– 37,000 SAP SD users: Power 780 with DB2 – #1 overall
– Power 750 with DB2 – #1 4-socket
– System x3850 X5 with DB2 – #1 4-socket Windows
All results are with SAP ERP 6 EHP4
IBM Power System 780, 8p/64c/256t, POWER7 3.8 GHz, 1024 GB memory, 37,000 SD users, dialog resp.: 0.98s, line items/hour: 4,043,670, dialog steps/hour: 12,131,000, SAPS: 202,180, DB time (dialog/update): 0.013s / 0.031s, CPU utilization: 99%, OS: AIX 6.1, DB2 9.7, cert# 2010013
Sun M9000, 64p/256c/512t, SPARC64 VII 2.88 GHz, 1156 GB memory, 32,000 SD users, Solaris 10, Oracle 10g, cert# 2009046
Results current as of April 07, 2010; check http://www.sap.com/benchmark for latest results 28
  • 31. First to Publish SPECjEnterprise2010
Multi-tier, end-to-end performance benchmark for Java EE 5
Single-node result: 1014.40 EjOPS
8-node cluster result: 7903.16 EjOPS
– Approx. 38,500 tx/sec, 135,000 SQL/sec
– WAS 7 on 8x HS22 blades (Intel Xeon X5570, 2-socket/8-core)
– DB2 9.7 FP1 on x3850 M2 (Intel Xeon X7460, 4-socket/24-core), SLES 10 SP2
Result published on January 7, 2010. Results as of January 7, 2010 29
  • 32. More efficient performance than ever
3,000 Infor Baan ERP 2-Tier users on the IBM Power 750 Express using DB2 9.7 – more performance, with less space and far less energy consumption than ever
Infor ERP LN benchmark results on POWER6 / POWER7:
                            POWER6 (System p 570)   POWER7 (Power 750)
  Processor speed           5 GHz                   3.55 GHz
  No. of chips or sockets   8                       2
  Cores per chip            2                       8
  Total number of cores     16                      16
  Total memory              256 GB                  256 GB
  AIX version               6.1                     6.1
  DB2 version               9.7 GA                  9.7 GA
  # Infor Baan users        2,800                   3,000
  # users / core            175                     187.5
  # users / chip            350                     1,500
30
  • 33. Performance Improvements DB2 9.7 has tremendous new capabilities that can substantially improve performance When you think about the new features … – “It depends” – We don’t know everything (yet) – Your mileage will vary – Please provide feedback! 31
  • 34. DB2 Threaded Architecture
[Architecture diagram: DB2 runs as a single, multi-threaded process (db2sysc) per instance. Remote clients connect over TCP/IP and local clients over shared memory and semaphores to the listeners (db2tcpcm, db2ipccm); each application is served by a coordinator agent (db2agent) and, where applicable, active subagents (db2agntp), with an idle agent pool for reuse. Per-database engine threads include prefetchers (db2pfchr), page cleaners (db2pclnr), loggers (db2loggr, db2loggw) and the deadlock detector (db2dlock), operating on the buffer pool(s), log buffer, data disks and log disks] 32
  • 35. Performance Advantages of the Threaded Architecture Context switching between threads is generally faster than between processes – No need to switch address space – Less cache “pollution” Operating system threads require less context than processes – Share address space, context information (such as uid, file handle table, etc) – Memory savings Significantly fewer system file descriptors used – All threads in a process can share the same file descriptors – No need to have each agent maintain its own file descriptor table 33
  • 36. From the existing DB2 9 Deep Compression …
Reduce storage costs – Improve performance – Easy to implement
[Chart: DB2 9 compression vs. other databases – 1.5x, 3.3x, 2.0x and 8.7x better]
"With DB2 9, we're seeing compression rates up to 83% on the Data Warehouse. The projected cost savings are more than $2 million initially, with ongoing savings of $500,000 a year." – Michael Henson
"We achieved a 43 per cent saving in total storage requirements when using DB2 with Deep Compression for its SAP NetWeaver BI application, when compared with the former Oracle database. The total size of the database shrank from 8 TB to 4.5 TB, and response times improved by 15 per cent. Some batch applications and change runs were reduced by a factor of ten when using IBM DB2." – Markus Dellermann 34
  • 37. Index Compression What is Index Compression? The ability to decrease the storage requirements of indexes through compression. By default, if the table is compressed, the indexes created for the table will also be compressed – including the XML indexes. Index compression can be explicitly enabled/disabled when creating or altering an index. Why do we need Index Compression? Index compression reduces disk cost and TCO (total cost of ownership). Index compression can improve the runtime performance of queries that are I/O bound. When does Index Compression work best? – Indexes for tables in large RID DMS tablespaces (the default since DB2 9). – Indexes that have low key cardinality & a high cluster ratio. 35
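A sketch of the explicit enable/disable mentioned above (index and table names are illustrative):

  CREATE INDEX ix_cust_name ON customer (last_name) COMPRESS YES;
  ALTER INDEX ix_cust_name COMPRESS NO;    -- explicitly disable compression
  ALTER INDEX ix_cust_name COMPRESS YES;   -- re-enable; a REORG applies it to existing pages
  REORG INDEXES ALL FOR TABLE customer;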
  • 38. Index Compression – How does Index Compression work?
DB2 will consider multiple compression algorithms to attain maximum index space savings through index compression
[Diagram: a pre-DB2 9.7 index page, with a page header, a fixed slot directory (maximum size reserved), and index keys each followed by their full RID lists] 36
  • 39. Index Compression – Variable slot directory
In DB2 9.7, the slot directory is dynamically adjusted in order to fit as many keys into an index page as possible
[Diagram: a DB2 9.7 index page where the variable slot directory frees the space previously reserved by the fixed slot directory] 37
  • 40. Index Compression – RID list compression
Instead of saving the full version of each RID, DB2 saves space by storing the delta between two successive RIDs
RID list compression is enabled when there are 3 or more RIDs in an index page
[Diagram: a DB2 9.7 index page where each RID list stores the first RID followed by compressed RID deltas, e.g. 3011, 14, 1, 1, 2, 4, 2, 1, 1] 38
  • 41. Index Compression – Prefix compression
Instead of saving all key values, DB2 saves space by storing a common prefix plus suffix records
During index creation or insertion, DB2 compares the new key with adjacent index keys and finds the longest common prefix between them
[Diagram: a DB2 9.7 index page combining the variable slot directory, prefix compression of keys, and RID list compression] 39
  • 42. Index Compression – Results in a nutshell
Index compression uses idle CPU cycles and cycles spent waiting for I/O to compress and decompress index data
When not CPU bound, better performance is achieved in all inserts, deletes and updates
[Charts: simple select, insert, update and delete tests against a complex-query warehouse database run as fast or roughly 16–19% faster with index compression (lower elapsed time is better); estimated index compression savings across 7 warehouses range from 16% to 57%, averaging 36%] 40
  • 43. Temp Table Compression What is Temp Table Compression? The ability to decrease storage requirements by compressing temp table data Temp tables created as a result of the following operations are compressed by default: – Temps from Sorts – Created Global Temp Tables – Declared Global Temp Tables – Table queues (TQ) Why do we need Temp Table Compression on relational databases? Temp table spaces can account for up to 1/3 of the overall tablespace storage in some database environments. Temp compression reduces disk cost and TCO (total cost of ownership) 41
  • 44. Temp Table Compression – How does Temp Table Compression work?
It extends the existing row-level compression mechanism that currently applies to permanent tables to temp tables (Lempel-Ziv algorithm)
Example: from rows such as Canada|Ontario|Toronto|Matthew, Canada|Ontario|Toronto|Mark, USA|Illinois|Chicago|Luke and USA|Illinois|Chicago|John, a dictionary is created from sample data (e.g. 0x12f0 – CanadaOntarioToronto, 0xe57a – Matthew, 0xff0a – Mark, 0x15ab – USAIllinoisChicago, 0xdb0a – Luke, 0x544d – John) and the rows are then stored compressed as symbol pairs (0x12f0,0xe57a; 0x12f0,0xff0a; 0x15ab,0xdb0a; 0x15ab,0x544d) 42
  • 45. Temp Table Compression – Results in a nutshell
For affected complex queries with temp compression enabled, an average of 35% temp tablespace space savings was observed; for the 100 GB warehouse database setup, this adds up to over 28 GB of saved temp space
[Charts: total temp bytes stored drop from 78.3 GB to 50.2 GB (35% space savings); elapsed time for the complex warehouse queries drops from 183.98 to 175.56 minutes (5% faster); CPU analysis shows more effective CPU usage with temp compression – lower is better] 43
  • 46. XML Data Compression What is XML Data Compression? The ability to decrease the storage requirements of XML data through compression. XML Compression extends row compression support to the XML documents. If row compression is enabled for the table, the XML data will be also compressed. If row compression is not enabled, the XML data will not be compressed either. Why do we need XML Data Compression? Compressing XML data can improve storage efficiency and runtime performance of queries that are I/O bound. XML compression reduces disk cost and TCO (total cost of ownership) for databases with XML data 44
  • 47. XML Data Compression – How does XML Data Compression work?
– Small XML documents (< 32 KB) can be inlined with the relational data in the row, and the entire row is compressed • Available since DB2 9.5
– Larger XML documents that reside in a data area separate from the relational data can also be compressed; by default, DB2 places XML data in the XDA to handle documents up to 2 GB in size
– XML compression relies on a separate dictionary from the one used for row compression
[Diagram: uncompressed rows with inlined (< 32 KB) and XDA-resident (32 KB – 2 GB) XML data vs. compressed rows, each area using its own dictionary (Dictionary #1, Dictionary #2)] 45
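A sketch (the table name is illustrative): because XML compression rides on row compression in DB2 9.7, enabling COMPRESS YES on a table with an XML column also compresses the XDA data:

  CREATE TABLE orders_xml (
    id   INTEGER NOT NULL,
    doc  XML )
    COMPRESS YES;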
  • 48. XML Data Compression – Results in a nutshell
Significantly improved query performance for I/O-bound workloads
Achieved 30% faster maintenance operations such as RUNSTATS, index creation, and import
Average compression savings of ⅔ across 7 different XML customer databases, and about ¾ space savings for 3 of those 7 databases
[Charts: XML compression savings across the 7 test databases range from 43% to 77% (average 67%; higher is better); average elapsed time for SQL/XML and XQuery queries over an XML and relational database using XDA compression drops from 31.1 to 19.7 seconds – 37% faster; lower is better] 46
  • 49. Range Partitioning with Local Indexes What does Range Partitioning with Local Indexes mean? – A partitioned index is an index which is divided up across multiple storage objects, one per data partition, and is partitioned in the same manner as the table data – Local indexes can be created using the PARTITIONED keyword when creating an index on a partitioned table (Note: MDC block indexes are partitioned by default) Why do we need Range Partitioning with Local Indexes? – Improved ATTACH and DETACH partition operations – More efficient access plans – More efficient REORGs When does Range Partitioning with Local Indexes work best? – When frequent roll-in and roll-out of data is performed – When one tablespace is defined per range. 47
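A sketch of a local (partitioned) index and a roll-in, reusing the illustrative sales table from the earlier schema-optimization example (partition boundaries and table names are invented):

  CREATE INDEX ix_sales_amount ON sales (amount) PARTITIONED;   -- one index partition per data partition

  -- roll-in: attach a pre-loaded table as a new range, then validate it
  ALTER TABLE sales ATTACH PARTITION
    STARTING FROM ('2010-01-01') ENDING ('2010-01-31')
    FROM sales_jan2010;
  SET INTEGRITY FOR sales IMMEDIATE CHECKED;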
  • 50. Range Partitioning with Local Indexes – Results in a nutshell
Partition maintenance with ATTACH:
– 20x speedup compared to DB2 9.5 global indexes because of reduced index maintenance
– 3000x less log space used than with DB2 9.5 global indexes
Asynchronous index maintenance on DETACH is eliminated
Local indexes occupy fewer disk pages than 9.5 global indexes:
– 25% space savings is typical (e.g. 18,409 leaf pages for a global index vs. 13,476 for a local index on the same range-partitioned table)
– 12% query speedup over global indexes for index queries – fewer page reads
[Chart: total time and log space required to ATTACH 1.2 million rows – 651.84 MB of log space with V9.5 global indexes vs. well under 1 MB with V9.7 local indexes; lower is better] 48
  • 51. Scan Sharing What is Scan Sharing? It is the ability of one scan to exploit the work done by another scan This feature targets heavy scans such as table scans or MDC block index scans of large tables. Scan Sharing is enabled by default on DB2 9.7 Why do we need Scan Sharing? Improved concurrency Faster query response times Increased throughput When does Scan Sharing work best? Scan Sharing works best on workloads that involve several clients running similar queries (simple or complex), which involve the same heavy scanning mechanism (table scans or MDC block index scans). 49
  • 52. Scan Sharing – How does Scan Sharing work?
– When applying scan sharing, scans may start somewhere other than the usual beginning, to take advantage of pages that are already in the buffer pool from scans that are already running
– When a sharing scan reaches the end of file, it will start over at the beginning and finish when it reaches the point that it started
– Eligibility for scan sharing and for wrapping is determined automatically in the SQL compiler
– In DB2 9.7, scan sharing is supported for table scans and block index scans
[Diagram: with unshared scans, scan B re-reads pages behind scan A, causing extra I/O; with a shared scan, A and B read the same pages together and B wraps around to pick up the pages it missed] 50
  • 53. Scan Sharing – Block index scan and table scan tests
MDC block index scan sharing shows a 47% average query improvement
The fastest query shows up to a 56% runtime gain with scan sharing
100 concurrent table scans now run 14 times faster with scan sharing (average time for 100 instances of Q1 drops from 1,284.6 to 90.3 seconds; lower is better)
[Charts: with Q1 (CPU-intensive) and Q6 (I/O-intensive) interleaved and staggered every 10 seconds, queries complete much sooner with scan sharing than without] 51
  • 54. Scan Sharing – Results in a nutshell
When running 16 concurrent streams of complex queries in parallel, a 67% increase in throughput is attained when using scan sharing (complex queries per hour for a 10 GB warehouse database rise from 381.92 with scan sharing OFF to 636.43 with scan sharing ON; higher is better)
Scan sharing works fully on UR and CS isolation and, by design, has limited applicability on RR and RS isolation levels 52
  • 55. XML Scalability on Infosphere Warehouse (a.k.a DPF) What does it mean? Tables containing XML column definitions can now be stored and distributed on any partition. XML data processing is optimized based on their partitions. Why do we need XML in database partitioned environments? As customers adopt the XML datatype in their warehouses, XML data needs to scale just as relational data XML data also achieves the same benefit from performance improvements attained from the parallelization in DPF environments. 53
  • 56. XML Scalability on InfoSphere Warehouse (a.k.a. DPF) – Results in a nutshell
The charts show the elapsed-time speedup of simple and complex queries when going from a 4-partition setup to an 8-partition setup
Queries tested have a similar star-schema balance for relational and XML data
Each query is run in 2 or 3 equivalent variants:
– Completely relational ("rel")
– Completely XML ("xml")
– XML extraction/predicates with relational joins ("xmlrel") (join queries only)
XML queries/updates/deletes scale as well as relational ones; average XML query speedup is 96% of relational 54
  • 57. Statement Concentrator Why do we need the statement concentrator? This feature is aimed at OLTP workloads where simple statements are repeatedly generated with different literal values. In these workloads, the cost of recompiling the statements many times adds a significant overhead. Statement concentrator avoids this compilation overhead by allowing the compiled statement to be reused, regardless of the values of the literals. What is the statement concentrator? It is a technology that allows dynamic SQL statements that are identical, except for the value of its literals, to share the same access plan. The statement concentrator is disabled by default, and can be enabled either through the database configuration parameter (STMT_CONC) or from the prepare attribute 55
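A sketch of enabling the concentrator at the database level (the database name is illustrative); the default is OFF:

  UPDATE DB CFG FOR mydb USING STMT_CONC LITERALS;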
  • 58. Statement Concentrator – Results in a nutshell
The statement concentrator allows prepare time to run up to 25x faster for a single user and 19x faster for 20 users (prepare time for 20,000 statements with 20 users drops from 436 to 23 seconds; lower is better)
The statement concentrator improved throughput by 35% in a typical OLTP workload using 25 users (from 133 to 180; higher is better) 56
  • 59. Currently Committed What is Currently Committed? Currently Committed semantics have been introduced in DB2 9.7 to improve concurrency where readers are not blocked by writers to release row locks when using Cursor Stability (CS) isolation. The readers are given the last committed version of data, that is, the version prior to the start of a write operation. Currently Committed is controlled with the CUR_COMMIT database configuration parameter Why do we need the Currently Committed feature? Customers running high throughput database applications cannot tolerate waiting on locks during transaction processing and require non-blocking behavior for read transactions. 57
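A sketch of setting the database configuration parameter explicitly (the database name is illustrative); newly created DB2 9.7 databases have it on by default, while databases upgraded from earlier releases may need it enabled:

  UPDATE DB CFG FOR mydb USING CUR_COMMIT ON;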
  • 60. Currently Committed – Results in a nutshell
By enabling currently committed, CPU that was previously idle (18%) is put to use, leading to an increase of over 28% in throughput (OLTP workload throughput rises from 981.25 to 1,260.89 transactions per second; higher is better)
With currently committed enabled, LOCK WAIT time is reduced by nearly 20%; the expected increases in LSN GAP cleaners and in logging are observed 58
  • 61. LOB Inlining Why do we need the LOB Inlining feature? Performance will increase for queries that access inlined LOB data, as no additional I/O is required to fetch the LOB data. LOBs are prime candidates for compression given their size and the type of data they represent. By inlining LOBs, this data becomes eligible for compression, allowing further space and I/O savings from this feature. What is LOB Inlining? LOB inlining allows customers to store LOB data within the formatted data row in a data page instead of creating a separate LOB object. Once the LOB data is inlined into the base table row, it is eligible to be compressed. 59
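A sketch (names and sizes are illustrative): LOB data up to the inline length is stored in the formatted row and thereby becomes eligible for row compression; an existing column can also be altered, with already-stored LOBs inlined on later update or reorg:

  CREATE TABLE documents (
    id    INTEGER NOT NULL PRIMARY KEY,
    note  CLOB(1M) INLINE LENGTH 1000 )
    COMPRESS YES;

  ALTER TABLE documents ALTER COLUMN note SET INLINE LENGTH 2000;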
  • 62. LOB Inlining – Results in a nutshell
INSERT and SELECT operations benefit the most from inlining
The smaller the LOB, the bigger the benefit of inlining
For UPDATE operations, the larger the LOB, the better the improvement
Inlined LOBs can be expected to perform the same as a VARCHAR(N+4)
[Chart: percentage improvement of inlined vs. non-inlined LOBs for insert, select and update across 8 KB, 16 KB and 32 KB LOB sizes; higher is better] 60
  • 63. Summary of Key DB2 9.7 Performance Features Compression for indexes, temp tablespaces and XML data results in space savings and better performance Range Partitioning with local indexes results in space savings and better performance, including increased concurrency for certain operations like REORG and SET INTEGRITY; it also makes roll-in and roll-out of data more efficient Scan Sharing improves workloads that have multiple heavy scans on the same table XML Scalability allows customers to exploit the same benefits in data warehouses for XML data as exist for relational data Statement Concentrator improves the performance of queries that use literals, reducing their prepare times Currently Committed increases throughput and reduces contention on locks LOB Inlining allows this type of data to be eligible for compression 61
  • 64. A glimpse at the Future Expect more leadership benchmark results on POWER7 and Nehalem EX Preparing for new workloads – Combined OLTP and Analytics Preparing for new operating environments – Virtualization – Cloud – Power-aware Preparing for new hardware – SSD storage – POWER7 – Nehalem EX 62
  • 65. Conclusion DB2 is the performance benchmark leader New features in DB2 9.7 that further boost performance – For BOTH the OLTP and Data warehouse areas Performance is a critical and integral part of DB2! – Maintaining excellent performance • On current hardware • Over the course of DB2 maintenance – Preparing for future hardware/OS technology 63
  • 66. Appendix – Mandatory SAP publication data Required SAP Information For more information regarding these results and SAP benchmarks, visit www.sap.com/benchmark. These benchmark fully complies with the SAP Benchmark Council regulations and has been audited and certified by SAP AG SAP 3-tier SD Benchmark: 168,300 SD benchmark users. SAP R/3 4.7. 3-tier with database server: IBM eServer p5 Model 595, 32-way SMP, POWER5 1.9 GHz, 32 KB(D) + 64 KB(I) L1 cache per processor, 1.92 MB L2 cache and 36 MB L3 cache per 2 processors. DB2 v8.2.2, AIX 5.3 (cert # 2005021) 100,000 SD benchmark users. SAP R/3 4.7. 3-tier with database server: HP Integrity Model SD64A, 64-way SMP, Intel Itanium 2 1.6 GHz, 32 KB L1 cache, 256 KB L2 cache, 9 MB L3 cache. Oracle 10g, HP-UX11i (cert # 2004068) 93,000 SD benchmark users. SAP R/3 4.7. 3-tier with database server: HP Integrity Superdome 64P Server, 64-way SMP, Intel Itanium 2 1.6 GHz, 32 KB L1 cache, 256 KB L2 cache, 9 MB L3 cache . SQL Server 2005, Windows 2003 (cert # 2005045) SAP 3-tier BW Benchmark: 311,004 throughput./hour query navigation steps.. SAP BW 3.5. Cluster of 32 servers, each with IBM x346 Model 884041U, 1 processor/ 1 core/ 2 threads, Intel XEON 3.6 GHz, L1 Execution Trace Cache, 2 MB L2 cache, 2 GB main memory. DB2 8.2.3 SLES 9. (cert # 2005043) SAP TRBK Benchmark: 15,519,000. Day processing no. of postings to bank accounts/hour. SAP Deposit Management 4.0. IBM System p570, 4 core, POWER6, 64GB RAM. DB2 9 on AIX 5.3. (cert # 2007050) 10,012,000 Day processing no. of postings to bank accounts/hour. SAP Account Management 3.0. Sun Fire E6900, 16 core, UltraSPARC1V, 56GB RAM, Oracle 10g on Solaris 10, (cert # 2006018) 8,279,000 Day processing no. of postings to bank accounts/hour/ SAP Account Management 3.0. HP rx8620, 16 core, HP mx2 DC,64 GB RAM, SQL Server on Windows Server (cert # 2005052) SD 2-tier SD Benchmark: 39,100 SD benchmark users, SAP ECC 6.0. Sun SPARC Enterprise Server M9000, 64 processors / 256 cores / 512 threads, SPARC64 VII, 2.52 GHz, 64 KB(D) + 64 KB(I) L1 cache per core, 6 MB L2 cache per processor, 1024 GB main memory, Oracle 10g on Solaris 10. (cert # 2008-042-1) 35,400 SD benchmark users, SAP ECC 6.0. IBM Power 595, 32 processors / 64 cores / 128 threads, POWER6 5.0 GHz, 128 KB L1 cache and 4 MB L2 cache per core, 32 MB L3 cache per processor, 512 GB main memory. DB2 9.5, AIX 6.1. (Cert# 2008019). 30,000 SD benchmark users. SAP ECC 6.0. HP Integrity SD64B , 64 processors/128 cores/256 threads, Dual-Core Intel Itanium 2 9050 1.6 GHz, 32 KB(I) + 32 KB(D) L1 cache, 2 MB(I) + 512 KB(D) L2 cache, 24 MB L3 cache, 512 GB main memory. Oracle 10g on HP-UX 11iV3. (cert # 2006089) 23,456 SD benchmark users. SAP ECC 5.0. Central server: IBM System p5 Model 595, 64-way SMP, POWER5+ 2.3GHz, 32 KB(D) + 64 KB(I) L1 cache per processor, 1.92 MB L2 cache and 36 MB L3 cache per 2 processors. DB2 9, AIX 5.3 (cert # 2006045) 20,000 SD benchmark users. SAP ECC 4.7. IBM eServer p5 Model 595, 64-way SMP, POWER5, 1.9 GHz, 32 KB(D) + 64 KB(I) L1 cache per processor, 1.92 MB L2 cache and 36 MB L3 cache per 2 processors, 512 GB main memory. (cert # 2004062) These benchmarks fully comply with SAP Benchmark Council's issued benchmark regulations and have been audited and certified by SAP. For more information, see http://www.sap.com/benchmark 64