2. Vizuri – an Operating Division of AEM
Applied Engineering Management (AEM) Corporation Founded in 1986 as a
100% woman-owned business
More than 25 years of profitable growth
Headquartered in Chantilly, VA with offices located in major metropolitan
areas including Los Angeles, San Antonio, and Jacksonville
Diversified client base including Fortune 500 and major government
agencies
Industry-recognized awards and certifications for performance, capability,
and delivery
3. Chris Bradham
•Lead Architect, Data Services
•Oracle DBA experience 1997 to present (Oracle 7 to 11.2)
•Data Guard, Replication, Materialized Views, GoldenGate, Exadata, RAC
•Part-time Instructor George Mason University (OCA/OCP)
•Oracle Certified Exadata Implementation Specialist, Oracle Certified
Professional (11g), Performance Tuning Certified (11g), ITIL Foundation,
Security+
cbradham@vizuri.com
4. What’s being covered?
•Technology Refresh
•Legacy Environment / Options
•Exadata Components
•Security Considerations
•Migration Considerations
•Results of Migration
•Lessons Learned
•References
•Q & A
5. Background Information
Global multi-service DoD Web-based Housing application
Over 300 schemas
750 GB of data
4,300 Active Users
4.2 million annual logins
4,500 Reports Generated Per Day
AEM Corporation responsible for Hosting / Operations & Maintenance / Technology Refresh
7. Pre-Tech Refresh Issues
Legacy hardware over six years old
•Patches (5 nodes, slower machines)
•Deployments, data updates time consuming
•Large or complex reports often hang
•Node evictions due to network / disk speed issues
•Oracle 10.2.0.4 Support ended 7/31/11
Data Warehouse delay due to performance requirements
(Oracle Streams attempt)
8. Alternative 1 : Based on Legacy Solution
Virtualized application servers
Network bonding
8 Gbps backbone
EMC Disk Array
5 Database Servers
Oracle 11gR2 RAC install
9. Alternative 2 : Based on Exadata Solution
Virtualized application servers
Network bonding
40 Gbps backbone
Oracle Storage Servers
2 node Quarter Rack
Oracle 11gR2 RAC preconfigured
Surprise, we chose Exadata!
10. X2-2 Quarter Rack Specifications
•2 Xeon-based Dual-processor Database Servers (Sun Fire
X4170 M2)
• 24 cores (12 per server)
• 192 GB memory expandable to 288 GB (96 GB per
server expandable to 144 GB)
• 10 GigE connectivity to Data Center
• 4 x 10GbE ports (2 per server)
•1.1 TB High Speed Flash
•3 Exadata Storage Servers X2-2
• All with High Performance 600GB disks
OR
• All with High Capacity 3 TB disks
•2 Sun Datacenter InfiniBand Switch 36
• 36-port Managed QDR (40Gb/s) switch
•1 “Admin” Cisco Ethernet switch
•Keyboard, Video, Mouse (KVM) hardware
•Redundant Power Distribution Units (PDUs)
Can upgrade to a Half Rack or just add storage
11. Exadata Selection Points
•Licensing fees made Exadata the low cost solution
•Total database hardware solution
•Sizable and expandable
•Oracle vested to help DoD succeed
•Patch Strategy
•Storage Indexes / Smart Scan / Smart Flash Cache
12. Throughput GB/Second
[Bar chart: throughput in GB/second by configuration]
•2 Gbps Fibre Channel x2: 0.4
•4 Gbps Fibre Channel x2: 0.8
•8 Gbps Fibre Channel x2: 1.6
•Exadata 1/4 - Disk: 5.4
•Exadata 1/2 - Disk: 12.5
•Exadata Full - Disk: 25
•Exadata 1/4 - Disk & Flash: 16
•Exadata 1/2 - Disk & Flash: 37
•Exadata Full - Disk & Flash: 75
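The Fibre Channel figures in the chart (0.4, 0.8, 1.6) can be reproduced with simple arithmetic. A minimal sketch, assuming the nominal Fibre Channel payload rate of about 100 MB/s per 1 Gbps of link speed and two links per host; the function name is illustrative only:

```python
# Assumption: nominal Fibre Channel payload rate is about 100 MB/s per
# 1 Gbps of link speed (8b/10b encoding sends 10 line bits per 8 data bits).
MB_PER_SEC_PER_GBPS = 100


def fc_throughput_gb_per_sec(link_gbps: int, links: int = 2) -> float:
    """Approximate aggregate payload throughput in GB/s (decimal units)."""
    return link_gbps * MB_PER_SEC_PER_GBPS * links / 1000


for speed in (2, 4, 8):
    print(f"{speed} Gbps Fibre Channel x2: {fc_throughput_gb_per_sec(speed)} GB/s")
```

Even the fastest Fibre Channel option on the chart stays well below the 5.4 GB/second shown for a disk-only Exadata quarter rack.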
14. Tech Refresh Challenges
•100% hardware replacement and Data Center move
•Data Center staff responsiveness
•Narrow window for outage to avoid negative impact on end users
•Performance of system, database growth, and network bandwidth
•Exadata unproven in DoD space at the time (Security)
•Upgrading Database versions (data/code/reports)
Lots of change, what if issues surface???
15. Smart Flash Cache Considerations
Helps with…
•Write-Through cache avoids caching data that will not be reused
•Holds hot data, much faster than disk (small, random I/O)
•Data not duplicated from cache in other Storage Servers
•Reduces latency of log writes by simultaneous write to flash / disk
(faster writes) with minimal space (512 MB)
•Write-Back cache 11.2.0.3.9
Don’t touch except for…
•ALTER TABLE <table_name> STORAGE (CELL_FLASH_CACHE KEEP);
•Create Flash Disks out of the Flash Cache
•Reassign portion for TEMP tablespace on index builds
16. Database Node Considerations
•Database Consolidation
•SGA Settings
•AMM Bad! ASMM Good! (set minimum values)
•Where’s the shared storage space?
•DBFS is the answer (fix_control=8, ac_timeout=60 and SGA=2 GB)
•Is everything setup correctly?
•Exachk is the answer
•Indexes / Hints / Compression
•Huge Pages (reduce overhead)
•Large Segments <- 8 MB Initial / Next Size with Autoallocate
•TEMP <- BIGFILE, Autoextend 1 GB, Uniform 1 MB
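The Huge Pages bullet comes down to a sizing calculation: the huge page pool must be large enough to hold every SGA on the node. This is also why ASMM is preferred here, since AMM cannot use HugePages on Linux. A rough sketch, assuming 2 MB huge pages (the usual Linux x86-64 default) and a hypothetical node with one 16 GB SGA plus the 2 GB DBFS SGA; the function and the 16 GB figure are illustrative, not values from the migration:

```python
import math

HUGE_PAGE_MB = 2  # assumption: 2 MB huge pages (Linux x86-64 default)


def huge_pages_needed(sga_sizes_gb, headroom_pages=0):
    """Number of huge pages needed to hold all SGAs on one database node."""
    total_mb = sum(gb * 1024 for gb in sga_sizes_gb)
    return math.ceil(total_mb / HUGE_PAGE_MB) + headroom_pages


# Hypothetical consolidated node: a 16 GB SGA plus the 2 GB DBFS SGA.
pages = huge_pages_needed([16, 2])
print(f"vm.nr_hugepages = {pages}")
```

In practice the value should be checked against the running instances, because an SGA that does not fit entirely in the pool may fall back to regular pages.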
17. Exadata Patch Management
Multiple Patches
•Infiniband (once per year)
•DB Nodes / Storage Server (quarterly)
•Bundle Patch (BP) DB Software (quarterly)
•Additional components (Ethernet switch, KVM, PDU)
Bug Fixes included so important to apply
Proceed with caution: one-off patches
Rolling option time a consideration
18. Security
•DoD 8570 Requirements
•Security Technical Implementation Guide (STIG)
  o Oracle installation not customizable
  o DBFS and idle_time don’t play well together
  o Automatic Service Request (ASR) / Configuration Manager Limitation
  o Grid Control / Third Party Certificates (September release)
  o Banners / SQLNET.ORA settings impact on tools
    (DEFAULT_SDU_SIZE=32767, ORA-12541)
Don’t assume security settings will not have impact. Must TEST!!!
19. Migration Strategies
10.2.0.4 to 11.2.0.x Options Considered
•DBFS with external tables (5 to 7 GB/sec file system I/O throughput)
•GoldenGate with datapump (near-zero downtime)
•Datapump
Factors
•Maintenance window
•Risk of data loss
•Familiarity with technology
Whatever the choice, perform multiple trial runs for optimal settings.
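To put the DBFS throughput figure in perspective, a back-of-the-envelope estimate of how long it takes to read the full data set at the quoted rates. This is a sketch only: it uses the 750 GB size from the background slide and the 5 to 7 GB/sec range above, and it ignores export time, network transfer, index builds, and validation, which dominate a real maintenance window:

```python
DATA_GB = 750  # database size from the background slide


def transfer_seconds(size_gb: float, rate_gb_per_sec: float) -> float:
    """Idealized time to move size_gb at a sustained rate (no overhead)."""
    return size_gb / rate_gb_per_sec


for rate in (5, 7):
    secs = transfer_seconds(DATA_GB, rate)
    print(f"{rate} GB/sec: {secs:.0f} seconds ({secs / 60:.1f} minutes)")
```

The point of the estimate is that file system I/O on the Exadata side is unlikely to be the bottleneck; trial runs are what reveal the real limits.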
20. 2011 – Technical Refresh (Data Center move)
On 9/9/11 at 7pm, application servers at the legacy site were turned off:
•Transferred encrypted Data Pump exports to Data Center
•Network outage occurred during data transfer (2 hours)
•On 9/10/11 at 7am, new system testing was initiated
•Users were on the system by 3pm
21. Migration Timeline
[Timeline figure, 1/11 to 9/11. Milestones shown include delivery of the test and production 1/4 racks, initial setup and test loads, migration option selection, DBFS setup, Grid Control setup, STIG testing, Bundle Patch (BP) application, migration tests, CAB, and cutover to Exadata.]
22. Post Tech Refresh Performance (in hours)
[Bar chart: Legacy vs. Exadata run times, on a 0 to 9 hour scale, for Exports, MV Refresh, BOR, Batch Process, and Index Rebuild.]
23. Exadata Lessons Learned
•Ensure hosting center can accommodate Exadata’s dimensions
•Staff requirements (more communication necessary)
•Testing required, ideally 2 Exadata Database machines
•Smart Scan <- direct path reads, table access full, fast full index scans,
parallel with parallel_degree_policy not auto
•Chained rows void smart scans
•EHCC 10x space and performance (DML)
•Data Warehouse (EHCC, SGA sizing)
•IORM not heavily used by customers
X2-8 (2010): Massive Memory -> X3 (2012): In-Memory, all I/Os to memory
24. Exadata Lessons Learned (cont.)
•Grid Control for monitoring / managing components
•Expect CPU utilization to decrease
•Expect Disk failures
•Can’t mix drive types, and pricey to switch
•Standard tuning principles apply (OLTP)
•DB link opportunities for tuning
•Platinum Support, major assistance
•Exachk and opatch before / after patching
•Time, Experience keys to stability
26. References (cont.)
Database Machine and Exadata Storage Server (888828.1)
Oracle Exadata Database Machine exachk (1070954.1)
Oracle Exadata Best Practices (757552.1)
Best Practices for OLTP on the Sun Oracle Database Machine (1269706.1)
Best Practices for Data Warehousing on Database Machine (1297112.1)
Oracle Sun Database Machine Application Best Practices for Data Warehousing
(1094934.1)
Oracle Sun Database Machine Diagnosability and Troubleshooting Best Practices
(1274324.1)
Expert Oracle Exadata (Osborne, Johnson, Poder)
32. Cache Hierarchy (Full Rack Example)
•Database DRAM: 768 GB raw capacity, 100 GB/second
•Flash Cache: 5 TB raw capacity, 50 GB/second
•Disk: 100 – 300 TB raw capacity, 21 GB/second
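One way to read the hierarchy is to ask how long a full pass over each tier would take at its stated bandwidth. A minimal sketch using the figures above, assuming decimal units (1 TB = 1000 GB) and the low end of the disk capacity range; the tier labels are taken from the slide, everything else is illustrative:

```python
# (tier name, raw capacity in GB, bandwidth in GB/second) from the slide.
# Assumption: decimal units, and 100 TB as the low end of the disk range.
TIERS = [
    ("Database DRAM", 768, 100),
    ("Flash Cache", 5 * 1000, 50),
    ("Disk", 100 * 1000, 21),
]


def full_scan_seconds(capacity_gb: float, gb_per_sec: float) -> float:
    """Idealized time to read an entire tier once at its rated bandwidth."""
    return capacity_gb / gb_per_sec


for name, capacity_gb, rate in TIERS:
    secs = full_scan_seconds(capacity_gb, rate)
    print(f"{name}: {secs:,.1f} seconds")
```

Each step down the hierarchy trades bandwidth for capacity, which is why keeping hot data in the upper tiers matters.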
33. Exadata issues
78 SRs and counting since Aug 2011, many Grid Control related
•DBFS slow performance (BP12, ac_timeout=60, fix_control=32, SGA size)
•DBFS password change (Support provided instructions)
•11.2.0.3.0 Upgrade issue (Disk issue morning of upgrade)
•ORA-01008 not all variables bound (cursor_sharing=similar, Patch 9877980)
•ORA-7445 [kkslMarkLiteralBinds()+214] (cursor_sharing=similar Patch 13627381)
•ORA-7445 [nstimeexp()+63] (alert log, Patch 12615660)
•Ossnet: connection failed (alert log, Patch 13536739)
•ORA-600 [2116] (alert log, BP6)
•ORA-7445 [__intel_new_memcpy] using expdp (Patch 13335183)
•Impdp fails ORA-39097 (Patch 13704684)
Disks go bad, especially early on
34. Exadata Rack Options
2 six-core Processors / 96 GB RAM per DB node
2 six-core Processors / 24 GB RAM per Storage Server
Dual ported 40 Gb/sec InfiniBand
Quarter Rack
•2 DB nodes
•2 Infiniband switches
Half Rack
•4 DB nodes
•3 Infiniband switches
Full Rack
•8 DB nodes
•3 Infiniband switches
InfiniBand 10x faster than Fibre Channel
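The per-node figures above scale linearly with the number of DB nodes. A small sketch that totals database-tier cores and memory for each rack option; the class and attribute names are hypothetical and only the per-node values and node counts come from the slide:

```python
from dataclasses import dataclass

CORES_PER_DB_NODE = 2 * 6   # two six-core processors per DB node
RAM_GB_PER_DB_NODE = 96     # RAM per DB node


@dataclass(frozen=True)
class RackOption:
    name: str
    db_nodes: int

    @property
    def total_cores(self) -> int:
        return self.db_nodes * CORES_PER_DB_NODE

    @property
    def total_ram_gb(self) -> int:
        return self.db_nodes * RAM_GB_PER_DB_NODE


RACKS = [RackOption("Quarter Rack", 2), RackOption("Half Rack", 4), RackOption("Full Rack", 8)]

for rack in RACKS:
    print(f"{rack.name}: {rack.total_cores} cores, {rack.total_ram_gb} GB RAM")
```

The quarter rack totals (24 cores, 192 GB) agree with the X2-2 specification slide, and the full rack total of 768 GB agrees with the Database DRAM figure in the cache hierarchy slide.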