The document discusses 4 strategies for disaster recovery in real-time data warehousing: 1) Separate operational warehouses from reporting systems, 2) Use changed data capture and Big Data replication, 3) Implement parallel, active-active data warehouses, and 4) Maintain a "golden event" warehouse in Hadoop. It then describes Tervela Turbo, a new product that can capture, share, and distribute big data for mission-critical analytics applications.
2. What you will learn: 4 strategies
1. Separate operational warehouses from reporting systems
2. Use changed data capture and Big Data replication
3. Implement parallel, active-active data warehouses
4. Maintain a “golden event” warehouse in Hadoop
Confidential & Proprietary 2
3. Analytics Have a Measurable Effect
• For the median Fortune 1000 Company, a
10% increase in data usability corresponds to
$2.01B in annual revenue gains
Big Data, Big Opportunity – University of Texas at Austin, Sept 2011
• A “real-time infrastructure” ranks
#3 on the CIO’s list of strategies
Gartner
• Organizations adept at analytics see
1.6x the revenue growth
2.0x the profit growth, and
2.5x the stock price appreciation
of their peers – “Outperforming in a Data-Rich and Hyper-Connected World.”
IBM Center for Applied Insights and Economic Intelligence
4. Data Warehousing: Now Part of Operations
real-time pricing
real-time marketing
fraud detection
inventory management
customer service
5. Analytics in Business Operations:
Constant, Up-to-Minute Access to Big Data
ADVERTISING: Click-stream, Mobile ads
CAPITAL MARKETS: Market Data, Securities Trading
UTILITIES: Energy usage, Power production
TRANSPORTATION: Traffic & Logistics, Fleet Deployment
INFORMATION TECHNOLOGY: Network Activity, IT Root-Cause
TELECOMMUNICATIONS: Call Activity, Capacity Allocation
7. What we need…vs. what we have
Up-Time
Need: SLAs: 99.999%
Have: Backup and recovery can take days in the event of an outage or system failure
Real-time
Need: Access to information as it happens
Have: ETL processes can take hours before information is available
Distribution
Need: Add new applications as the business demands
Have: Access to warehouse is tightly controlled; performance bottlenecks of a single database can impact mission-critical systems
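
To put the 99.999% SLA next to a recovery that takes days, the arithmetic can be sketched as follows. This is an illustrative calculation only; the function name and the one-day outage figure are assumptions, not numbers from the slides.

```python
# Allowed downtime per year for a given availability SLA.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Minutes per year a system may be down and still meet the SLA."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


# Downtime budget under a "five nines" SLA: a little over five minutes a year.
five_nines = allowed_downtime_minutes(99.999)

# Hypothetical outage: a recovery that takes one full day.
one_day_outage = 24 * 60  # minutes

# How many years of downtime budget that single outage would consume.
years_of_budget = one_day_outage / five_nines

print(f"Budget at 99.999%: {five_nines:.3f} minutes per year")
print(f"A one-day recovery uses about {years_of_budget:.0f} years of budget")
```

Under these assumptions, a single recovery measured in days is far outside what a five-nines SLA allows, which is the gap the slide points at.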
8. 4 disaster recovery strategies for big data
1. Separate operational warehouses from reporting systems
2. Use changed data capture and Big Data replication
3. Implement parallel, active-active data warehousing
4. Maintain a “golden event” warehouse in Hadoop
9. 1. Separate operations from reporting
Run day-to-day applications in one place. Ad-hoc reporting happens in a separate warehouse.
Diagram labels: Operations application, Primary Warehouse, DB2, WAN, Secondary Reporting Warehouse
BENEFIT
Better control over performance
CHALLENGE
Keeping changes in sync
10. 2. Changed data capture
Determine what has changed, then replicate it to achieve parity between environments
Diagram labels: Primary Cluster, application, Data Fabric, 1 GB/s, WAN, Reporting Cluster
Data Fabric: 250 MB/s per box, load-balanced, linearly scalable, built-in persistence
BENEFIT
Quickly propagate changes to remote sites
CHALLENGE
Identifying changes is difficult. As the volume of data continues to grow, this approach becomes a stop-gap.
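
The "determine what has changed, then replicate it" step can be sketched as below. This is a minimal illustration that compares two snapshots of a table. Production changed data capture usually reads the database transaction log instead, and all names here (`capture_changes`, `apply_changes`, the sample rows) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
# A change is (operation, primary key, new row or None for deletes).
Change = Tuple[str, str, Optional[Row]]


def capture_changes(previous: Dict[str, Row], current: Dict[str, Row]) -> List[Change]:
    """Diff two snapshots keyed by primary key and list what changed."""
    changes: List[Change] = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key in previous:
        if key not in current:
            changes.append(("delete", key, None))
    return changes


def apply_changes(replica: Dict[str, Row], changes: List[Change]) -> None:
    """Replay captured changes on the remote copy to restore parity."""
    for operation, key, row in changes:
        if operation == "delete":
            replica.pop(key, None)
        else:
            replica[key] = dict(row)


# Hypothetical primary table before and after some activity.
before = {"1": {"sku": "A", "qty": 5}, "2": {"sku": "B", "qty": 7}}
after = {"1": {"sku": "A", "qty": 4}, "3": {"sku": "C", "qty": 9}}

# The reporting site starts in sync with the old snapshot.
reporting = {key: dict(row) for key, row in before.items()}

changes = capture_changes(before, after)
apply_changes(reporting, changes)
```

Only the three changed rows cross the WAN rather than the full table. The full-snapshot comparison is also where the stated challenge shows up: the cost of identifying changes this way grows with the size of the data.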
11. 3. Parallel, active-active data warehousing
Capture application data streams and load to parallel data warehouses over the WAN
Diagram labels: Primary Cluster, Data Fabric, 1 GB/s, WAN, Reporting Cluster
Data Fabric: 250 MB/s per box, load-balanced, linearly scalable, built-in persistence
BENEFIT
Multiple warehouses are kept up to date
CHALLENGE
Synchronization of many data streams
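
One way to picture loading the same stream into parallel warehouses is sketched below. It assumes each event carries a sequence number and that each warehouse keeps only the newest value per key, so duplicates and out-of-order delivery over the WAN do not cause the copies to diverge. The `Event` and `Warehouse` classes are hypothetical and are not part of any product described in this deck.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Event:
    seq: int      # sequence number assigned at capture time (assumed)
    key: str
    value: float


@dataclass
class Warehouse:
    name: str
    # key -> (sequence number of the last applied event, value)
    rows: Dict[str, Tuple[int, float]] = field(default_factory=dict)

    def load(self, event: Event) -> None:
        """Apply an event only if it is newer than what is already stored."""
        current = self.rows.get(event.key)
        if current is None or event.seq > current[0]:
            self.rows[event.key] = (event.seq, event.value)


events = [
    Event(1, "sku-A", 100.0),
    Event(2, "sku-B", 30.0),
    Event(3, "sku-A", 101.5),
]

primary = Warehouse("primary")
reporting = Warehouse("reporting")

# The primary site receives the stream in order.
for event in events:
    primary.load(event)

# The remote site receives it reversed, with one event delivered twice.
for event in reversed(events + events[:1]):
    reporting.load(event)
```

Both warehouses end with the same rows even though they saw the stream in different orders. Keeping many such streams consistent with each other is the harder synchronization problem the slide names, and this sketch does not address it.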
12. 4. “Golden Event” store
Capture raw data and store it in Hadoop
Diagram labels: Primary application, Data Fabric, Data Warehouse, Reporting Data Warehouse (Optional), Golden Event Store, New Apps & Analytics
Data Fabric: 250 MB/s per box, load-balanced, linearly scalable, built-in persistence
BENEFIT
New analytics are always possible
CHALLENGE
Best practices are only just being developed
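
The idea that new analytics are always possible when raw events are kept can be sketched as follows. An in-memory list stands in for files in Hadoop, and the class name, event fields, and both analytics are hypothetical.

```python
import json
from collections import Counter
from typing import Iterator, List


class GoldenEventStore:
    """Append-only log of raw events (in-memory stand-in for Hadoop files)."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def append(self, event: dict) -> None:
        # Store the event exactly as captured, one JSON document per line.
        self._lines.append(json.dumps(event, sort_keys=True))

    def replay(self) -> Iterator[dict]:
        # Re-read every raw event from the beginning.
        for line in self._lines:
            yield json.loads(line)


store = GoldenEventStore()
store.append({"type": "click", "user": "u1", "page": "/home"})
store.append({"type": "purchase", "user": "u1", "amount": 20})
store.append({"type": "click", "user": "u2", "page": "/home"})

# An analytic that existed when the data was captured.
clicks_per_page = Counter(
    event["page"] for event in store.replay() if event["type"] == "click"
)

# An analytic added later, computed by replaying the same raw events.
revenue_per_user: Counter = Counter()
for event in store.replay():
    if event["type"] == "purchase":
        revenue_per_user[event["user"]] += event["amount"]
```

Because nothing is aggregated or discarded at capture time, the second analytic needs no change to the capture path, only a new pass over the stored events.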
13. About Tervela Turbo
• New release!
• Capture, share, and distribute data
• Accelerate any of the use cases we discussed today
14. Big Data Requires Big Data Movement
As companies
implement more big
data solutions, the
need to use high-
performance message
delivery with those
systems will grow.
Gartner: Hype Cycle for Big Data, 2012
15. Key Features and Benefits of Tervela Turbo
Key Features: Data Capture
• Adapters for top data stores
• Flexible multi-language API
• Real-time acquisition
Key Benefit: Real-Time
Regardless of data volume or number of sources
Key Features: Data Availability
• Parallel loading
• Large-volume buffering
• Automatic retry
• Data replay
Key Benefit: Reliable
For mission-critical operations that can’t go down
Key Features: Data Distribution
• Continuous loading
• No disruption with bad consumers
• Warehouses, DBs, Hadoop, etc
• Web, mobile, custom apps
Key Benefit: Multi-Platform
Feeds explosion of analytic apps on any platform without disrupting other consumers
16. Learn More About Big Data Movement
Capture, Share, and Distribute Big Data For Mission-Critical Analytics
Access videos, how-to guides, and other educational materials at: tervela.com/datafabric
www.terverla.com
@tervela
info@tervela.com
Editor's notes
Big Data, Big Opportunity – University of Texas at Austin, Sept 2011
A “real-time infrastructure” – Gartner – (ranks 3rd after “developing business solutions” and “reducing the cost of IT”)
Organizations using analytics for competitive advantage – “Outperforming in a Data-Rich and Hyper-Connected World.” IBM Center for Applied Insights and Economic Intelligence
Use this, instead, for “new role of the data warehouse” slide??
Benefit: better manage performance
Challenge: Keep reporting systems up to date with changes