EDW Optimization Solution
1. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Scott Gnau, CTO
@Scott_Gnau
2.
The Next Gen EDW is the Big Data Warehouse
In Forrester’s 2016 global survey, 59% of respondents stated that leveraging big data
and analytics was a critical or high priority.
3.
Companies Are Looking to Big Data for EDW Optimization
82% of 2550+ respondents are looking to Big Data for EDW Optimization rather than a
straight replacement. – 2016 Big Data Maturity Survey
4.
Hortonworks Connected Data Platforms and Solutions
Hortonworks Solutions
– Enterprise Data Warehouse Optimization
– Cyber Security and Threat Management
– Internet of Things and Streaming Analytics
Hortonworks Connection
– Subscription Support
– SmartSense
– Premier Support
– Educational Services
– Professional Services
– Community Connection
Cloud
– Hortonworks Data Cloud: AWS, HDInsight
Data Center
– Hortonworks Data Suite: HDF, HDP
5.
Drivers of a Modern BI Infrastructure
– Deeper and Broader Data Sets
– Complete Data ‘Provenance’
– Leading Analytics and Tools
– Integrate non-EDW data and EDW data
– Total Cost of Ownership
6.
Open Source Transformational Impact to EDW
– Unmatched Economics: supports low-cost data-center and cloud
architectures for Enterprise Apache Hadoop
– Eliminates Risk and Ensures Integration: prevents vendor lock-in and
speeds ecosystem adoption of an ODPi-compliant core
[Chart: axes Cost Efficiency and Data Variety; labels: EDW, Proprietary, Hadoop, Hortonworks, Open Source, RDBMS]
7.
But, why aren’t more companies running to this solution?
– Risky
– Hadoop requires a bunch of new skill sets
– It’ll take a long time
– There’s too much manual coding required
– It’s hard to integrate to my BI tool stack
8.
Legacy EDW vs. EDW Optimization Solution with Connected Data Platforms
9.
EDW Optimization: Fast BI on Hadoop
The Problem:
– Legacy EDW systems were adopted for Fast BI
and deep slice-and-dice analytics, but EDW
costs can limit breadth and depth of these
analytics.
The Solution:
– Interactive SQL is a reality on Hadoop today.
– AtScale Intelligence Platform adds OLAP
capabilities for deep drilldown at scale.
The Result:
– Query terabytes of data in seconds.
– Connect your favorite BI tools like Tableau and
Excel through SQL and MDX interfaces.
– The EDW Optimization Solution is tailor-made
to deliver Fast BI on Hadoop.
[Diagram: EDW Optimization Solution. Components shown: ETL/ELT, data landing and deep archive, data mart, cube mart, end users and apps]
10.
EDW Optimization: ETL Offload
The Problem:
– EDWs can consume between 50% and 90% of
resources just on ETL/ELT tasks.
– These jobs interfere with more business-
critical tasks like BI and advanced analytics.
The Solution:
– Hive and HDP deliver ETL that scales to
petabytes.
– Syncsort DMX-h for simple drag-and-drop ETL
workflows.
– Economical scale-out processing on
commodity servers.
The Result:
– Better SLAs for mission-critical analytics.
– Limit EDW expansion or retire old systems.
11.
EDW Optimization: Active Archive
The Problem:
– Increasing data volumes and cost pressure
force data to be archived to tape.
– Archived data not available for analytics, or
must be retrieved at great expense.
The Solution:
– Adopting Hadoop delivers cost per terabyte
on par with tape backup solutions.
– Data in Hadoop can be analyzed by all major
BI tools, allowing analytics on archive data.
The Result:
– Data always available for analytics.
– Store years of data rather than months.
12.
Multi-Channel Behavioral Analysis
Industry: Mass Media
– Largest broadcasting and cable company
in the world by revenue
– Multiple channels: Cable (set-top-box),
wireless devices, streaming
programming
– 22 million+ subscribers (internet &
video)
Results:
– Scalability: 480B rows, 500 nodes
– 60x query performance improvement
– Insights: New info improves negotiations
– Loyalty: Outreach to customers viewing
competitive streams; ▼ churn, ▲ revenue
[Diagram: Before and After architectures for a leading media company. Before: Hortonworks HDP with a Netezza Data Mart. After: Hortonworks HDP with AtScale Intelligence Server. Other labels: Channel Feeds; Tableau + MS Excel + R; Tableau + MS Excel]
13.
Campaign Paid-Search Effectiveness: Retail
Industry: Retail / eCommerce
– Top US department store (by rev)
– Online sales $4B+ & growing (11%+ total)
– 800+ department stores nationwide
Results
– Scale: Millions of paid keywords analyzed
– Speed: Eliminate extract step
– Insight: Operationalized closed-loop
analysis → insight → decision → action
– Impact: Make and save $ millions with
instant bid decisions over the 6-week season
that drives 60% of annual revenue
[Diagram: Before and After architectures for a leading retailer. Before: Hortonworks HDP with Vertica Data Marts. After: Hortonworks HDP with AtScale Intelligence Server. Other labels: Ad & Paid Keywords; Cognos + Tableau + Excel; Tableau + Excel]
14.
Client and Patient Analysis
Industry: Managed Health Care
– Member of Fortune 100
– Health, life + other insurance products
– ~52 million members;
medical/dental/pharm
Results
– Scalable: BI directly on data across 264+ nodes
– Time: Eliminate data movement step
– 62x query performance improvement
– Speed: <2.2 second average query time
– Insight: Tableau on Hadoop for 1000+
– Security: Access control by user; HIPAA
[Diagram: Before and After architectures for a leading managed healthcare provider. Before: Hortonworks HDP with a Netezza Data Mart. After: Hortonworks HDP with AtScale Intelligence Server. Both sides: Client / Patient Details as the source, Tableau + MS Excel as the BI tools]
15.
Solution Architecture
[Diagram: inbound data from EDW/Legacy systems passes through the DMX Data Funnel and DMX-h Engine into the Hortonworks Data Platform (HDP): HDFS (base data and aggregates stored in ORC), Hive (batch and interactive SQL), and multitenant processing on YARN (Syncsort, LLAP, Spark, Tez), with the AtScale virtual cube on top]
High Level Flow
1. Install HDP, Syncsort and AtScale
2. Install EDW/Hive drivers on the edge node
3. Bring all tables involved in the use case into Hive using
the Syncsort data funnel
4. Build a virtual cube using AtScale
5. Build aggregates in AtScale for optimization
6. Query data using a BI tool such as Tableau or Excel
through an ODBC/JDBC connection
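Steps 3, 5 and 6 of the flow can be pictured with a small runnable sketch. It assumes sqlite3 as a local stand-in for Hive, and the table and column names (sales, sales_by_store, store, day, amount) are hypothetical. It is not the Syncsort or AtScale API, only an illustration of why a prebuilt aggregate can answer the same BI question from fewer rows.

```python
import sqlite3

# Stand-in only: sqlite3 plays the role of Hive so the sketch runs anywhere.
# All table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, day TEXT, amount REAL)")

# Step 3 analogue: land the rows of the use-case table.
rows = [
    ("north", "2017-01-01", 100.0),
    ("north", "2017-01-02", 50.0),
    ("south", "2017-01-01", 75.0),
    ("south", "2017-01-02", 25.0),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Step 5 analogue: build an aggregate once so later queries scan fewer rows.
conn.execute(
    "CREATE TABLE sales_by_store AS "
    "SELECT store, SUM(amount) AS total FROM sales GROUP BY store"
)

# Step 6 analogue: the same question answered from detail and from the aggregate.
from_detail = dict(conn.execute("SELECT store, SUM(amount) FROM sales GROUP BY store"))
from_aggregate = dict(conn.execute("SELECT store, total FROM sales_by_store"))
```

Both dictionaries hold the same per-store totals; the aggregate table simply reaches them by reading one row per store instead of every detail row.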
16.
Hortonworks EDW Optimization Solution Components
Source Data Systems
Syncsort: High-Performance Data Movement
– High performance data import from all major EDW platforms
Hadoop: Scalable Storage and Compute
Hive LLAP: High Performance SQL Data Mart
– Fast, scalable SQL analytics
– Intelligent in-memory caching
AtScale Intelligence Platform: OLAP Cubes for Higher Performance
– Define OLAP cubes for 10x faster queries
– Unified semantic layer for all BI tools
Pre-aggregated data ... or full-fidelity data
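The choice between pre-aggregated and full-fidelity data can be sketched as a simple routing rule. This is an illustration under stated assumptions, not AtScale's actual logic: the Aggregate class, the choose_table function, and every table name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    """A hypothetical pre-aggregated table and what it can answer."""
    name: str
    dimensions: frozenset
    measures: frozenset


def choose_table(query_dims, query_measures, aggregates, detail_table="fact_detail"):
    """Return the name of a covering aggregate, else the full-fidelity table."""
    dims = frozenset(query_dims)
    measures = frozenset(query_measures)
    covering = [
        a for a in aggregates
        if dims <= a.dimensions and measures <= a.measures
    ]
    if not covering:
        # No aggregate holds every requested dimension and measure.
        return detail_table
    # Assumption: fewer dimensions means a coarser, smaller table.
    return min(covering, key=lambda a: len(a.dimensions)).name


aggregates = [
    Aggregate("agg_store_day", frozenset({"store", "day"}), frozenset({"sales"})),
    Aggregate("agg_store", frozenset({"store"}), frozenset({"sales"})),
]
by_store = choose_table(["store"], ["sales"], aggregates)
by_customer = choose_table(["customer"], ["sales"], aggregates)
```

Here a query by store is served from agg_store, while a query by customer, which no aggregate covers, falls back to the full-fidelity table.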
17.
ETL Workflow Onboarding: Syncsort DMX-h
18.
Hybrid Query Service
❑ Choice of BI Tool
❑ Zero Client Install
❑ Secure Data Access
❑ Optimized Queries
19.
Enterprise Data Optimization Solution Components
12 month license and support offering
– Hortonworks: 24 nodes of Enterprise Plus Support
– Syncsort: 24 nodes of DMX-h
– AtScale: 24 nodes of AtScale Intelligence Platform
Pre-packaged Professional Services
– Single legacy data source
– 1 fact table with 5 dimensions
– Load up to 15 tables
– One-time data dump
– Up to 1 cube with 10 measures
– 1 BI connection
– 5 TB total cube limit
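The package limits listed above can be written down as a small scope check. A minimal sketch: the limit values come from the slide, while the key names and the scope_violations helper are hypothetical.

```python
# Limits as listed on the slide; the key names are hypothetical.
PACKAGE_LIMITS = {
    "data_sources": 1,
    "fact_tables": 1,
    "dimensions": 5,
    "tables_loaded": 15,
    "cubes": 1,
    "measures": 10,
    "bi_connections": 1,
    "cube_size_tb": 5,
}


def scope_violations(requested):
    """List every requested quantity that exceeds its package limit."""
    return [
        f"{key}: requested {requested[key]}, limit {limit}"
        for key, limit in PACKAGE_LIMITS.items()
        if requested.get(key, 0) > limit
    ]


# Example: a use case that needs 7 dimensions but stays within the measure limit.
problems = scope_violations({"dimensions": 7, "measures": 10})
```

An empty list means the proposed use case fits the package as described; anything else names the limit that would need to be renegotiated.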
20.
Future Proof
Hive Optimizations
– Hive, Tez, ORC, LLAP
– Additional SQL coverage
ACID MERGE for SQL:2011 compliance (upsert)
Business Continuity Options
– Replication
– Backup/Restore
Additional Hive options in tech preview in 2.6
21.
EDW Package: Professional Services ‘Proof of Value’
1. Install HDP, AtScale and Syncsort
2. Configure drivers for appropriate EDW and Hive on Edge Node
3. Enable and configure Interactive Hive (LLAP)
4. Ingest data from 1 legacy system
5. Create up to 3 BI cubes
6. Support connection to BI Tool
7. Demo of capabilities (functionality and performance), with under 10 second response time
8. Solution Architecture Document and Schema definition
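The "under 10 second" demo target in step 7 could be checked with a timing wrapper along these lines. The within_target helper and the run_query callable are hypothetical stand-ins for whatever client actually issues the BI query.

```python
import time


def within_target(run_query, target_seconds=10.0):
    """Run a query callable once and report (elapsed seconds, met target?)."""
    start = time.perf_counter()
    run_query()
    elapsed = time.perf_counter() - start
    return elapsed, elapsed < target_seconds


# Placeholder workload; a real check would call the BI tool's query here.
elapsed, ok = within_target(lambda: sum(range(1000)))
```

A single run is only a smoke test; repeating the call and looking at the slowest run would give a more honest picture of response time.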
22.
EDW Optimization Solution - Try It Now!
– Tool-based approach means we can leverage existing skillsets
– Proof points in 60 days
– Integrated into my BI tool stack
– Hive supports scaled queries and fast queries
– It works!
23.
To Learn More
Everyone will receive a free copy of the Forrester white paper titled “The Next-Generation
EDW Is The Big Data Warehouse”
EDW Optimization with HDP
– http://hortonworks.com/solutions/edw-optimization/
– EDW Optimization 7 min video
AtScale Intelligence Platform
– http://hortonworks.com/partner/atscale/
Syncsort DMX-h
– http://hortonworks.com/partner/syncsort/