Blueprint for Integrating Big Data Analytics and BI
1. Big Data Insight
Blueprint for Integrating Big Data Analytics and BI
Abe Taha, VP Engineering
abetaha@karmasphere.com
www.karmasphere.com
2. Big Data Insight
> Agenda
✓ Where does Big Data Analytics fit in the BI ecosystem?
✓ How does Big Data Analytics complement the type of analysis we do today using BI?
✓ What are clients doing with Big Data Analytics that they couldn’t do with BI?
✓ What do we need to think about to make Hadoop deployments successful?
2 Karmasphere Proprietary and Confidential. Do Not Copy. Do Not Distribute
5. Big Data Insight
> The Best of Both Worlds = Big Data Analytics + Traditional BI
               | Traditional BI          | Big Data Analytics
Purpose        | Reporting on business   | Optimizing the business
Paradigm       | Ask a specific question | Ask any question
Format         | Look at structured data | Look at all data
Setup          | Pre-engineered          | On-the-fly
Data locations | Siloed                  | One place
Agility        | Weeks to months         | Almost immediate
7. Big Data Insight
> What Hadoop Adopters Are Saying
“The kind of new stuff we want to do can’t get done with BI”
Large Hi Tech Chip Manufacturer
8. Big Data Insight
> How to make Hadoop successful with BI
1. Employ All Data
2. Use All Analytic Assets
3. Provide Self-Service Access for All Users
4. Build a Collaborative Environment
5. Be Open and Extensible
6. Populate Best-of-Breed Reporting Tools
9. Big Data Insight
> Cornerstone 1: Employ All Data
✓ Leave No Data Behind
• Raw unstructured data – Web logs, machine/sensor data, mobile, social, video, etc.
• Structured data – traditional RDBMS, EDWs
• Streaming vs. batch oriented
• Data governance and quality
10. Big Data Insight
> Cornerstone 2: Use All Analytic Assets
✓ Employ All Analytic Assets
• Traditional models and assets
• Standard Hadoop components including UDFs and SerDes
• Custom algorithms
• Models created in other systems such as SAS/R
11. Big Data Insight
> Cornerstone 3: Provide Self-Service Access for All Users
✓ Self-Service
• BYOD: Bring Your Own Data
• Ingest custom functions and algorithms
• Intuitive, no special skill sets required
✓ Empower All Users and Skill Sets
• Business Users
  • Easy-to-use ad hoc analysis, web-based forms
  • Drag and drop
• Data Analysts
  • Common skills: SQL
  • Powerful iterative analysis
  • Analytical models and algorithms
• Customers and Partners for ecosystem
12. Big Data Insight
> Cornerstone 4: Build a Collaborative Environment
✓ Collaborative
• Project-based environment
• Leverage cross-functional skills
• Security and isolation
✓ Social
• Share data and insights across teams
• Metadata, queries, results and visualizations
• View colleagues’ activities
• Usage feedback and metrics
13. Big Data Insight
> Cornerstone 5: Be Open and Extensible
✓ Open
• Active community, rapid innovation
• Vendor commitment
• Standards based
• Portable – no vendor lock-in
• Expose standard APIs and interfaces
✓ Extensible
• Add custom functions
• Reuse existing analytic models
• Add additional data sources by defining custom parsers
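As an illustration of the "custom parsers" point above, here is a minimal sketch of what such a parser might look like for a raw web log feed. It is not Karmasphere's or Hadoop's API: the `LOG_PATTERN` expression, the `parse_log_line` function, and the sample line are all hypothetical, and a real feed would need its own pattern.

```python
import re
from typing import Optional

# Hypothetical pattern for an Apache-style access log line (an assumption for
# illustration; real log formats vary and need their own expressions).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line: str) -> Optional[dict]:
    """Turn one raw log line into a structured record, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # Unparseable lines are skipped rather than guessed at.
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" size means no body was sent; store it as 0 bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Made-up sample line, used only to exercise the parser.
sample = '10.0.0.1 - - [12/Mar/2012:10:15:32 -0700] "GET /index.html HTTP/1.1" 200 5120'
record = parse_log_line(sample)
```

Returning `None` for lines that do not match keeps bad input out of downstream analysis, which ties back to the data governance and quality point in Cornerstone 1.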
14. Big Data Insight
> Cornerstone 6: Populate Best-of-Breed Reporting Tools
✓ Best-of-Breed Reporting Tools
• Ingest data from existing BI systems and ad hoc data, including spreadsheet data
• Automate delivery of insights
• Push insights to RDBMS, EDWs and MPP
• Expose standard APIs for programmability
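To make "push insights to RDBMS" concrete, the sketch below writes a small set of aggregated results into a relational table that a reporting tool could then query. It uses Python's built-in `sqlite3` module purely as a stand-in for a production RDBMS or EDW; the `daily_page_views` table, the `push_insights` function, and the sample rows are assumptions made for illustration, not part of the original deck.

```python
import sqlite3


def push_insights(conn: sqlite3.Connection, rows: list) -> None:
    """Write aggregated (day, page, views) rows into a reporting table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_page_views ("
        "day TEXT, page TEXT, views INTEGER, PRIMARY KEY (day, page))"
    )
    # INSERT OR REPLACE makes repeated pushes of the same day/page idempotent.
    conn.executemany(
        "INSERT OR REPLACE INTO daily_page_views (day, page, views) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


# In-memory database and made-up rows, standing in for results of a Hadoop job.
conn = sqlite3.connect(":memory:")
insights = [("2012-03-12", "/index.html", 1042), ("2012-03-12", "/pricing", 311)]
push_insights(conn, insights)

# A reporting tool would issue a query along these lines.
top_pages = conn.execute(
    "SELECT page, views FROM daily_page_views ORDER BY views DESC"
).fetchall()
```

Keying the table on day and page means an automated, scheduled push can rerun safely without duplicating rows.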
15. Big Data Insight
> How would an architecture look?
16. Big Data Insight
> Summary
1. Implement Big Data Analytics and BI co-existence: Hadoop at your fingertips
2. Leverage all your assets
3. Use and build on open and extensible solutions across your company…
4. Build in social and collaborative features early
17. Big Data Insight
> Summary: Get the Best of Both Worlds – Build a Bridge Inside Your Company
Big Data Analytics on Hadoop
• Future, see intent
• Drives optimization
• Just getting started
BI
• Historical
• Drives reporting
• Entrenched
• Will be around for a long time