Gab Gennai, Cloudera - Going Beyond Traditional Analytics
1. 1
Intel and Cloudera:
Going Beyond Traditional Analytics
Gabriel Gennai
Technical Services Director – APAC, Cloudera
August 2014
2. 2
Cloudera Snapshot
Founded: 2008, by former employees of
Employees Today: ~700
World Class Support: 24x7 global staff; proactive & predictive support programs
Mission Critical: Thousands of enterprise users; over 400 paying subscription customers
The Largest Ecosystem: Over 1,000 partners
Cloudera University: Over 50,000 trained
Open Source Leaders: Cloudera employees are leading developers & contributors
Total Capital Raised: A lot! (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock)
Mission: Help organizations leverage the power of all their data to ask bigger questions.
3. 3
Expanding Big Data Requires A New Approach
1980s: Bring Data to Compute
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
Now: Bring Compute to Data
Information-centric businesses use all data:
• Multi-structured, internal & external data of all types
[Diagram: relative size & complexity of data versus compute]
4. 4
The Old Way: Bringing Data to Compute
(1) Missing Data
• Leaving data behind
• Risk and compliance
• High cost of storage
(2) Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data
(3) Cost of Analytics
• Existing systems strained
• No agility
• “BI backlog”
(4) Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views
SERVERS MARTS EDWS DOCUMENTS STORAGE SEARCH ARCHIVE
ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES
5. 5
Why Is This Happening Now?
• Consumption
• Connections
• Activity
• Pace & Speed
• Instrumentation
• Sensor Data
• Location Points, Metrics
• Tweets, Images
• Fuel band
• Exploration
• Ease of Accessibility
• Faster research
• Value
• Asset
• Drive new business Models
6. 6
SERVERS MARTS EDWS DOCUMENTS STORAGE SEARCH ARCHIVE
ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES
The New Way: Bringing Compute to Data
(1) Active Compliance Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage
(2) Persistent Staging
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper
(3) Self-Service Exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests
(4) Diverse Analytic Platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility
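The “schema on read” idea on this slide can be illustrated with a minimal sketch. It assumes raw events are kept as JSON lines exactly as they arrived; the sample records, the `read_with_schema` helper, and the field names are hypothetical and are not part of any Cloudera product or API. The point is only that each reader chooses its own fields and types when it reads, rather than modeling the data up front.

```python
import json

# Raw events stored as they arrived (assumed sample data, full fidelity).
RAW_EVENTS = [
    '{"user": "a", "page": "/home", "ms": 120}',
    '{"user": "b", "page": "/cart", "ms": 340, "referrer": "ad"}',
    '{"user": "a", "page": "/cart"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: keep only the requested fields and
    cast each one; fields missing from a record become None."""
    for line in raw_lines:
        record = json.loads(line)
        yield {
            name: (cast(record[name]) if name in record else None)
            for name, cast in schema.items()
        }


# Two different views over the same stored data, each chosen by the reader.
page_views = list(read_with_schema(RAW_EVENTS, {"user": str, "page": str}))
latency = list(read_with_schema(RAW_EVENTS, {"page": str, "ms": float}))
```

Nothing is rewritten or dropped when a new question comes up: a later reader could ask for `referrer` without any change to the stored events.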
7. 7
From Hadoop to an Enterprise Data Hub
✔ Open Source, Scalable, Flexible, Cost-Effective
✔ Managed
✔ Open Architecture
✔ Secure and Governed
CLOUDERA’S ENTERPRISE DATA HUB
BATCH PROCESSING, ANALYTIC SQL, SEARCH ENGINE, MACHINE LEARNING, STREAM PROCESSING, 3RD PARTY APPS
WORKLOAD MANAGEMENT
STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE
Filesystem, Online NoSQL
DATA MANAGEMENT
SYSTEM MANAGEMENT
8. 8
Enabling the App Store of Big Data
Software (BI, Analytics, & Data Integration)
System Integration Cloud & MSP
Hardware Database/Platform
Note: Display Cloudera Connect Platinum and Gold partners only
9. 9
2014 Gartner MQ for Data Warehouse DBMS
“A data warehouse DBMS is now expected
to coordinate data virtualization strategies,
and distributed file and/or processing
approaches, to address changes in data
management and access requirements.”
10. 10
Discover New Use Cases
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; website optimization
HEALTH CARE: Patient sensors, monitoring, EHRs; quality of care
FINANCIAL SERVICES: Risk & portfolio analysis; new products
MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
CONSUMER PACKAGED GOODS: Sentiment analysis of what’s hot, customer service
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; customer sentiment
RETAIL: Consumer sentiment; optimized marketing
LAW ENFORCEMENT & DEFENSE: Threat analysis, social media monitoring, photo analysis
EDUCATION & RESEARCH: Experiment sensor analysis
LIFE SCIENCES: Clinical trials; genomics
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; warranty analysis
UTILITIES: Smart Meter analysis for network capacity
OIL & GAS: Drilling exploration sensor analysis
11. 11
High Res Images
• Multiple times a day
• 1 TB per day
• Require heavy processing due to zero light in space
• RDBMS could not scale
• Now using Cloudera Manager, Impala, and Search to perform business analysis
• Compare images from today or from many years ago
12. 12
R & D Decisions
• 5-10 years for new crop development
• Data was stored in silos
• Could not combine results
• Bring data together, both internal & external (new data)
• Researchers now share data
• Examine data at speed
• Data-driven decisions and a shorter time to develop
13. 13
Oil & Gas Discovery
• Cost reduction of deep-water drill ships
• Analyzing wave & seismic data and converting it to pictures
• Collects X & Y coordinates, wave source & target, and the way it was collected
• Important because drill ships cost $1m per day
• The more data it collects, the better the search
14. 14
Healthcare
• Clinical, financial & operational data kept separate
• Analysis took days & months
• Increasing costs of storage
• Cloudera Search & Impala
• Quicker analysis: now minutes & hours
• Quick decisions
• 360-degree patient view
• What equipment to buy, based on analysis
• Can the patient go to a local doctor?
15. 15
Insurance
• Could not predict patterns and
merge information.
• Policies were based on minimal
information.
- Detailed historical weather
patterns by neighborhood
- Actual flooding based on
comprehensive data
- Detailed fine-grained
topographical maps
- Erosion data from coastal dunes
- Detailed construction details
- Aerial photos & survey data
16. 16
Government
• Problem with Web access through VPN
• Tracked data to identify suspicious behavior
• Requirement to prevent fraud
• Logs grew too big
• Cloudera deployed to manage petabytes
• HBase used to detect patterns and fraud
• Performs in real time
• Find the problem first
18. 18
A High Level View of the Journey
Operational Efficiency (Faster, Bigger, Cheaper)
Transformative Applications (New Business Value)
Stages shown: Cheap Storage, ETL Acceleration, EDW Optimization, Agile Exploration, Data Science, Converged Analytics
Axis labels: Business, IT
21. 21
Cloudera CONNECT Partner Levels
• Not all partners are created equal
• Partner level determines the benefits and
requirements for a partner
Bronze Silver Gold Platinum
22. 22
Partner Program Progression
Visit Cloudera.com
• View website
• Identify areas of synergy
• Explore Cloudera Connect program information
Submit Application
• Register first on Cloudera.com
• Select appropriate program
• Explain partnership goals
• Self-identify
Receive Acceptance
• Start as bronze member
• Receive Welcome
• Access Cloudera Connect portal and partner logo & guidelines
• Request developer license without support
• Self-create assets
Deepen Relationship
• Take online e-learning classes
• Use 20% Discount on Cloudera training
• Work on product certification
• Use Cloudera Connect logo on partner collateral
• Meet in the field on joint opportunities
Explore Silver Level Membership
• Invest in jointly created collateral
• Receive marketing support
• Increase collaboration with Cloudera
• Receive more sales assistance for qualified opportunities
Invitation to Key Partners to become a Gold Partner
• Must drive a certain level of revenue with us/for us.
• Well-defined joint sales play.
Very Select Partners invited to Platinum
• Must drive >$2M+ in revenue for a defined joint sales play and be an industry leader
23. 23
Cloudera Connect Partner programs
Cloudera Connect Partnership Level
Program Requirements | Bronze | Silver | Gold | Platinum
Program Agreement
Cloudera Reseller Agreement | Online Application only | ✔ | ✔ | ✔
Program Membership Fee | N/A | N/A | US$12k | US$15k
Training & Certification
Technical Training [1] | N/A | 2 Trained & Certified Admin | 4 Trained & Certified Admin | 6 Trained & Certified Admin
Technical Training [1] | N/A | 2 Trained & Certified Developer | 4 Trained & Certified Developer | 6 Trained & Certified Developer
Technical Training [1] | N/A | 2 Trained Data Analysts | 4 Trained & Certified Data Analysts | 6 Trained & Certified Data Analysts
Sales Training [2] | ✔ | ✔ | ✔ | ✔
Dedicated Resources [3] | N/A | N/A | 1 Business Development Manager | 1 Business Development Manager
Dedicated Resources [3] | N/A | N/A | N/A | 2 Solutions Architects
Sales & Marketing
Partner Profile in Cloudera Web | N/A | ✔ | ✔ | ✔
Forecasting | N/A | Quarterly | Monthly | Weekly
Cloudera Focused Marketing Initiatives | N/A | 1 per Quarter | 1 per Quarter | 1 per Month
Defined Business Plan | N/A | N/A | Annually | Quarterly
Cloudera Integrated Solutions | N/A | N/A | Min 1 Cloudera Solution | Min 3 Cloudera Solutions
Business Commitment
Min. Qualified Pipelines | N/A | US$500k | US$2.0m | US$4.0m
ACV Sales Quota | N/A | US$250k | US$1.0m | US$2.0m
Relationship Management
Executive Level Sponsorship | N/A | ✔ | ✔ | ✔
24. 24
How to Become Cloudera’s Authorized Partner?
Two-step process to join Cloudera Connect online:
• 1. Register at http://www.cloudera.com/content/support/en/membership-application.html
• 2. Fill out and submit the partner program application and accept our partnership Terms and Conditions: http://www.cloudera.com/content/cloudera/en/partners/partner-program.html
25. 25
How to Become Cloudera’s Authorized Partner?
• Once your application is approved by Cloudera, you will receive an email with links to the Cloudera Connect partner portal. The portal contains marketing, sales and training resources you may use to start working with us.
• Enjoy 20% off list price for any public training course
• Submit a deal registration for joint sales activities when you find a sales opportunity (subject to approval) at https://www.cloudera.com/content/partners/en/company-profile/deal-registration.html
26. 26
Start Your Journey with FREE Online Training Resources @ Cloudera Partner Portal
• Hadoop Essentials Training
• Cloudera Manager Training
• Hadoop Tutorials
• The Hadoop Ecosystem
• Resources for Administrators
• Resources for Developers
• Resources for Data Analysts
• Attend hands-on ILT courses at your nearest Cloudera Authorized Training Center
• Complete Cloudera Certification at your nearest Pearson VUE Test Center: http://www.pearsonvue.com/cloudera/activity
27. 27
Useful Partner Resources
• Partner List on Cloudera.com (external resource): http://www.cloudera.com/content/cloudera/en/partners/partners-listing.html
• Questions? Email partner@cloudera.com