Hortonworks Presentation at Big Data London
- 2. Hortonworks
• Who is Hortonworks
• Our Approach
• Customer Use Cases
Page 2
© Hortonworks Inc. 2013
- 3. Housekeeping Items
• Restrooms on 2nd and 4th Floors
• Hadoop Summit
– March 20-21 in Amsterdam
– PreConference Training on March 18-19
– Discount Code Amst13Spon20
• Download SandBox
– QR Code at postcode on table
- 4. A Brief History of Apache Hadoop
Timeline (2004 to 2013): Apache project established → Yahoo! begins to operate at scale → Hortonworks Data Platform → Enterprise Hadoop
• 2005: Yahoo! creates team under E14 to work on Hadoop (focus on INNOVATION)
• 2008: Yahoo! team extends focus to operations to support multiple projects & growing clusters (focus on OPERATIONS)
• 2011: Hortonworks created to focus on “Enterprise Hadoop”. Starts with 24 key Hadoop engineers from Yahoo! (STABILITY)
- 5. Hortonworks Snapshot
We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution
Headquarters: Palo Alto, CA
Employees: 180+ and growing
Investors: Benchmark, Index, Yahoo
Develop
• We employ the core architects, builders and operators of Apache Hadoop
• We drive innovation within Apache Software Foundation projects
Distribute
• We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform
• We engineer, test & certify HDP for enterprise usage
Support
• We are uniquely positioned to deliver the highest quality of Hadoop support
• We enable the ecosystem to work better with Hadoop
Endorsed by Strategic Partners
- 6. Hortonworks
• Who is Hortonworks
• Our approach
– Leading Open Source Hadoop innovation
– Addressing “Enterprise Hadoop” Requirements
– Enabling Interoperability of the Ecosystem
– Ensuring No Lock-In: 100% Open Source
• Patterns of Use
- 7. Apache Community Leadership
Diagram: Apache Hadoop, Apache Pig, Apache Hive, Apache HBase, Apache HCatalog, Apache Ambari and other Apache projects, in a cycle of Design & Develop, Test & Patch, Release
Apache Software Foundation Guiding Principles
• Release early & often
• Transparency, respect, meritocracy
Key Roles held by Hortonworkers
• VP & PMC Members
– Arun Murthy (Hadoop), Daniel Dai (Pig), Mahadev Konar (ZooKeeper)
• Release Managers
– Matt Foley (Hadoop 1.x), Arun Murthy (Hadoop 2.x), Ashutosh Chauhan (Hive), Daniel Dai (Pig), Alan Gates (HCatalog), Mahadev Konar (Ambari)
• Committers
– 54 across all Hadoop-related projects
“We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog..” (Jeff Kelly, Wikibon)
- 8. Leadership that Starts at the Core
• Driving next generation Hadoop
– YARN, MapReduce2, HDFS2, High Availability, Disaster Recovery
• 420k+ lines authored since 2006
– More than twice nearest contributor
• Deeply integrating w/ecosystem
– Enabling new deployment platforms
– (ex. Windows & Azure, Linux & VMware HA)
– Creating deeply engineered solutions
– (ex. Teradata big data appliance)
• All Apache, NO holdbacks
– 100% of code contributed to Apache
- 9. Driving Enterprise Hadoop Innovation
Committers by project and company
Project        Hortonworks   Cloudera
Hadoop Core    19            9
Pig            5             1
Hive           1             0
HCatalog       5             0
HBase          3             7
Ambari         14            0
Chart: Lines Of Code By Company, shown as shares from 0% to 100% (legend: Hortonworks, Yahoo!, Cloudera, Other)
Source: Apache Software Foundation
- 10. Hortonworks Process for Enterprise Hadoop
Upstream Community Projects → Downstream Enterprise Product
Virtuous cycle when development & fixed issues done upstream & stable project releases flow downstream
• Upstream (Apache Hadoop, Pig, Hive, HBase, HCatalog, Ambari, other Apache projects): Design & Develop, Test & Patch, Release
• Downstream (Hortonworks Data Platform): Integrate & Test, Design & Develop, Package & Certify, Distribute
• Flows between the two: Stable Project Releases go downstream; Fixed Issues go back upstream
No Lock-in: Integrated, tested & certified distribution lowers risk by ensuring close alignment with Apache projects
- 11. Hortonworks
• Who is Hortonworks
• Our approach
– Leading Open Source Hadoop Innovation
– Addressing “Enterprise Hadoop” Requirements
– Enabling Interoperability of the Ecosystem
– Ensuring NO LOCK-IN: 100% Open Source
• Patterns of use
- 12. Enhancing the Core of Apache Hadoop
Deliver high-scale storage & processing with enterprise-ready platform services
Diagram: HADOOP CORE (Distributed Storage & Processing), PLATFORM SERVICES (Enterprise Readiness)
Unique Focus Areas:
• Bigger, faster, more flexible
– Continued focus on speed & scale and enabling near-real-time apps
• Tested & certified at scale
– Run ~1300 system tests on large Yahoo clusters for every release
• Enterprise-ready services
– High availability, disaster recovery, snapshots, security, …
Hortonworkers are the architects, operators, and builders of core Hadoop
- 13. Data Services for Full Data Lifecycle
Provide data services to store, process & access data in many ways
Diagram: DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing), PLATFORM SERVICES (Enterprise Readiness)
Unique Focus Areas:
• Apache HCatalog
– Metadata services for consistent table access to Hadoop data
• Apache Hive
– Explore & process Hadoop data via SQL & ODBC-compliant BI tools
Hortonworks enables Hadoop data to be accessed via existing tools & systems
- 14. Operational Services for Ease of Use
Include complete operational services for productive operations & management
Diagram: OPERATIONAL SERVICES (Manage & Operate at Scale), DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing), PLATFORM SERVICES (Enterprise Readiness)
Unique Focus Area:
• Apache Ambari:
– Provision, manage & monitor a cluster
– Complete REST APIs to integrate with existing operational tools
– Job & task visualizer to diagnose issues
Only Hortonworks provides a complete open source Hadoop management tool
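As a rough illustration of the Ambari REST integration mentioned on the Operational Services slide, the sketch below builds an authenticated request that an existing operational tool might issue to list a cluster's services. It only constructs the request and does not contact a server. The host name, port, cluster name `demo` and credentials are placeholders that do not come from the slides, and the `/api/v1/clusters/<name>/services` path should be checked against the Ambari version in use.

```python
import base64
import urllib.request

# Assumptions (not from the slides): host, port, cluster name and credentials
# are placeholders; the /api/v1 prefix follows Ambari's REST convention.
AMBARI_BASE = "http://ambari.example.com:8080/api/v1"


def build_ambari_request(path, user, password, base=AMBARI_BASE):
    """Build (but do not send) a GET request with HTTP Basic authentication."""
    url = base.rstrip("/") + "/" + path.lstrip("/")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", "Basic " + token)
    return req


if __name__ == "__main__":
    # Hypothetical cluster name "demo"; replace with a real one.
    req = build_ambari_request("clusters/demo/services", "admin", "admin")
    print(req.get_method(), req.full_url)
    # To query a live server, pass `req` to urllib.request.urlopen and
    # decode the JSON response body with json.loads.
```

Keeping request construction separate from sending makes it easy to plug the same helper into whatever monitoring or scheduling tool already handles HTTP calls.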
- 15. Deployable Across a Range of Options
Only Hortonworks allows you to deploy seamlessly across any deployment option
• Linux & Windows
• Azure, Rackspace & other clouds
• Virtual platforms
• Big data appliances
Diagram: HORTONWORKS DATA PLATFORM (HDP), made up of OPERATIONAL SERVICES (Manage & Operate at Scale), DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing) and PLATFORM SERVICES (Enterprise Readiness), running on OS, Cloud, VM or Appliance
- 16. HDP: Enterprise Hadoop Distribution
Hortonworks Data Platform (HDP): Enterprise Hadoop
• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability
Diagram: HORTONWORKS DATA PLATFORM (HDP), made up of OPERATIONAL SERVICES (Manage & Operate at Scale), DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing) and PLATFORM SERVICES (Enterprise Readiness), running on OS, Cloud, VM or Appliance
- 17. HDP 1.2: Data Services Improvements
Hortonworks Data Platform (HDP): Enterprise Hadoop
• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability
Diagram: HORTONWORKS DATA PLATFORM (HDP) components
• OPERATIONAL SERVICES and DATA SERVICES: AMBARI, FLUME, PIG, HIVE, HBASE, OOZIE, SQOOP, HCATALOG
• HADOOP CORE: WEBHDFS, MAPREDUCE, HDFS, YARN (in 2.0)
• PLATFORM SERVICES (Enterprise Readiness): High Availability, Disaster Recovery, Snapshots, Security, etc…
• Runs on: OS, Cloud, VM, Appliance
- 18. Latest Hortonworks Announcements
Two releases in January 2013
• January 15: Hortonworks Data Platform 1.2
– Hortonworks Brings Enterprise Manageability to 100% Open Source Apache Hadoop Distribution
• January 22: Hortonworks Sandbox
– Hortonworks accelerates Hadoop skills development with an easy-to-use, flexible and extensible platform to learn, evaluate and use Apache Hadoop
- 19. Latest Hortonworks Announcements
February 2013
• February 20: Hortonworks: New Apache projects
– Hortonworks fuels open source by releasing three new projects: KNOX / TEZ / STINGER
• February 25: HDP available on Microsoft Windows
– To help Hadoop adoption, Hortonworks releases HDP on Microsoft Windows
- 20. Hortonworks
• Who is Hortonworks
• Our approach
– Leading Open Source Hadoop Innovation
– Addressing “Enterprise Hadoop” Requirements
– Enabling Interoperability of the Ecosystem
– Ensuring No Lock-in: 100% Open Source
• Patterns of use
- 21. Existing Data Architecture
Diagram layers:
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
• DEV & DATA TOOLS: Build & Test
• OPERATIONAL TOOLS: Manage & Monitor
• DATA SYSTEMS: Traditional Repos (RDBMS, EDW, MPP)
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS Systems
- 22. An Emerging Data Architecture
Diagram layers:
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
• DEV & DATA TOOLS: Build & Test
• OPERATIONAL TOOLS: Manage & Monitor
• DATA SYSTEMS: Hortonworks Data Platform; Traditional Repos (RDBMS, EDW, MPP)
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS Systems; New Sources (web logs, email, sensor data, social media); Mobile Data
- 23. Interoperating With Your Tools
Diagram layers:
• APPLICATIONS: Microsoft Applications
• DEV & DATA TOOLS
• OPERATIONAL TOOLS: Viewpoint
• DATA SYSTEMS: Hortonworks Data Platform; Traditional Repos
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS Systems; New Sources (web logs, email, sensor data, social media); Mobile Data
- 24. Hortonworks
• Who is Hortonworks
• Our approach
– Leading Open Source Hadoop Innovation
– Addressing “Enterprise Hadoop” Requirements
– Enabling Interoperability of the Ecosystem
– Ensuring No Lock-In: 100% Open Source
• Patterns of use
- 25. Hortonworks
• Who is Hortonworks
• Our approach
• Patterns of use
- 26. Operational Data Refinery
Refine / Explore / Enrich
Collect data and apply a known algorithm to it in trusted operational process
• 1 Capture
– Capture all data
• 2 Process
– Parse, cleanse, apply structure & transform
• 3 Exchange
– Push to existing data warehouse for use with existing analytic tools
Diagram layers:
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
• DATA SYSTEMS: Hortonworks Data Platform; Traditional Repos (RDBMS, EDW, MPP)
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)
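The Capture, Process and Exchange steps of the Operational Data Refinery slide can be sketched in a few lines of plain Python. This is only a toy stand-in for what would run as Hadoop jobs at scale: the log line format, the field names and the CSV hand-off are all assumptions made for illustration, not details from the presentation.

```python
import csv
import io
import re

# Hypothetical log format for illustration: "<ip> <ISO timestamp> <path> <status>"
LOG_LINE = re.compile(r"^(\S+) (\S+) (\S+) (\d{3})$")


def capture(raw_text):
    """Step 1, Capture: keep every raw line, including malformed ones."""
    return raw_text.splitlines()


def process(lines):
    """Step 2, Process: parse, cleanse, apply structure and transform."""
    records = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # cleanse: skip lines that do not fit the assumed format
        ip, timestamp, path, status = match.groups()
        records.append({
            "ip": ip,
            "day": timestamp[:10],   # transform: keep only the date part
            "path": path.lower(),    # transform: normalize case
            "status": int(status),   # structure: typed field
        })
    return records


def exchange(records):
    """Step 3, Exchange: emit CSV that a warehouse bulk loader could ingest."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["ip", "day", "path", "status"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


if __name__ == "__main__":
    raw = "10.0.0.1 2013-03-20T09:15:00 /Index.html 200\nnot a log line\n"
    print(exchange(process(capture(raw))))
```

Keeping capture separate from processing mirrors the slide's point that all data is captured first and structure is applied afterwards.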
- 27. Big Data Exploration & Visualization
Refine / Explore / Enrich
Collect data and perform iterative investigation for value
• 1 Capture
– Capture all data
• 2 Process
– Parse, cleanse, apply structure & transform
• 3 Exchange
– Explore and visualize with analytics tools supporting Hadoop
Diagram layers:
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
• DATA SYSTEMS: Hortonworks Data Platform; Traditional Repos (RDBMS, EDW, MPP)
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)
- 28. Application Enrichment
Refine / Explore / Enrich
Collect data, analyze and present salient results for online apps
• 1 Capture
– Capture all data
• 2 Process
– Parse, cleanse, apply structure & transform
• 3 Exchange
– Incorporate data directly into applications
Diagram layers:
• APPLICATIONS: Custom Applications, Enterprise Applications
• DATA SYSTEMS: Hortonworks Data Platform; Traditional Repos (RDBMS, EDW, MPP, NOSQL)
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)
- 29. Key 2013 “Enterprise Hadoop” Initiatives
Invest In:
• Platform Services
– DR, Snapshot, …
• Data Services
– In support of Refine, Explore, Enrich
• Operational Services
– Manageability, Security, …
Initiatives shown around the HORTONWORKS DATA PLATFORM (HDP) diagram:
• Tez / “Stinger”: Interactive Query
• Ambari: Manage & Operate
• HBase: Online Data
• “Gateway”: Secure Access
• “Herd”: Data Integration
• “Continuum”: Biz Continuity
- 30. Stinger: Make Hive Best for All Needs
Hive workloads by latency (chart axis: Data Size)
• Interactive (5s – 1m): Parameterized Reports, Drilldown, Visualization, Exploration
• Non-Interactive (1m – 1h): Data preparation, Incremental batch processing, Dashboards / Scorecards
• Batch (1h+): Operational batch processing, Enterprise Reports, Data Mining
Improve Latency & Throughput
• Query engine improvements
• New “Optimized RCFile” column store
• Next-gen runtime (elim’s M/R latency)
Extend Deep Analytical Ability
• Analytics functions
• Improved SQL coverage
• Continued focus on core Hive use cases
- 31. Flexible Support Subscription Programs
Leverage Hortonworks Expertise: Subscription and Support delivered and backed by Hadoop experts; subscriptions based on nodes or storage
• Developer Support: “How to” guidance for developers and archs
– 12 x 5, Web only; All Sev: 1 business day; 1 seat; Application Code Review, Design Advice
• Enterprise Support: Operations support for critical clusters
– 24 x 7, Phone & Web; Sev 1: 1 Hour, Sev 2: 4 Bus Hour; 5 Contacts; Patches & Updates; Cluster Design, Install, Maintain, Performance
Additional Options
• Standard Support: Operations support for dev & test clusters
– 12 x 5, Web only; All Sev: 1 business day; 3 Contacts; Patches & Updates; Cluster Design, Install, Maintain, Performance
• Essential Support*: Operations support for small research clusters
– 12 x 5, Web only; All Sev: 1 business day; 3 Contacts; Patches & Updates; Cluster Design, Install, Maintain, Performance
* Limited in size and no expansion
- 32. Hortonworks: Best In Class Hadoop Support
• Experienced enterprise support team
– Experience supporting enterprise clients in production
– Core engineers have real operational experience: built and supported 44+K nodes in production
– Extensive experience in commercial big data offerings including HDP, MapR, Karmasphere
• Global 24x7 operation – support based in Sunnyvale, UK & India
• Stringent case management processes ensure high quality customer service & responsiveness
- 33. Transferring Our Hadoop Expertise to You
The expert source for
Apache Hadoop training & certification
• World class training programs designed to
help you learn fast
– Role-based hands-on classes with 50% lab time
• Expert consulting services
– Programs designed to transfer knowledge
• Industry leading Hadoop Sandbox program
– Fastest way to learn Apache Hadoop
– Multi-level tutorials for wide applicability
– Customizable and updateable
- 34. Summary
• Leading the Innovation in Core Hadoop
• Addressing the requirements for Enterprise usage
• Enabling interoperability of the ecosystem
• No lock-in. 100% Open Source.
• Best in industry support with flexible pricing model
• Find out more
– www.hortonworks.com
– http://hortonworks.com/hadoop-training/