This is the keynote presentation from Hadoop founder Doug Cutting and Cloudera Sr. Director of Product Management Charles Zedlewski. It announces Cloudera's Distribution for Hadoop v3 and Cloudera Enterprise.
For more information, go to www.cloudera.com.
4. A Career…
Copyright 2010 Cloudera Inc. All rights reserved 4
5. An Ecosystem…
6. A Market…
7. An Emerging Platform for Applications…
Graph analysis, Machine learning, Scientific Archive, Security
Query & reporting, Complex ETL, Search quality, Fraud detection
Clickstream analysis, POS analysis, Trade compliance, And more…
8. Hadoop Started From Humble Beginnings…
• MapReduce and HDFS only
• Good for experienced Java programmers
• Limited application set
9. Innovation: the Secret to Hadoop Success
• Projects & components develop around Hadoop
• User base grows
• More applications are made possible
“Provide more levels of abstraction & automation for job creation”
“Provide common technical services”
“Make it easier to get data in & out”
“Cover more data movements – inserts, appends, etc.”
10. But Innovation Isn’t Free
• For every release of MapReduce and HDFS, there are >20 releases of related projects
• Every component has its own schedule, versioning, dependencies & patch requirements
• Hadoop community likes to build 2-3 of everything
[Figure: chart with axis ticks 0, 10, 20; component versions listed: HBase 0.89, HDFS 0.20, Pig 0.7, Hive 0.6, Oozie 2.0]
11. Announcing Cloudera’s Distribution for Hadoop v3
• Open source – 100% Apache licensed
• Simplified – Cloudera manages required versions & dependencies
• Integrated – all components interoperate
• Reliable – patched with fixes from future releases to improve stability
• Easy to consume – Debian, RPM, tarball, Virtual Machine, EC2, Rackspace, SoftLayer
12. What’s New in CDH v3?
• Updates to existing Hadoop frameworks
  • Pig 0.7
  • Sqoop 1.0
  • Hadoop 0.20S (planned)
• Support for 3 new related components
  • HBase – with durability
  • ZooKeeper
  • Oozie – run workflows + support for Hive & Sqoop actions
• Introducing 2 new components
  • Flume – collect streaming data with centralized configuration & guaranteed delivery
  • Hue – web UI and SDK for Hadoop web applications
14. Harnessing Hadoop Has Challenges
Skill Set – experts only
Complexity – more than ten components
Manageability – hard to configure, monitor & administer
Interoperability – limited support for DBMS & analytic tools
15. Announcing Cloudera Enterprise
• Reduces the risks of running Hadoop in production
• Improves consistency and compliance, and reduces administrative overhead
• Management tools
  • Monitoring & config for data integration
  • Authorization mgmt & provisioning
  • Resource mgmt
• Production support for CDH & certified integrations (e.g. Oracle, Vertica)
17. Some Announcements
• Party at our place
  • Hackathon on CDH3 – applications, enhancements, open source contributions
  • July 27th, 9:30am – 7:30pm
  • For invite: hackathon@cloudera.com
  • Free food & snacks
• Or stay home and read
  • Hadoop: The Definitive Guide, second edition
  • Available on October 12th at Hadoop World
18. Thank You!
• Stop by our table if you have questions!