Webinar: The Future of Hadoop

With a community of over 500 contributors, Apache Hadoop and related projects are evolving at an ever-increasing rate. Join the co-creator of Apache Hadoop, Doug Cutting, and Cloudera’s Chief Scientist, Jeff Hammerbacher, for a discussion of the most exciting new features being developed by the Apache Hadoop community.

Published in: Technology

Webinar: The Future of Hadoop

  1. The Future of Hadoop
     Doug Cutting | A Founder of Apache Hadoop
     Jeff Hammerbacher | Chief Scientist, Cloudera
     Welcome to the webinar!
     Audio/Telephone: +1 (215) 383-1016
     Access Code: 421-634-457
     Audio Pin: Shown after joining the Webinar
     Hadoop, HBase, Pig, Hive, Bigtop, Avro, Flume & Whirr are trademarks of the Apache Software Foundation
  2. Housekeeping
     ▪ All lines are on mute
     ▪ Ask questions at any time using the Questions panel on GoToMeeting
     ▪ Slides and recording will be available on www.cloudera.com/events
     ©2011 Cloudera, Inc. All Rights Reserved.
  3. Presentation Outline
     ▪ 1. Context
     ▪ 2. Apache Bigtop
     ▪ 3. Apache Hadoop Core
     ▪ 4. Apache HBase, Hive, and Pig
     ▪ 5. Other components
     ▪ Questions and Discussion
  4. 1. Context
  5. Context: Data
     ▪ 1.8 ZB will be created and replicated in 2011
       ▪ Up 9x in the last five years
       ▪ More than 90% of this data is unstructured
       ▪ Enterprises have some liability for 80% of this data
       ▪ Enterprises will spend $4T on managing data in 2011
       ▪ Source: IDC Digital Universe Report 2011
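As a rough reading of the figures on this slide, the 9x increase over five years can be turned into an implied yearly rate. The sketch below assumes growth compounds at a constant annual rate, which the slide does not state; the variable names are illustrative only.

```python
# Figures taken from the slide (IDC Digital Universe Report 2011).
volume_2011_zb = 1.8   # zettabytes created and replicated in 2011
growth_factor = 9.0    # "up 9x in the last five years"
years = 5

# Assumption: growth compounds at a constant annual rate.
annual_multiplier = growth_factor ** (1 / years)
annual_growth_pct = (annual_multiplier - 1) * 100

# Implied volume five years earlier under the same reading of "9x".
volume_five_years_ago_zb = volume_2011_zb / growth_factor

print(f"Implied annual growth: {annual_growth_pct:.1f}%")
print(f"Implied volume five years earlier: {volume_five_years_ago_zb:.2f} ZB")
```

Under that assumption the data volume grows by a bit more than half each year, starting from about 0.2 ZB.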
  6. Context: Hadoop
     ▪ Apache Hadoop and related software are designed for this world
     ▪ Volume
       ▪ Commodity hardware and open source software lower cost and increase capacity
     ▪ Velocity
       ▪ Data ingest speed aided by append-only and schema-on-read design
     ▪ Variety
       ▪ Multiple tools to structure, process, and access data
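The "append-only and schema-on-read" point can be illustrated with a small sketch. This is not Hadoop code: `raw_log`, `ingest`, and `read_with_schema` are hypothetical names, and an in-memory list stands in for an append-only file. The idea shown is that ingest never validates or transforms, while each reader applies its own schema when it reads.

```python
import json

# Stands in for an append-only file: records are only ever appended.
raw_log = []

def ingest(line: str) -> None:
    """Append a raw record as-is; no schema is enforced at write time."""
    raw_log.append(line)

def read_with_schema(schema: dict):
    """Yield records projected onto `schema` (field name -> type converter)."""
    for line in raw_log:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed lines are skipped at read time, not rejected at ingest
        if not isinstance(record, dict):
            continue
        row = {}
        for field, convert in schema.items():
            value = record.get(field)
            row[field] = convert(value) if value is not None else None
        yield row

ingest('{"user": "alice", "bytes": "512"}')
ingest('{"user": "bob", "bytes": "2048", "agent": "curl"}')
ingest('not json at all')

# This reader only cares about two fields and wants `bytes` as an integer.
rows = list(read_with_schema({"user": str, "bytes": int}))
print(rows)
```

Because parsing is deferred, ingest stays cheap, and a different reader could apply a different schema to the same raw data.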
  7. Context: Hadoop
  8. Context: HDFS and MapReduce
     ▪ Apache Hadoop = HDFS + MapReduce
       ▪ Similar to kernel of an operating system
       ▪ Referred to as “Hadoop Core”
     ▪ Related components are often deployed with Hadoop
       ▪ For example: HBase, Hive, Pig, Oozie, Flume, Sqoop
       ▪ Together, these components form a “Hadoop Stack”
       ▪ Not all components must be deployed
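For readers new to the MapReduce half of "HDFS + MapReduce", the programming model can be sketched in a few lines. This is a single-process, in-memory illustration and not the Hadoop API; the function names are assumptions chosen for this example.

```python
from collections import defaultdict

def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values seen for one key."""
    return key, sum(values)

def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

counts = word_count(["Hadoop stores data", "Hadoop processes data"])
print(counts)
```

In a real cluster the map and reduce calls run on many machines and the shuffle moves data between them; the sketch only shows the shape of the computation.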
  9. Context: Bigtop
     ▪ What standards should all components follow?
     ▪ How can we ensure all components of the stack work together?
     ▪ How can we find the right version of each component?
     ▪ How can we make it easy to install an additional component?
  10. 2. Apache Bigtop
  11. Apache Bigtop
      ▪ Now incubating at Apache
      ▪ Hadoop ecosystem-wide project, including:
        ▪ Interoperability testing of components
        ▪ Packaging of compatible versions of components
      ▪ Like a Fedora, Debian or CentOS for Hadoop ecosystem
      ▪ Releases are not a single artifact
        ▪ Rather a set of interdependent, compatible components
  12. Apache Bigtop
      ▪ Current components
        ▪ Hadoop
        ▪ HBase
        ▪ Hive
        ▪ Pig
        ▪ Oozie
        ▪ Sqoop
        ▪ Flume
        ▪ ZooKeeper
        ▪ Whirr
  13. Apache Bigtop
      ▪ Outputs
        ▪ Source
        ▪ RPM
        ▪ Deb
      ▪ Tests
        ▪ Integration
        ▪ Package
        ▪ Smoke
      ▪ Release 0.1.0 under vote now!
  14. 3. Apache Hadoop Core
  15. Apache Hadoop Core
      ▪ Current stable releases based on branches from 0.20
      ▪ Upcoming release: 0.22
        ▪ Includes both security and new implementation of append
        ▪ Not expected to be run at scale or commercially supported
        ▪ Nearly ready for vote
      ▪ Upcoming release: 0.23
        ▪ Build and dependency management moved to Maven
        ▪ Branch to happen soon
  16. HDFS
      ▪ Robustness
        ▪ HDFS-1073: Checkpointing of image and edits log
      ▪ Availability
        ▪ HDFS-1623: High availability
      ▪ Performance
        ▪ HDFS-941: Faster random reads
        ▪ HDFS-2080: Faster checksums
  17. HDFS
      ▪ Scalability
        ▪ HDFS-1052: Federation of the NameNode
      ▪ Source of diagram: http://www.hortonworks.com/an-introduction-to-hdfs-federation/
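Federation splits the namespace across several NameNodes, so something has to decide which NameNode owns a given path. The slide does not say how that routing works; the sketch below assumes a client-side mount table with longest-prefix matching, and the table contents and NameNode names are hypothetical.

```python
# Hypothetical mount table: path prefix -> NameNode that owns that part of the namespace.
MOUNT_TABLE = {
    "/user": "namenode-a",
    "/data": "namenode-b",
    "/data/logs": "namenode-c",
}

def resolve(path: str) -> str:
    """Return the NameNode for `path` using longest-prefix match on whole path components."""
    best = None
    for prefix in MOUNT_TABLE:
        # Match "/data" against "/data" and "/data/x", but not against "/database".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(f"no NameNode is mounted for {path}")
    return MOUNT_TABLE[best]

print(resolve("/user/alice/report.csv"))
print(resolve("/data/logs/2011/08"))
```

Each NameNode then only has to hold the metadata for its own subtree, which is where the scalability benefit comes from.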
  18. MapReduce
      ▪ Modularity
        ▪ MAPREDUCE-279: MapReduce 2.0
          ▪ Break JobTracker into ResourceManager and ApplicationMaster
          ▪ Replace TaskTracker with NodeManager
      ▪ Source of diagram: http://www.odbms.org/download/dean-keynote-ladis2009.pdf
  19. MapReduce
      ▪ Potential New Frameworks
        ▪ MAPREDUCE-2719: Distributed shell
        ▪ MAPREDUCE-2720: Distributed Java commands
        ▪ MPI: Communication-intensive parallelism
        ▪ Fast scans and aggregations
          ▪ OpenDremel
        ▪ Bulk Synchronous Parallel
          ▪ Giraph, Golden Orb, Hama, et al.
        ▪ Actor Model (streaming)
          ▪ S4, Akka, Storm, et al.
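Of the frameworks listed, Bulk Synchronous Parallel is probably the least familiar model, so here is a minimal sketch of it. It is not taken from Giraph, Golden Orb, or Hama; `bsp_max` is a hypothetical single-process function that propagates the largest value through a graph. Each pass of the loop is one superstep: vertices read only the messages sent in the previous superstep, update local state, send new messages, and then wait at a barrier.

```python
def bsp_max(graph: dict, values: dict) -> dict:
    """Propagate the maximum value to every vertex of a connected graph.

    graph:  vertex -> list of neighbouring vertices
    values: vertex -> initial value
    """
    values = dict(values)

    # Superstep 0: every vertex sends its own value to its neighbours.
    inbox = {v: [] for v in graph}
    for v, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[v])

    # Later supersteps: run until no vertex receives any message.
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for v, messages in inbox.items():
            # A vertex sees only messages from the previous superstep.
            if messages and max(messages) > values[v]:
                values[v] = max(messages)
                for n in graph[v]:
                    outbox[n].append(values[v])
        # Barrier: messages sent now become visible in the next superstep.
        inbox = outbox
    return values

chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
result = bsp_max(chain, {"a": 3, "b": 6, "c": 2, "d": 1})
print(result)
```

A vertex that learns nothing new sends nothing, so the computation stops by itself once the maximum has reached every vertex.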
  20. 4. HBase, Hive, and Pig
  21. Apache HBase
      ▪ Upcoming release: 0.92.0
      ▪ Server-side triggers
        ▪ HBASE-2000: Coprocessors
      ▪ Availability
        ▪ HBASE-1730/4213: Online schema changes
      ▪ Performance
        ▪ HBASE-3857: HFile 2.0
      ▪ HBase book in September!
  22. Apache Hive
      ▪ Upcoming release: 0.8
      ▪ Data transfer
        ▪ HIVE-306: INSERT INTO
        ▪ HIVE-1918: EXPORT/IMPORT
      ▪ Indexes
        ▪ HIVE-1644: Automatically use indexes
        ▪ HIVE-1803: Bitmap indexes
      ▪ Data formats
        ▪ HIVE-895: Avro support
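The bitmap index item may benefit from a small illustration of the general technique. This is not Hive's implementation of HIVE-1803; the helper names and sample rows are made up for the example, and Python integers are used as bitsets.

```python
from collections import defaultdict

def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int whose bit i is set when row i has that value."""
    index = defaultdict(int)
    for i, row in enumerate(rows):
        index[row[column]] |= 1 << i
    return dict(index)

def matching_rows(bitmap):
    """Return the row positions whose bits are set in `bitmap`."""
    positions = []
    i = 0
    while bitmap:
        if bitmap & 1:
            positions.append(i)
        bitmap >>= 1
        i += 1
    return positions

# Made-up sample data.
rows = [
    {"country": "US", "status": "active"},
    {"country": "FR", "status": "active"},
    {"country": "US", "status": "closed"},
    {"country": "US", "status": "active"},
]
by_country = build_bitmap_index(rows, "country")
by_status = build_bitmap_index(rows, "status")

# WHERE country = 'US' AND status = 'active' becomes a bitwise AND of two bitmaps.
hits = matching_rows(by_country["US"] & by_status["active"])
print(hits)
```

Bitmap indexes tend to suit columns with few distinct values, because each value needs its own bitmap and combining predicates is a cheap bitwise operation.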
  23. Apache Pig
      ▪ Recent release: 0.9
      ▪ Scripting
        ▪ PIG-1479: Embedding Pig in Python
        ▪ PIG-1793: Macro expansion
      ▪ Debugging
        ▪ PIG-1712: ILLUSTRATE rework
      ▪ Data formats
        ▪ PIG-1748: Avro support
  24. 5. Other Components
  25. Other Components
      ▪ Apache Incubator
        ▪ Sqoop, Flume, and Oozie now incubating
        ▪ Whirr graduated to a top-level Apache project
      ▪ Apache Avro
        ▪ Interoperability with Protocol Buffers and Thrift
        ▪ Column-oriented file format
        ▪ Python MapReduce implementation
      ▪ Apache ZooKeeper
        ▪ Multi-update
        ▪ Kerberos authentication of clients
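The ZooKeeper "multi-update" item refers to applying several changes as one all-or-nothing batch. The sketch below shows that idea on a plain dictionary; it is not the ZooKeeper client API, and `multi`, `MultiError`, and the operation tuples are hypothetical names for this illustration.

```python
class MultiError(Exception):
    """Raised when any operation in a batch fails; the store is left unchanged."""

def multi(store: dict, ops: list) -> None:
    """Apply every operation in `ops` to `store`, or none of them.

    `store` maps a path to a (data, version) tuple. Each op is a tuple:
    ("create", path, data), ("set", path, data), ("delete", path),
    or ("check", path, expected_version).
    """
    staged = dict(store)  # work on a copy; the values are immutable tuples
    for op in ops:
        kind, path = op[0], op[1]
        if kind == "create":
            if path in staged:
                raise MultiError(f"{path} already exists")
            staged[path] = (op[2], 0)
        elif kind == "set":
            if path not in staged:
                raise MultiError(f"{path} does not exist")
            staged[path] = (op[2], staged[path][1] + 1)
        elif kind == "delete":
            if path not in staged:
                raise MultiError(f"{path} does not exist")
            del staged[path]
        elif kind == "check":
            if path not in staged or staged[path][1] != op[2]:
                raise MultiError(f"version check failed for {path}")
        else:
            raise MultiError(f"unknown operation {kind!r}")
    # Commit: only reached when every operation succeeded.
    store.clear()
    store.update(staged)

store = {"/config": ("v1", 0)}
multi(store, [("check", "/config", 0), ("set", "/config", "v2"), ("create", "/lock", "owner")])
print(store)
```

If any operation fails, including a version check, the exception is raised before the commit step, so readers never observe a half-applied batch.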
  26. Q&A
      Visit www.hadoopworld.com
      • November 8-9, 2011 in New York City
      • Early bird discount ends September 5, 2011
      Enter Today: www.facebook.com/cloudera
      • Click the “Be a Cloudera Hero for Apache Hadoop” tab
      • Share what you think Apache Hadoop can do for you
      • Win a personal hackathon with Doug Cutting in San Francisco, CA