The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool

Presented at the Hadoop Contributors Meetup, hosted by Oath.
Published in: Technology
  1. Future of Hadoop in AI World. Milind Bhandarkar, Founder & CEO, Ampool, @techmilind
  2. "The future is not what it used to be." - Yogi Berra
  3. Google Papers
  4. Yahoo! Search + =
  5. W-1-W
     • WebMap: Graph processing for WWW
     • Dreadnaught: Infrastructure for WebMap
     • W-1-W: WebMap In One Week
     • Juggernaut: Infrastructure for W-1-W
     • JFS, JMR, Condor: Abandoned for Hadoop
  6. Lucene, Nutch
  7. Kryptonite: First Hadoop Cluster at Yahoo!
  8. Hadoop Future in 2006: Hadoop will help Yahoo! win Search Engine Wars
  9. Lessons Learned
     • Multi-Tenancy from ground-up
     • Agility in lieu of Performance
     • Provisioning vs Procurement
     • “Weird” use cases as learning experience
     • Academic collaboration
  10. Hadoop Peak Hype 2011-2014
  11. Hadoop Impact on Data Economics [Chart: Big Data Platform Price/TB, 2008 to 2013; vertical axis from $0 to $80,000; series: Big Data DB, Hadoop]
  12. SQL on “Everything”
     • NoSQL = “Not Yet SQL” - Michael Stonebraker, 2010
     • Hive, Cloudera Impala, SparkSQL, Facebook Presto, Apache Drill, IBM BigSQL, Apache HAWQ, Apache Trafodion
  13. Hadoop Future in 2014: Hadoop will end EDW as we know it.
  14. Hadoop Future Disrupted: 2014
  15. Clouds, Public & Private
  16. IAAS: New Hardware
     • Public: AWS, Google Cloud, Azure
     • Private: vSphere, OpenStack
     • Easy Provisioning
     • Scalable, Elastic, Ubiquitous
     • Bundled with Data & Analytics as Services
  17. Cloud Data Fabric
     • Store massive & diverse data sets economically
     • Integrate and Ingest from legacy & disparate sources
     • Ability to rapidly analyze massive data sets
     • Control, Auditing, Manageability, Self-Service
     • Object Stores
  18. And Now, AI
  19. So, “Big” Data is Still Important in AI World. So why *NOT* Hadoop?
  20. Back to the Future 2018: What is Hadoop? Hadoop is the OSS Reference Implementation of APIs for managing distributed AI workloads and their access to large datasets.
  21. Hadoop with Compute-Storage Separation
  22. Compute
     • Containers & Orchestration
     • Deconstructing YARN
     • Resource Allocation, Scheduling, Management, Isolation
     • K8S Everywhere
     • Logically Separated from Storage
  23. Storage
     • Massively Scalable Tiers
     • Object Stores, Distributed File Systems, Persistent Volumes
     • Higher-Level Data Abstractions
     • Large, dense volatile & non-volatile memories
  24. Questions?
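
The compute and storage slides (21 to 23) describe compute as logically separated from storage. The following is a minimal Python sketch of that idea, not anything shown in the deck: the `Storage` interface, the `InMemoryStore` stand-in, and the `count_words` task are all hypothetical names chosen for illustration. The point is only that the compute function keeps no data of its own and reaches datasets through a narrow storage API, so it could in principle run in any container while the storage tier is swapped independently.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable


class Storage(ABC):
    """Hypothetical storage-tier interface; compute sees only this API."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def list_keys(self, prefix: str) -> Iterable[str]: ...


class InMemoryStore(Storage):
    """Illustrative stand-in for an object store or distributed file system."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list_keys(self, prefix: str) -> Iterable[str]:
        return sorted(k for k in self._objects if k.startswith(prefix))


def count_words(store: Storage, prefix: str) -> int:
    """Stateless compute task: all data access goes through the Storage API."""
    return sum(len(store.get(key).split()) for key in store.list_keys(prefix))


store = InMemoryStore()
store.put("logs/a.txt", b"hadoop in an ai world")
store.put("logs/b.txt", b"compute and storage")
store.put("tmp/c.txt", b"ignored by the prefix filter")
total = count_words(store, "logs/")
print(total)
```

Because `count_words` depends only on the abstract interface, replacing `InMemoryStore` with a class backed by a real object store would not require changing the compute code; that substitution is the assumption this sketch is meant to illustrate.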