Hadoop - An introduction for SQL Server DBAs

Hadoop - An introduction for SQL Server DBAs. Originally given to the Cambridge SQL Server User Group

Published in: Technology

  1. Hadoop: An introduction for SQL Server DBAs
  2. Andrew Denty, Product Manager exploring Big Data, Red Gate Ventures, @andrewdenty
  3. (1) What is Hadoop? (2) Why you should care (3) How to get started
  4. What we're not going to talk about:
       • Replacing your existing servers with Hadoop
       • How Hadoop compares to other databases
       • How to write MapReduce or Java
  5. Who has used Hadoop?
  6. What is Hadoop?
       • Open source Apache project
       • Written in Java
       • Distributed system: shares large workloads, runs on commodity servers, scales effectively
  7. Compute: MapReduce (Java-based distributed programming model) and YARN (Yet Another Resource Negotiator). Storage: HDFS (Hadoop Distributed File System).
  8. Storage is JBOD (just a bunch of disks). It's just bytes: 011010101. Scalable. Fault tolerant.
  9. Why should you care?
       • Never again throw away any data!
       • Once you've kept EVERYTHING, you can derive insights from all of that data.
  10. http://priceonomics.com/why-ups-trucks-dont-turn-left/
  11. Salary
  12. The things you can't do with SQL Server:
       • Distributed processing
       • Generating insight from vast quantities of structured and unstructured data
  13. The Hadoop journey: a sandbox, then a 2-3 node cluster, then something in production
  14. How to get started now:
       • Download and install a sandbox:
           – Hortonworks Sandbox - http://bit.ly/1gkkCte
           – Cloudera QuickStart VM - http://bit.ly/19eOwR3
           – MapR Sandbox - http://bit.ly/TWZynR
       • Fire it up and import some data with HDFS Explorer - http://bit.ly/1ivuSz5
       • Create a table
       • Run a query…
  15. To sum up…
       • Hadoop is a distributed data storage and computation engine
       • Hadoop enables you to do things which were impossible with SQL Server… (and get paid more!)
       • Get started by downloading a sandbox: it's easy!
  16. Andrew Denty, Product Manager exploring Big Data, Red Gate Ventures, @andrewdenty
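
Slide 7 names MapReduce as Hadoop's programming model but the deck deliberately does not show how to write it. For readers who want a feel for the idea, here is a minimal single-process sketch in Python, not Hadoop's Java API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are illustrative assumptions, not taken from the talk. On a real cluster the map and reduce steps would run in parallel across many nodes, and the framework would handle the grouping step.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group every emitted value under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input: each string stands in for one line of a file in HDFS.
    lines = ["Hadoop stores bytes", "Hadoop scales out"]
    print(word_count(lines))
```

Because each call to `map_phase` looks at one record and each call to `reduce_phase` looks at one key, the work can be split across machines without the pieces needing to talk to each other, which is what lets the model share large workloads.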
