Hadoop 101 v2
John Berns
Given at IoT Asia 2014
1.
Hadoop 101: A really quick overview of the concepts…
2.
A few terabytes of data...
3.
4.
5.
Text processing: a few hours?
6.
But what if you have more data?
7.
Network storage: petabytes!
8.
Network storage: petabytes!
9.
What if you need compute power for complex algorithms?
10.
8 cores? 16 cores? 64 cores? 512 GB RAM?
11.
A network of commodity computers
12.
Run jobs on PART of the data on each computer, then AGGREGATE the intermediate results from each computer.
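
The split-then-aggregate idea on slide 12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word-count job, the sample lines, and the function names `split_into_parts`, `count_words`, and `aggregate` are hypothetical, and the "computers" are simulated by a plain loop in one process.

```python
from collections import Counter


def split_into_parts(lines, num_workers):
    """Deal the input lines out into num_workers roughly equal parts."""
    return [lines[i::num_workers] for i in range(num_workers)]


def count_words(part):
    """The job one (simulated) computer runs: count words in its part only."""
    counts = Counter()
    for line in part:
        counts.update(line.split())
    return counts


def aggregate(partials):
    """Merge the intermediate results coming back from every computer."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical sample input, split across two simulated computers.
lines = ["big data", "big cluster", "data node", "name node"]
partials = [count_words(part) for part in split_into_parts(lines, 2)]
totals = aggregate(partials)
```

Each call to `count_words` sees only its own part, so the calls could run on different machines; only the small intermediate counters need to travel to the aggregation step.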
13.
Let’s add a computer to manage the process of delegating jobs, merging the results, and keeping track of the results...
14.
We also need something to keep track of which files are where, so we know where the data that needs to be processed is...
15.
When you have a lot of computers, and even more hard drives, one thing I can guarantee...
16.
Computers will eventually fail.
17.
Computers will eventually fail.
18.
Hard drives will eventually fail.
19.
Hard drives will eventually fail.
20.
Hard drives will eventually fail.
21.
Hard drives will eventually fail.
22.
Even whole racks will fail.
23.
If a computer fails and you only have one copy of your data...
24.
You will be very, very unhappy.
25.
So let’s store multiple copies of the data. Hard drives are CHEAP!
26.
So let’s store multiple copies of the data. Hard drives are CHEAP!
27.
So let’s store multiple copies of the data. Hard drives are CHEAP!
28.
So let’s store multiple copies of the data. Hard drives are CHEAP!
29.
If one hard drive fails... we are still OK
30.
If one computer fails... we are still OK
31.
Even if a whole rack fails... we are still OK
32.
Once we find a failure, let’s have the system re-copy the lost copies.
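
A minimal sketch of the replication idea from slides 25 to 32 follows. It assumes three copies per block (the number slide 39 mentions) and a deliberately simplified placement rule that puts one copy per rack when it can; this is not HDFS's actual placement policy. The node names, rack names, and the functions `place_block` and `handle_failures` are hypothetical.

```python
REPLICAS = 3  # assumed copy count; slide 39 mentions three copies


def place_block(nodes, replicas=REPLICAS):
    """Choose nodes for one block, spreading copies across racks first.

    nodes maps a node name to the rack it sits in.
    """
    chosen, used_racks = [], set()
    # First pass: at most one copy per rack.
    for node, rack in sorted(nodes.items()):
        if len(chosen) < replicas and rack not in used_racks:
            chosen.append(node)
            used_racks.add(rack)
    # Second pass: fill any remaining copies from the other nodes.
    for node in sorted(nodes):
        if len(chosen) < replicas and node not in chosen:
            chosen.append(node)
    return chosen


def handle_failures(block_map, nodes, failed, replicas=REPLICAS):
    """Forget failed nodes, then re-copy every under-replicated block."""
    alive = {n: r for n, r in nodes.items() if n not in failed}
    for block, holders in block_map.items():
        survivors = [n for n in holders if n in alive]
        spare = [n for n in sorted(alive) if n not in survivors]
        while len(survivors) < replicas and spare:
            survivors.append(spare.pop(0))
        block_map[block] = survivors
    return block_map


# Hypothetical cluster: five nodes in three racks.
nodes = {"n1": "rackA", "n2": "rackA", "n3": "rackB", "n4": "rackB", "n5": "rackC"}
placement = place_block(nodes)
block_map = {"blk-1": list(placement)}

# The whole of rackA fails; the block is copied onto a surviving node.
repaired = handle_failures(block_map, nodes, failed={"n1", "n2"})
```

Because the copies start out on three different racks, losing all of `rackA` still leaves two copies, and the repair step brings the count back to three.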
33.
Send the compute job to all nodes.
34.
And let it run on its part of the data…
35.
And let it run on its part of the data…
36.
And let it run on its part of the data…
37.
And let it run on its part of the data…
38.
One is stuck…
39.
We have three copies, so we can redistribute the compute
40.
And take the one that finishes fastest
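
Slides 38 to 40 describe what Hadoop calls speculative execution: run the same task against another copy of the data and keep whichever attempt finishes first. The sketch below imitates that with threads in one process. The node names, the artificial delays, and the functions `run_task` and `run_speculatively` are assumptions made for illustration, not Hadoop APIs.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task(node, delay, part):
    """Simulate one node working on its copy of the data."""
    time.sleep(delay)  # stands in for a slow or stuck machine
    return node, sum(part)


def run_speculatively(part, attempts):
    """Start the same task on several nodes; keep the first result."""
    with ThreadPoolExecutor(max_workers=len(attempts)) as pool:
        futures = [
            pool.submit(run_task, node, delay, part) for node, delay in attempts
        ]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()


# node-a is "stuck" (slow); node-b holds another copy and answers quickly.
winner, result = run_speculatively([1, 2, 3], [("node-a", 0.5), ("node-b", 0.01)])
```

Both attempts compute the same answer, so it does not matter which one wins. In this sketch the slow attempt is simply left to finish; a real scheduler would kill it.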
41.
Merge sorted sets based on some key… (A-E, F-J, K-O, P-T, U-Z)
42.
…and write partial results (PART-01, PART-02, PART-03, PART-04, PART-05)
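
The merge-and-write step on slides 41 and 42 can be sketched as follows, using the five key ranges and the PART names shown on the slides. The sample data and the functions `partition_for` and `merge_and_partition` are hypothetical, and the sketch keeps the parts in memory rather than writing files. Mapping keys to parts by first letter is the slide's illustration; it is not claimed to be how Hadoop partitions keys by default.

```python
import heapq

# Key ranges from slide 41, one per output part.
RANGES = [("A", "E"), ("F", "J"), ("K", "O"), ("P", "T"), ("U", "Z")]


def partition_for(key):
    """Return the index of the range that the key's first letter falls in."""
    first = key[0].upper()
    for index, (low, high) in enumerate(RANGES):
        if low <= first <= high:
            return index
    raise ValueError(f"no range for key {key!r}")


def merge_and_partition(sorted_runs):
    """Merge already-sorted runs and route each pair to its part."""
    parts = {f"PART-{i + 1:02d}": [] for i in range(len(RANGES))}
    for key, value in heapq.merge(*sorted_runs):
        parts[f"PART-{partition_for(key) + 1:02d}"].append((key, value))
    return parts


# Two sorted runs, as if produced by two different computers.
runs = [[("apple", 1), ("kiwi", 2)], [("fig", 3), ("zebra", 4)]]
parts = merge_and_partition(runs)
```

`heapq.merge` consumes the runs lazily and yields pairs in sorted order, so each part ends up sorted without re-sorting the whole data set.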
43.
Guess what? We’ve just invented Hadoop!
44.
So let’s talk about the pieces of Hadoop.
45.
Data nodes store and manage the data on a single “slave” computer
46.
Task trackers manage the compute
47.
Job tracker manages task trackers, ships code to compute nodes
48.
Name node manages distribution and replication on the data nodes
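
To show how these four pieces fit together, here is a toy scheduler: the name node's knowledge is reduced to a map from block to the data nodes holding a copy, and the job tracker's role is reduced to sending each task to the least-loaded node that already has the block. The block and node names and the function `assign_tasks` are assumptions for illustration; real job tracker scheduling is considerably more involved.

```python
def assign_tasks(block_locations):
    """Send each block's task to the least-loaded node holding a copy.

    block_locations stands in for what the name node knows:
    block id -> list of data nodes that store that block.
    """
    load = {}        # tasks assigned so far per node
    assignment = {}  # block id -> node whose task tracker runs the task
    for block, holders in sorted(block_locations.items()):
        # Ties on load are broken by node name to keep the result stable.
        node = min(holders, key=lambda n: (load.get(n, 0), n))
        assignment[block] = node
        load[node] = load.get(node, 0) + 1
    return assignment


# Hypothetical layout: three blocks, three copies each.
block_locations = {
    "blk-1": ["n1", "n2", "n3"],
    "blk-2": ["n1", "n2", "n4"],
    "blk-3": ["n1", "n3", "n4"],
}
assignment = assign_tasks(block_locations)
```

Every task lands on a node that already stores its block, so the code travels to the data rather than the data to the code.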
49.
MapReduce: Task Tracker, Job Tracker
50.
HDFS (Hadoop Distributed File System): Data Node, Name Node
51.
HDFS
52.
Visual Example
53.
Map
54.
Shuffle
55.
Reduce
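
The three phases named on slides 53 to 55 can be illustrated with a word count run in a single process. This is a sketch of the programming model only, not of Hadoop's Java API; the sample lines and the function names `map_phase`, `shuffle_phase`, and `reduce_phase` are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (key, value) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


# Hypothetical input lines.
lines = ["Hadoop stores data", "Hadoop processes data"]
mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle_phase(mapped)
word_counts = dict(reduce_phase(key, values) for key, values in grouped.items())
```

Map and reduce are the parts a user writes; the shuffle in between is what the framework does to bring every value for a given key to the same reducer.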
56.
Putting It All Together