SlideShare une entreprise Scribd logo
1  sur  10
Apache Hadoop MapReduce
● What is it ?
● Why use it ?
● How does it work
● Some examples
● Big users
MapReduce – What is it ?
● Processing engine of Hadoop
● Developers create Map and Reduce jobs
● Used for big data batch processing
● Parallel processing of huge data volumes
● Fault tolerant
● Scalable
MapReduce – What use it ?
● Your data in Terabyte / Petabyte range
● You have huge I/O
● Hadoop framework takes care of
– Job and task management
– Failures
– Storage
– Replication
● You just write Map and Reduce jobs
MapReduce – How does it work ?
Take word counting as an example, something that Google does
all of the time.
MapReduce – How does it work ?
● Input data split into shards
● Split data mapped to key,value pairs i.e. Bear,1
● Mapped data shuffled/sorted by key i.e. Bear
● Sorted data reduced i.e. Bear, 2
● Final data stored on HDFS
● There might be extra map layer before shuffle
● JobTracker controls all tasks in job
● TaskTracker controls map and reduce
MapReduce - Some examples
A visual example with colours to show you the cycle
Split -> Map -> Shuffle -> Reduce
MapReduce - Some examples
A visual example of MapReduce with job and task trackers added to
individual map and reduce jobs.
Hadoop MapReduce – Big users
● Users
– Facebook
– Yahoo
– Amazon
– Ebay
● Providers
– Amazon
– Cloudera
– HortonWorks
– MapR
Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You can just pay for those hours that you need
● To solve your problems
Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You can just pay for those hours that you need
● To solve your problems

Contenu connexe

Plus de Mike Frampton

An introduction to Apache Mesos
An introduction to Apache MesosAn introduction to Apache Mesos
An introduction to Apache MesosMike Frampton
 
An introduction to Pentaho
An introduction to PentahoAn introduction to Pentaho
An introduction to PentahoMike Frampton
 
An introduction to Apache Thrift
An introduction to Apache ThriftAn introduction to Apache Thrift
An introduction to Apache ThriftMike Frampton
 
An introduction to Apache Cassandra
An introduction to Apache CassandraAn introduction to Apache Cassandra
An introduction to Apache CassandraMike Frampton
 
An example Hadoop Install
An example Hadoop InstallAn example Hadoop Install
An example Hadoop InstallMike Frampton
 
An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnMike Frampton
 
An Introduction to Cloud Computing
An Introduction to Cloud ComputingAn Introduction to Cloud Computing
An Introduction to Cloud ComputingMike Frampton
 
An Introduction to Hadoop Hue Gui
An Introduction to Hadoop Hue GuiAn Introduction to Hadoop Hue Gui
An Introduction to Hadoop Hue GuiMike Frampton
 
An introduction to Apache Hadoop Hive
An introduction to Apache Hadoop HiveAn introduction to Apache Hadoop Hive
An introduction to Apache Hadoop HiveMike Frampton
 

Plus de Mike Frampton (20)

Prometheus
PrometheusPrometheus
Prometheus
 
Apache Tephra
Apache TephraApache Tephra
Apache Tephra
 
Apache Kudu
Apache KuduApache Kudu
Apache Kudu
 
Apache Bahir
Apache BahirApache Bahir
Apache Bahir
 
Apache Arrow
Apache ArrowApache Arrow
Apache Arrow
 
JanusGraph DB
JanusGraph DBJanusGraph DB
JanusGraph DB
 
Apache Ignite
Apache IgniteApache Ignite
Apache Ignite
 
Apache Samza
Apache SamzaApache Samza
Apache Samza
 
Apache Flink
Apache FlinkApache Flink
Apache Flink
 
Apache Edgent
Apache EdgentApache Edgent
Apache Edgent
 
Apache CouchDB
Apache CouchDBApache CouchDB
Apache CouchDB
 
An introduction to Apache Mesos
An introduction to Apache MesosAn introduction to Apache Mesos
An introduction to Apache Mesos
 
An introduction to Pentaho
An introduction to PentahoAn introduction to Pentaho
An introduction to Pentaho
 
An introduction to Apache Thrift
An introduction to Apache ThriftAn introduction to Apache Thrift
An introduction to Apache Thrift
 
An introduction to Apache Cassandra
An introduction to Apache CassandraAn introduction to Apache Cassandra
An introduction to Apache Cassandra
 
An example Hadoop Install
An example Hadoop InstallAn example Hadoop Install
An example Hadoop Install
 
An Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop YarnAn Introduction to Apache Hadoop Yarn
An Introduction to Apache Hadoop Yarn
 
An Introduction to Cloud Computing
An Introduction to Cloud ComputingAn Introduction to Cloud Computing
An Introduction to Cloud Computing
 
An Introduction to Hadoop Hue Gui
An Introduction to Hadoop Hue GuiAn Introduction to Hadoop Hue Gui
An Introduction to Hadoop Hue Gui
 
An introduction to Apache Hadoop Hive
An introduction to Apache Hadoop HiveAn introduction to Apache Hadoop Hive
An introduction to Apache Hadoop Hive
 

Dernier

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Dernier (20)

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

An Introduction to Apache Hadoop MapReduce

  • 1. Apache Hadoop MapReduce ● What is it ? ● Why use it ? ● How does it work ● Some examples ● Big users
  • 2. MapReduce – What is it ? ● Processing engine of Hadoop ● Developers create Map and Reduce jobs ● Used for big data batch processing ● Parallel processing of huge data volumes ● Fault tolerant ● Scalable
  • 3. MapReduce – What use it ? ● Your data in Terabyte / Petabyte range ● You have huge I/O ● Hadoop framework takes care of – Job and task management – Failures – Storage – Replication ● You just write Map and Reduce jobs
  • 4. MapReduce – How does it work ? Take word counting as an example, something that Google does all of the time.
  • 5. MapReduce – How does it work ? ● Input data split into shards ● Split data mapped to key,value pairs i.e. Bear,1 ● Mapped data shuffled/sorted by key i.e. Bear ● Sorted data reduced i.e. Bear, 2 ● Final data stored on HDFS ● There might be extra map layer before shuffle ● JobTracker controls all tasks in job ● TaskTracker controls map and reduce
  • 6. MapReduce - Some examples A visual example with colours to show you the cycle Split -> Map -> Shuffle -> Reduce
  • 7. MapReduce - Some examples A visual example of MapReduce with job and task trackers added to individual map and reduce jobs.
  • 8. Hadoop MapReduce – Big users ● Users – Facebook – Yahoo – Amazon – Ebay ● Providers – Amazon – Cloudera – HortonWorks – MapR
  • 9. Contact Us ● Feel free to contact us at – www.semtech-solutions.co.nz – info@semtech-solutions.co.nz ● We offer IT project consultancy ● We are happy to hear about your problems ● You can just pay for those hours that you need ● To solve your problems
  • 10. Contact Us ● Feel free to contact us at – www.semtech-solutions.co.nz – info@semtech-solutions.co.nz ● We offer IT project consultancy ● We are happy to hear about your problems ● You can just pay for those hours that you need ● To solve your problems