SlideShare une entreprise Scribd logo
1  sur  15
MapReduce
Functional Programming Map Apply function to transform elements of a list  Return results as a list Reduce Apply function to all elements of a list Collect results and return as a single value
Definition MapReduce: A software framework to support processing of massive data sets across distributed computers
SAMPLE USE CASE Back end credit card processor Nightly processing of millions of transactions Processing requires grouping, sorting, and merchant wide analysis Can’t just divide the over all list into equal parts as further analysis is necessary Tight processing window
Description Simple, powerful programming model Language independent Can run on a single machine, but shines for distributed computing and extreme datasets Break down the processing problem into embarrassingly parallel atomic operations
Algorithm Map Phase  Raw data analyzed and converted to name/value pair Shuffle Phase All name/value pairs are sorted and grouped by their keys Reduce Phase All values associated with a key are processed for results
MapReduce Walk Through Goal: Construct a word frequency of all the words in Wikipedia
Step 0: Split Data Raw input data divided into N parts N > number of machines Split must be context specific
Step 1: Map Each machine takes/receives a single slice of the raw input for mapping The map function processes the input file and emits a name/value pair of the relevant data
STEP 2: Shuffle The results of the map phase are sorted and grouped by the key in each key value pair.
STEP 3: Reduce Results from shuffle phase divided into M parts M >number of machines Each machine runs a reduction method on a part of shuffle results.
MAPREDUCE BENEFITS Scale Processing speed increases with number of machines involved Reliable Loss of any one machine doesn’t stop processing Cost Often built from heterogeneous commodity grade computers
Use Case Results Processing time of 1 million records  Originally ~3 hours Reduced to 40 minutes on 5 computers
Other MapReduce installations Google – Index building Visa – Transaction Processing Facebook – Facebook Lexicon Intelligence Community Yahoo/Google – Terabyte Sort 10 billion, 100 byte records Yahoo: 910 nodes, 206 seconds Google: ~1,000 nodes, 68 seconds
Questions

Contenu connexe

Tendances

Automating Engineering with FME
Automating Engineering with FMEAutomating Engineering with FME
Automating Engineering with FMESafe Software
 
FME in Support of DOT Data Culture
FME in Support of DOT Data CultureFME in Support of DOT Data Culture
FME in Support of DOT Data CultureSafe Software
 
MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS
MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS
MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS Marathon Products, Inc
 
LUMASS - a Spatial System Dynamics Modelling Framework
LUMASS - a Spatial System Dynamics Modelling FrameworkLUMASS - a Spatial System Dynamics Modelling Framework
LUMASS - a Spatial System Dynamics Modelling FrameworkAlexander Herzig
 
Web Based GIS LeadGen Introduction
Web Based GIS LeadGen IntroductionWeb Based GIS LeadGen Introduction
Web Based GIS LeadGen IntroductionInteractiveGIS
 
Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...
Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...
Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...Safe Software
 
ArcGIS Bivariate Mapping Tools
ArcGIS Bivariate Mapping ToolsArcGIS Bivariate Mapping Tools
ArcGIS Bivariate Mapping ToolsAileen Buckley
 
Repartition join in mapreduce
Repartition join in mapreduceRepartition join in mapreduce
Repartition join in mapreduceUday Vakalapudi
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentationAhmad El Tawil
 
ArcGIS Extensions
ArcGIS ExtensionsArcGIS Extensions
ArcGIS ExtensionsEsri
 
Utilities Industry Success Stories with FME
Utilities Industry Success Stories with FME Utilities Industry Success Stories with FME
Utilities Industry Success Stories with FME Safe Software
 
From 2D Drawings to 3D Navigation networks built with FME
From 2D Drawings to 3D Navigation networks built with FMEFrom 2D Drawings to 3D Navigation networks built with FME
From 2D Drawings to 3D Navigation networks built with FMESafe Software
 
Fdi extreme features
Fdi extreme featuresFdi extreme features
Fdi extreme features360cell
 
Hadoop MapReduce joins
Hadoop MapReduce joinsHadoop MapReduce joins
Hadoop MapReduce joinsShalish VJ
 
8 Ways Utility Networks Can Meet Data Demands
8 Ways Utility Networks Can Meet Data Demands8 Ways Utility Networks Can Meet Data Demands
8 Ways Utility Networks Can Meet Data DemandsSafe Software
 
Coordinate Systems in FME 101
Coordinate Systems in FME 101 Coordinate Systems in FME 101
Coordinate Systems in FME 101 Safe Software
 

Tendances (20)

Automating Engineering with FME
Automating Engineering with FMEAutomating Engineering with FME
Automating Engineering with FME
 
FME in Support of DOT Data Culture
FME in Support of DOT Data CultureFME in Support of DOT Data Culture
FME in Support of DOT Data Culture
 
MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS
MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS
MAXITHERMAL SOFTWARE FOR USB LOGGERS : WITH MARATHON PRODUCTS
 
Hadoop Mapreduce joins
Hadoop Mapreduce joinsHadoop Mapreduce joins
Hadoop Mapreduce joins
 
LUMASS - a Spatial System Dynamics Modelling Framework
LUMASS - a Spatial System Dynamics Modelling FrameworkLUMASS - a Spatial System Dynamics Modelling Framework
LUMASS - a Spatial System Dynamics Modelling Framework
 
Web Based GIS LeadGen Introduction
Web Based GIS LeadGen IntroductionWeb Based GIS LeadGen Introduction
Web Based GIS LeadGen Introduction
 
QGIS Tutorial 2
QGIS Tutorial 2QGIS Tutorial 2
QGIS Tutorial 2
 
Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...
Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...
Spatial decision support and analytics on a campus scale: bringing GIS, CAD, ...
 
ArcGIS Bivariate Mapping Tools
ArcGIS Bivariate Mapping ToolsArcGIS Bivariate Mapping Tools
ArcGIS Bivariate Mapping Tools
 
Repartition join in mapreduce
Repartition join in mapreduceRepartition join in mapreduce
Repartition join in mapreduce
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
 
ArcGIS Extensions
ArcGIS ExtensionsArcGIS Extensions
ArcGIS Extensions
 
Utilities Industry Success Stories with FME
Utilities Industry Success Stories with FME Utilities Industry Success Stories with FME
Utilities Industry Success Stories with FME
 
Main map reduce
Main map reduceMain map reduce
Main map reduce
 
From 2D Drawings to 3D Navigation networks built with FME
From 2D Drawings to 3D Navigation networks built with FMEFrom 2D Drawings to 3D Navigation networks built with FME
From 2D Drawings to 3D Navigation networks built with FME
 
MDT7
MDT7MDT7
MDT7
 
Fdi extreme features
Fdi extreme featuresFdi extreme features
Fdi extreme features
 
Hadoop MapReduce joins
Hadoop MapReduce joinsHadoop MapReduce joins
Hadoop MapReduce joins
 
8 Ways Utility Networks Can Meet Data Demands
8 Ways Utility Networks Can Meet Data Demands8 Ways Utility Networks Can Meet Data Demands
8 Ways Utility Networks Can Meet Data Demands
 
Coordinate Systems in FME 101
Coordinate Systems in FME 101 Coordinate Systems in FME 101
Coordinate Systems in FME 101
 

En vedette

Energy For One World- The Book- updated version (December 2012)
Energy For One World- The Book- updated version (December 2012)Energy For One World- The Book- updated version (December 2012)
Energy For One World- The Book- updated version (December 2012)Energy for One World
 
Maintaining Body Balance Through Probiotics By Mr. Hussian Sabuwala
Maintaining Body Balance Through Probiotics By Mr. Hussian SabuwalaMaintaining Body Balance Through Probiotics By Mr. Hussian Sabuwala
Maintaining Body Balance Through Probiotics By Mr. Hussian SabuwalaHealth Education Library for People
 
Conservation of natural resources A Presentation By Mr Allah Dad Khan Former...
Conservation of natural resources  A Presentation By Mr Allah Dad Khan Former...Conservation of natural resources  A Presentation By Mr Allah Dad Khan Former...
Conservation of natural resources A Presentation By Mr Allah Dad Khan Former...Mr.Allah Dad Khan
 
EFOW Performance Platforms- Plan Frame
EFOW Performance Platforms- Plan FrameEFOW Performance Platforms- Plan Frame
EFOW Performance Platforms- Plan FrameEnergy for One World
 
презентация пятница (1)
презентация пятница (1)презентация пятница (1)
презентация пятница (1)yuyukul
 
An Ecological View on Process Improvement: Some Thoughts for Improving Proces...
An Ecological View on Process Improvement: Some Thoughts for Improving Proces...An Ecological View on Process Improvement: Some Thoughts for Improving Proces...
An Ecological View on Process Improvement: Some Thoughts for Improving Proces...Luigi Buglione
 
The LEGO Maturity & Capability Model Approach
The LEGO Maturity & Capability Model ApproachThe LEGO Maturity & Capability Model Approach
The LEGO Maturity & Capability Model ApproachLuigi Buglione
 
Employee onboarding and employee engagement in it organizations human resourc...
Employee onboarding and employee engagement in it organizations human resourc...Employee onboarding and employee engagement in it organizations human resourc...
Employee onboarding and employee engagement in it organizations human resourc...AKSHAY KHATRI
 
презентация лето 13 группа
презентация лето 13 группапрезентация лето 13 группа
презентация лето 13 группаyuyukul
 
Banking Trends for 2016
Banking Trends for 2016Banking Trends for 2016
Banking Trends for 2016Capgemini
 
кожни систем човека
кожни систем човекакожни систем човека
кожни систем човекаMaja Simic
 
Skeletni sistem čoveka
Skeletni sistem čoveka Skeletni sistem čoveka
Skeletni sistem čoveka Ena Horvat
 
THE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNS
THE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNSTHE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNS
THE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNSunfunnel
 
Measuring Success on Facebook, Twitter & LinkedIn
Measuring Success on Facebook, Twitter & LinkedInMeasuring Success on Facebook, Twitter & LinkedIn
Measuring Success on Facebook, Twitter & LinkedInBrian Honigman
 

En vedette (20)

Energy For One World- The Book- updated version (December 2012)
Energy For One World- The Book- updated version (December 2012)Energy For One World- The Book- updated version (December 2012)
Energy For One World- The Book- updated version (December 2012)
 
My favourite topic
My favourite topicMy favourite topic
My favourite topic
 
Maintaining Body Balance Through Probiotics By Mr. Hussian Sabuwala
Maintaining Body Balance Through Probiotics By Mr. Hussian SabuwalaMaintaining Body Balance Through Probiotics By Mr. Hussian Sabuwala
Maintaining Body Balance Through Probiotics By Mr. Hussian Sabuwala
 
Can Water Heal? By Ms. Anu Mehta
Can Water Heal? By Ms. Anu MehtaCan Water Heal? By Ms. Anu Mehta
Can Water Heal? By Ms. Anu Mehta
 
Conservation of natural resources A Presentation By Mr Allah Dad Khan Former...
Conservation of natural resources  A Presentation By Mr Allah Dad Khan Former...Conservation of natural resources  A Presentation By Mr Allah Dad Khan Former...
Conservation of natural resources A Presentation By Mr Allah Dad Khan Former...
 
EFOW Performance Platforms- Plan Frame
EFOW Performance Platforms- Plan FrameEFOW Performance Platforms- Plan Frame
EFOW Performance Platforms- Plan Frame
 
Presentation
PresentationPresentation
Presentation
 
презентация пятница (1)
презентация пятница (1)презентация пятница (1)
презентация пятница (1)
 
An Ecological View on Process Improvement: Some Thoughts for Improving Proces...
An Ecological View on Process Improvement: Some Thoughts for Improving Proces...An Ecological View on Process Improvement: Some Thoughts for Improving Proces...
An Ecological View on Process Improvement: Some Thoughts for Improving Proces...
 
The LEGO Maturity & Capability Model Approach
The LEGO Maturity & Capability Model ApproachThe LEGO Maturity & Capability Model Approach
The LEGO Maturity & Capability Model Approach
 
Hadoop Security
Hadoop SecurityHadoop Security
Hadoop Security
 
Employee onboarding and employee engagement in it organizations human resourc...
Employee onboarding and employee engagement in it organizations human resourc...Employee onboarding and employee engagement in it organizations human resourc...
Employee onboarding and employee engagement in it organizations human resourc...
 
Kekurangan zat makanan
Kekurangan zat makananKekurangan zat makanan
Kekurangan zat makanan
 
презентация лето 13 группа
презентация лето 13 группапрезентация лето 13 группа
презентация лето 13 группа
 
Banking Trends for 2016
Banking Trends for 2016Banking Trends for 2016
Banking Trends for 2016
 
кожни систем човека
кожни систем човекакожни систем човека
кожни систем човека
 
On Boarding Ppt
On Boarding PptOn Boarding Ppt
On Boarding Ppt
 
Skeletni sistem čoveka
Skeletni sistem čoveka Skeletni sistem čoveka
Skeletni sistem čoveka
 
THE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNS
THE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNSTHE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNS
THE SCIENCE BEHIND EFFECTIVE FACEBOOK AD CAMPAIGNS
 
Measuring Success on Facebook, Twitter & LinkedIn
Measuring Success on Facebook, Twitter & LinkedInMeasuring Success on Facebook, Twitter & LinkedIn
Measuring Success on Facebook, Twitter & LinkedIn
 

Similaire à Map Reduce

Comparing Distributed Indexing To Mapreduce or Not?
Comparing Distributed Indexing To Mapreduce or Not?Comparing Distributed Indexing To Mapreduce or Not?
Comparing Distributed Indexing To Mapreduce or Not?TerrierTeam
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentationateeq ateeq
 
2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)anh tuan
 
Spatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use CasesSpatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use Casesmathieuraj
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reducerantav
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsRobert Grossman
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacmlmphuong06
 
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
MapReduce: Ordering and  Large-Scale Indexing on Large ClustersMapReduce: Ordering and  Large-Scale Indexing on Large Clusters
MapReduce: Ordering and Large-Scale Indexing on Large ClustersIRJET Journal
 
Download It
Download ItDownload It
Download Itbutest
 
High Throughput Data Analysis
High Throughput Data AnalysisHigh Throughput Data Analysis
High Throughput Data AnalysisJ Singh
 
High Performance Computing on NYC Yellow Taxi Data Set
High Performance Computing on NYC Yellow Taxi Data SetHigh Performance Computing on NYC Yellow Taxi Data Set
High Performance Computing on NYC Yellow Taxi Data SetParag Ahire
 
Big data & Hadoop
Big data & HadoopBig data & Hadoop
Big data & HadoopAhmed Gamil
 
FME Around the World
FME Around the WorldFME Around the World
FME Around the WorldSafe Software
 

Similaire à Map Reduce (20)

Comparing Distributed Indexing To Mapreduce or Not?
Comparing Distributed Indexing To Mapreduce or Not?Comparing Distributed Indexing To Mapreduce or Not?
Comparing Distributed Indexing To Mapreduce or Not?
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
 
Map reduce
Map reduceMap reduce
Map reduce
 
2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)2004 map reduce simplied data processing on large clusters (mapreduce)
2004 map reduce simplied data processing on large clusters (mapreduce)
 
Spatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use CasesSpatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use Cases
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reduce
 
Lecture 1 mapreduce
Lecture 1  mapreduceLecture 1  mapreduce
Lecture 1 mapreduce
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data Clouds
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacm
 
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
MapReduce: Ordering and  Large-Scale Indexing on Large ClustersMapReduce: Ordering and  Large-Scale Indexing on Large Clusters
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Download It
Download ItDownload It
Download It
 
2 mapreduce-model-principles
2 mapreduce-model-principles2 mapreduce-model-principles
2 mapreduce-model-principles
 
High Throughput Data Analysis
High Throughput Data AnalysisHigh Throughput Data Analysis
High Throughput Data Analysis
 
MapReduce-Notes.pdf
MapReduce-Notes.pdfMapReduce-Notes.pdf
MapReduce-Notes.pdf
 
MapReduce.pptx
MapReduce.pptxMapReduce.pptx
MapReduce.pptx
 
High Performance Computing on NYC Yellow Taxi Data Set
High Performance Computing on NYC Yellow Taxi Data SetHigh Performance Computing on NYC Yellow Taxi Data Set
High Performance Computing on NYC Yellow Taxi Data Set
 
Big data & Hadoop
Big data & HadoopBig data & Hadoop
Big data & Hadoop
 
FME Around the World
FME Around the WorldFME Around the World
FME Around the World
 

Dernier

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central BankingThe Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central BankingSelcen Ozturkcan
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central BankingThe Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central Banking
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Map Reduce

  • 2. Functional Programming Map Apply function to transform elements of a list Return results as a list Reduce Apply function to all elements of a list Collect results and return as a single value
  • 3. Definition MapReduce: A software framework to support processing of massive data sets across distributed computers
  • 4. SAMPLE USE CASE Back end credit card processor Nightly processing of millions of transactions Processing requires grouping, sorting, and merchant wide analysis Can’t just divide the over all list into equal parts as further analysis is necessary Tight processing window
  • 5. Description Simple, powerful programming model Language independent Can run on a single machine, but shines for distributed computing and extreme datasets Break down the processing problem into embarrassingly parallel atomic operations
  • 6. Algorithm Map Phase Raw data analyzed and converted to name/value pair Shuffle Phase All name/value pairs are sorted and grouped by their keys Reduce Phase All values associated with a key are processed for results
  • 7. MapReduce Walk Through Goal: Construct a word frequency of all the words in Wikipedia
  • 8. Step 0: Split Data Raw input data divided into N parts N > number of machines Split must be context specific
  • 9. Step 1: Map Each machine takes/receives a single slice of the raw input for mapping The map function processes the input file and emits a name/value pair of the relevant data
  • 10. STEP 2: Shuffle The results of the map phase are sorted and grouped by the key in each key value pair.
  • 11. STEP 3: Reduce Results from shuffle phase divided into M parts M >number of machines Each machine runs a reduction method on a part of shuffle results.
  • 12. MAPREDUCE BENEFITS Scale Processing speed increases with number of machines involved Reliable Loss of any one machine doesn’t stop processing Cost Often built from heterogeneous commodity grade computers
  • 13. Use Case Results Processing time of 1 million records Originally ~3 hours Reduced to 40 minutes on 5 computers
  • 14. Other MapReduce installations Google – Index building Visa – Transaction Processing Facebook – Facebook Lexicon Intelligence Community Yahoo/Google – Terabyte Sort 10 billion, 100 byte records Yahoo: 910 nodes, 206 seconds Google: ~1,000 nodes, 68 seconds