SlideShare une entreprise Scribd logo
1  sur  11
Télécharger pour lire hors ligne
1
Cloud Computing :
MapReduce - Tutorial
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
Introduction
• MapReduce: programming model developed at Google
• Objective:
– Implement large scale search
– Text processing on massively scalable web data stored using BigTable and GFS distributed file
system
• Designed for processing and generating large volumes of data via massively parallel
computations, utilizing tens of thousands of processors at a time
• Fault tolerant: ensure progress of computation even if processors and networks fail
• Example:
– Hadoop: open source implementation of MapReduce (developed at Yahoo!)
– Available on pre-packaged AMIs on Amazon EC2 cloud platform
9/11/2017 2
MapReduce Model
9/11/2017 3
• Parallel programming abstraction
• Used by many different parallel applications which carry out large-scale
computation involving thousands of processors
• Leverages a common underlying fault-tolerant implementation
• Two phases of MapReduce:
– Map operation
– Reduce operation
• A configurable number of M ‘mapper’ processors and R ‘reducer’ processors are
assigned to work on the problem
• The computation is coordinated by a single master process
MapReduce Model Contd…
9/11/2017 4
• Map phase:
– Each mapper reads approximately 1/M of the input from the global file
system, using locations given by the master
– Map operation consists of transforming one set of key-value pairs to
another:
– Each mapper writes computation results in one file per reducer
– Files are sorted by a key and stored to the local file system
– The master keeps track of the location of these files
MapReduce Model Contd…
9/11/2017 5
• Reduce phase:
– The master informs the reducers where the partial computations have been stored
on local files of respective mappers
– Reducers make remote procedure call requests to the mappers to fetch the files
– Each reducer groups the results of the map step using the same key and performs a
function f on the list of values that correspond to these key value:
– Final results are written back to the GFS file system
MapReduce: Example
9/11/2017 6
• 3 mappers; 2 reducers
• Map function:
• Reduce function:
Problem-1
9/11/2017 7
In a MapReduce framework consider the HDFS block size is 64 MB.
We have 3 files of size 64K, 65Mb and 127Mb. How many blocks will
be created by Hadoop framework?
Problem-2
9/11/2017 8
Write the pseudo-codes (for map and reduce functions) for calculating
the average of a set of integers in MapReduce.
Suppose A = (10, 20, 30, 40, 50) is a set of integers. Show the map and
reduce outputs.
Problem-3
9/11/2017 9
Compute total and average salary of organization XYZ and group by
based on gender (male or female) using MapReduce. The input is as
follows
Name, Gender, Salary
John, M, 10,000
Martha, F, 15,000
----
Problem-4
9/11/2017 10
Write the Map and Reduce functions (pseudo-codes) for the following Word
Length Categorization problem under MapReduce model.
Word Length Categorization: Given a text paragraph (containing only words),
categorize each word into following categories. Output the frequency of
occurrence of words in each category.
Categories:
tiny: 1-2 letters; small: 3-5 letters; medium: 6-9 letters; big: 10 or more letters
11

Contenu connexe

Tendances

Sequence diagram
Sequence diagramSequence diagram
Sequence diagramRahul Pola
 
program flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architectureprogram flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architecturePankaj Kumar Jain
 
Clock synchronization in distributed system
Clock synchronization in distributed systemClock synchronization in distributed system
Clock synchronization in distributed systemSunita Sahu
 
Distributed system notes unit I
Distributed system notes unit IDistributed system notes unit I
Distributed system notes unit INANDINI SHARMA
 
Selection of an appropriate project approach
Selection of an appropriate project approachSelection of an appropriate project approach
Selection of an appropriate project approachtumetr1
 
Relational algebra operations
Relational algebra operationsRelational algebra operations
Relational algebra operationsSanthiNivas
 
Substitution techniques
Substitution techniquesSubstitution techniques
Substitution techniquesvinitha96
 
Distributed system lamport's and vector algorithm
Distributed system lamport's and vector algorithmDistributed system lamport's and vector algorithm
Distributed system lamport's and vector algorithmpinki soni
 
Remote invocation
Remote invocationRemote invocation
Remote invocationishapadhy
 
Lamport’s algorithm for mutual exclusion
Lamport’s algorithm for mutual exclusionLamport’s algorithm for mutual exclusion
Lamport’s algorithm for mutual exclusionNeelamani Samal
 
advanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelismadvanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelismPankaj Kumar Jain
 
Eucalyptus, Nimbus & OpenNebula
Eucalyptus, Nimbus & OpenNebulaEucalyptus, Nimbus & OpenNebula
Eucalyptus, Nimbus & OpenNebulaAmar Myana
 
Cloud Storage - Nirvanix Overview
Cloud Storage - Nirvanix OverviewCloud Storage - Nirvanix Overview
Cloud Storage - Nirvanix OverviewNirvanix
 
Scheduling in Cloud Computing
Scheduling in Cloud ComputingScheduling in Cloud Computing
Scheduling in Cloud ComputingHitesh Mohapatra
 
Planning in AI(Partial order planning)
Planning in AI(Partial order planning)Planning in AI(Partial order planning)
Planning in AI(Partial order planning)Vicky Tyagi
 

Tendances (20)

Sequence diagram
Sequence diagramSequence diagram
Sequence diagram
 
Automata Theory - Turing machine
Automata Theory - Turing machineAutomata Theory - Turing machine
Automata Theory - Turing machine
 
program flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architectureprogram flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architecture
 
Clock synchronization in distributed system
Clock synchronization in distributed systemClock synchronization in distributed system
Clock synchronization in distributed system
 
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
 
Distributed system notes unit I
Distributed system notes unit IDistributed system notes unit I
Distributed system notes unit I
 
Selection of an appropriate project approach
Selection of an appropriate project approachSelection of an appropriate project approach
Selection of an appropriate project approach
 
Relational algebra operations
Relational algebra operationsRelational algebra operations
Relational algebra operations
 
Peephole Optimization
Peephole OptimizationPeephole Optimization
Peephole Optimization
 
Substitution techniques
Substitution techniquesSubstitution techniques
Substitution techniques
 
Distributed system lamport's and vector algorithm
Distributed system lamport's and vector algorithmDistributed system lamport's and vector algorithm
Distributed system lamport's and vector algorithm
 
Remote invocation
Remote invocationRemote invocation
Remote invocation
 
Lamport’s algorithm for mutual exclusion
Lamport’s algorithm for mutual exclusionLamport’s algorithm for mutual exclusion
Lamport’s algorithm for mutual exclusion
 
Cloud Security Mechanisms
Cloud Security MechanismsCloud Security Mechanisms
Cloud Security Mechanisms
 
Parallel Database
Parallel DatabaseParallel Database
Parallel Database
 
advanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelismadvanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelism
 
Eucalyptus, Nimbus & OpenNebula
Eucalyptus, Nimbus & OpenNebulaEucalyptus, Nimbus & OpenNebula
Eucalyptus, Nimbus & OpenNebula
 
Cloud Storage - Nirvanix Overview
Cloud Storage - Nirvanix OverviewCloud Storage - Nirvanix Overview
Cloud Storage - Nirvanix Overview
 
Scheduling in Cloud Computing
Scheduling in Cloud ComputingScheduling in Cloud Computing
Scheduling in Cloud Computing
 
Planning in AI(Partial order planning)
Planning in AI(Partial order planning)Planning in AI(Partial order planning)
Planning in AI(Partial order planning)
 

Similaire à Mod05lec23(map reduce tutorial)

Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map ReduceUrvashi Kataria
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfTSANKARARAO
 
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015Deanna Kosaraju
 
Hadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologiesHadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologiesKelly Technologies
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introductionDong Ngoc
 
Hadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintHadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintijccsa
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introductionrajsandhu1989
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopGERARDO BARBERENA
 
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...Reynold Xin
 
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node CombinersHadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node Combinersijcsit
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5RojaT4
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System cscpconf
 

Similaire à Mod05lec23(map reduce tutorial) (20)

Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
 
E031201032036
E031201032036E031201032036
E031201032036
 
Hadoop
HadoopHadoop
Hadoop
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
 
Big Data.pptx
Big Data.pptxBig Data.pptx
Big Data.pptx
 
Hadoop - Introduction to HDFS
Hadoop - Introduction to HDFSHadoop - Introduction to HDFS
Hadoop - Introduction to HDFS
 
Hadoop
HadoopHadoop
Hadoop
 
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
 
Big Data Processing
Big Data ProcessingBig Data Processing
Big Data Processing
 
Hadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologiesHadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologies
 
Hadoop
HadoopHadoop
Hadoop
 
Map reducecloudtech
Map reducecloudtechMap reducecloudtech
Map reducecloudtech
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Hadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintHadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraint
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introduction
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
 
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node CombinersHadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System
 

Plus de Ankit Gupta

Biometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learningBiometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learningAnkit Gupta
 
Week2 cloud computing week2
Week2 cloud computing week2Week2 cloud computing week2
Week2 cloud computing week2Ankit Gupta
 
Week 8 lecture material
Week 8 lecture materialWeek 8 lecture material
Week 8 lecture materialAnkit Gupta
 
Week 4 lecture material cc (1)
Week 4 lecture material cc (1)Week 4 lecture material cc (1)
Week 4 lecture material cc (1)Ankit Gupta
 
Week 3 lecture material cc
Week 3 lecture material ccWeek 3 lecture material cc
Week 3 lecture material ccAnkit Gupta
 
Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material ccAnkit Gupta
 
Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)Ankit Gupta
 
Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)Ankit Gupta
 
Lecture29 cc-security4
Lecture29 cc-security4Lecture29 cc-security4
Lecture29 cc-security4Ankit Gupta
 
Lecture28 cc-security3
Lecture28 cc-security3Lecture28 cc-security3
Lecture28 cc-security3Ankit Gupta
 
Lecture27 cc-security2
Lecture27 cc-security2Lecture27 cc-security2
Lecture27 cc-security2Ankit Gupta
 
Lecture26 cc-security1
Lecture26 cc-security1Lecture26 cc-security1
Lecture26 cc-security1Ankit Gupta
 
Lecture 30 cloud mktplace
Lecture 30 cloud mktplaceLecture 30 cloud mktplace
Lecture 30 cloud mktplaceAnkit Gupta
 
Week 7 lecture material
Week 7 lecture materialWeek 7 lecture material
Week 7 lecture materialAnkit Gupta
 
Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16Ankit Gupta
 
Microprocessor full hand made notes
Microprocessor full hand made notesMicroprocessor full hand made notes
Microprocessor full hand made notesAnkit Gupta
 
Transfer Leaning Using Pytorch synopsis Minor project pptx
Transfer Leaning Using Pytorch  synopsis Minor project pptxTransfer Leaning Using Pytorch  synopsis Minor project pptx
Transfer Leaning Using Pytorch synopsis Minor project pptxAnkit Gupta
 
Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2Ankit Gupta
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationAnkit Gupta
 
Cloud computing ebook
Cloud computing ebookCloud computing ebook
Cloud computing ebookAnkit Gupta
 

Plus de Ankit Gupta (20)

Biometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learningBiometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learning
 
Week2 cloud computing week2
Week2 cloud computing week2Week2 cloud computing week2
Week2 cloud computing week2
 
Week 8 lecture material
Week 8 lecture materialWeek 8 lecture material
Week 8 lecture material
 
Week 4 lecture material cc (1)
Week 4 lecture material cc (1)Week 4 lecture material cc (1)
Week 4 lecture material cc (1)
 
Week 3 lecture material cc
Week 3 lecture material ccWeek 3 lecture material cc
Week 3 lecture material cc
 
Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material cc
 
Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)
 
Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)
 
Lecture29 cc-security4
Lecture29 cc-security4Lecture29 cc-security4
Lecture29 cc-security4
 
Lecture28 cc-security3
Lecture28 cc-security3Lecture28 cc-security3
Lecture28 cc-security3
 
Lecture27 cc-security2
Lecture27 cc-security2Lecture27 cc-security2
Lecture27 cc-security2
 
Lecture26 cc-security1
Lecture26 cc-security1Lecture26 cc-security1
Lecture26 cc-security1
 
Lecture 30 cloud mktplace
Lecture 30 cloud mktplaceLecture 30 cloud mktplace
Lecture 30 cloud mktplace
 
Week 7 lecture material
Week 7 lecture materialWeek 7 lecture material
Week 7 lecture material
 
Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16
 
Microprocessor full hand made notes
Microprocessor full hand made notesMicroprocessor full hand made notes
Microprocessor full hand made notes
 
Transfer Leaning Using Pytorch synopsis Minor project pptx
Transfer Leaning Using Pytorch  synopsis Minor project pptxTransfer Leaning Using Pytorch  synopsis Minor project pptx
Transfer Leaning Using Pytorch synopsis Minor project pptx
 
Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Cloud computing ebook
Cloud computing ebookCloud computing ebook
Cloud computing ebook
 

Dernier

result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 

Dernier (20)

★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 

Mod05lec23(map reduce tutorial)

  • 1. 1 Cloud Computing : MapReduce - Tutorial Prof. Soumya K Ghosh Department of Computer Science and Engineering IIT KHARAGPUR
  • 2. Introduction • MapReduce: programming model developed at Google • Objective: – Implement large scale search – Text processing on massively scalable web data stored using BigTable and GFS distributed file system • Designed for processing and generating large volumes of data via massively parallel computations, utilizing tens of thousands of processors at a time • Fault tolerant: ensure progress of computation even if processors and networks fail • Example: – Hadoop: open source implementation of MapReduce (developed at Yahoo!) – Available on pre-packaged AMIs on Amazon EC2 cloud platform 9/11/2017 2
  • 3. MapReduce Model 9/11/2017 3 • Parallel programming abstraction • Used by many different parallel applications which carry out large-scale computation involving thousands of processors • Leverages a common underlying fault-tolerant implementation • Two phases of MapReduce: – Map operation – Reduce operation • A configurable number of M ‘mapper’ processors and R ‘reducer’ processors are assigned to work on the problem • The computation is coordinated by a single master process
  • 4. MapReduce Model Contd… 9/11/2017 4 • Map phase: – Each mapper reads approximately 1/M of the input from the global file system, using locations given by the master – Map operation consists of transforming one set of key-value pairs to another: – Each mapper writes computation results in one file per reducer – Files are sorted by a key and stored to the local file system – The master keeps track of the location of these files
  • 5. MapReduce Model Contd… 9/11/2017 5 • Reduce phase: – The master informs the reducers where the partial computations have been stored on local files of respective mappers – Reducers make remote procedure call requests to the mappers to fetch the files – Each reducer groups the results of the map step using the same key and performs a function f on the list of values that correspond to these key value: – Final results are written back to the GFS file system
  • 6. MapReduce: Example 9/11/2017 6 • 3 mappers; 2 reducers • Map function: • Reduce function:
  • 7. Problem-1 9/11/2017 7 In a MapReduce framework consider the HDFS block size is 64 MB. We have 3 files of size 64K, 65Mb and 127Mb. How many blocks will be created by Hadoop framework?
  • 8. Problem-2 9/11/2017 8 Write the pseudo-codes (for map and reduce functions) for calculating the average of a set of integers in MapReduce. Suppose A = (10, 20, 30, 40, 50) is a set of integers. Show the map and reduce outputs.
  • 9. Problem-3 9/11/2017 9 Compute total and average salary of organization XYZ and group by based on gender (male or female) using MapReduce. The input is as follows Name, Gender, Salary John, M, 10,000 Martha, F, 15,000 ----
  • 10. Problem-4 9/11/2017 10 Write the Map and Reduce functions (pseudo-codes) for the following Word Length Categorization problem under MapReduce model. Word Length Categorization: Given a text paragraph (containing only words), categorize each word into following categories. Output the frequency of occurrence of words in each category. Categories: tiny: 1-2 letters; small: 3-5 letters; medium: 6-9 letters; big: 10 or more letters
  • 11. 11