Map Reduce Basics Explained

MR: Processing
<key, value> -> MR Box/Job -> <key, value>
MR (MapReduce) is a model for distributed data processing, discussed here with reference to Hadoop.
It provides easy scalability.
The essential data processing units are the Mapper and the Reducer.
Additional units can be introduced depending on the application.
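
The <key, value> in, <key, value> out flow can be illustrated with a small word count. This is a minimal single-process sketch in plain Python, not Hadoop code; the names mapper, reducer and run_job are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def mapper(_offset, line):
    """Emit one <word, 1> pair per word; the input key (line offset) is ignored."""
    for word in line.split():
        yield word.lower(), 1


def reducer(word, counts):
    """Receive every value emitted for one key and emit a single total."""
    yield word, sum(counts)


def run_job(records):
    """Simulate a job: map each record, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    results = []
    for key in sorted(grouped):  # keys reach the reducer in sorted order
        results.extend(reducer(key, grouped[key]))
    return results


# Input pairs are <line offset, line text> (the offsets here are made up).
lines = [(0, "the quick brown fox"), (20, "the lazy dog")]
result = run_job(lines)
print(result)
```

In a real cluster the map and reduce calls run on different nodes and the grouping step is handled by the framework; only the two functions are written by the application.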

MR: Key-Value Pairs
The requirements for key-value pairs in MR jobs are:
Both the key and value classes need to be serializable, which
means they must implement the Writable interface.
Key classes must also implement the WritableComparable interface,
so that the framework can sort records by key.
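
To make the two requirements concrete, here is a Python analogue of a key type that can be serialized and compared. It is a sketch only: Hadoop's actual interfaces are Java (Writable declares write and readFields, and WritableComparable adds compareTo), and the class name TextKey and its byte layout are assumptions made for this example.

```python
import io
import struct
from functools import total_ordering


@total_ordering
class TextKey:
    """Hypothetical key type: serializable (like Writable) and orderable."""

    def __init__(self, text=""):
        self.text = text

    def write(self, out):
        """Serialize as a 4-byte big-endian length followed by UTF-8 bytes."""
        data = self.text.encode("utf-8")
        out.write(struct.pack(">I", len(data)))
        out.write(data)

    def read_fields(self, inp):
        """Restore the fields from a stream produced by write()."""
        (length,) = struct.unpack(">I", inp.read(4))
        self.text = inp.read(length).decode("utf-8")

    def __eq__(self, other):
        return self.text == other.text

    def __lt__(self, other):
        # The ordering is what allows keys to be sorted before the reduce phase.
        return self.text < other.text

    def __hash__(self):
        return hash(self.text)


# Round trip through bytes, as would happen when a pair moves between nodes.
buffer = io.BytesIO()
TextKey("hadoop").write(buffer)
buffer.seek(0)
restored = TextKey()
restored.read_fields(buffer)

sorted_keys = [k.text for k in sorted([TextKey("reduce"), TextKey("combine"), TextKey("map")])]
print(restored.text, sorted_keys)
```

Values only need the serialization half; keys need both halves because they are sorted between the map and reduce phases.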

MR: Detailed Processing
<key, value> -> Map -> <key, value> -> Combine -> <key, value> -> Reduce -> <key, value>
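
The optional Combine step sits between Map and Reduce and pre-aggregates each mapper's output locally, so fewer pairs have to be moved to the reducers. The sketch below simulates this in plain Python, assuming a word-count job; the function names and the sample splits are hypothetical.

```python
from collections import defaultdict


def map_words(line):
    """Map: emit <word, 1> for every word in a line."""
    return [(word, 1) for word in line.split()]


def sum_by_key(pairs):
    """Sum values per key; used here as both the combiner and the reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


# Two input splits, each processed by its own (simulated) map task.
splits = [["a b a", "a c"], ["b b", "c a"]]

map_outputs = [[pair for line in split for pair in map_words(line)] for split in splits]
combined = [sum_by_key(output) for output in map_outputs]  # runs per map task

pairs_without_combiner = sum(len(output) for output in map_outputs)
pairs_with_combiner = sum(len(output) for output in combined)

# Reduce: merge the combined outputs of all map tasks.
final = sum_by_key(pair for output in combined for pair in output)
print(pairs_without_combiner, pairs_with_combiner, final)
```

Reusing the reduce logic as the combiner works here because summation is associative and commutative; that does not hold for every reduce function.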

MR: Terminology
• Job: Execution of a mapper/reducer program over a data set.
• JobTracker: Schedules jobs and tracks the tasks assigned to TaskTrackers.
• Task: Each job is divided into smaller tasks; a task is the execution of map or reduce
on a slice of the data.
• TaskTracker: Tracks its tasks and reports their status to the JobTracker.
• Task Attempt: An attempt to execute a task on a slave node.
• Payload:

MR: Types of Nodes
• NameNode: Manages the Hadoop Distributed File System (HDFS). The JobTracker typically runs on the same master machine.
• DataNode: Stores the HDFS data blocks used for processing. TaskTrackers run here.
• Master Node: Runs the JobTracker and accepts job requests from clients.
• Slave Node: Runs the map and reduce tasks.

MR: Process
• A map task executes the map function on each split of the data. (Split
size is important; splits are typically sized to match an HDFS block.)
• The output of a map task is written to the local disk, not to HDFS.
It is only intermediate output and is discarded once the entire job
is done, so storing it on HDFS would cause unnecessary
replication.
• The output of map is the input to the reduce task (for the simple Map-Reduce
model). The output of reduce (i.e. the final output) is stored on
HDFS (the first replica on the local node, the other replicas on other nodes).
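
The difference between intermediate and final output can be sketched in plain Python. This is an illustration under stated assumptions, not Hadoop code: a temporary directory stands in for a node's local disk, an ordinary directory stands in for HDFS, and run_job and the map-N.json file names are hypothetical. The output file name part-r-00000 mirrors the name Hadoop commonly gives the first reducer's output.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def run_job(splits, output_dir):
    """Run map tasks into a scratch directory, then reduce into output_dir."""
    # The scratch directory plays the role of local disk: it is deleted
    # automatically when the block ends, like intermediate map output.
    with tempfile.TemporaryDirectory() as local_dir:
        for index, split in enumerate(splits):
            pairs = [(word, 1) for line in split for word in line.split()]
            Path(local_dir, f"map-{index}.json").write_text(json.dumps(pairs))

        # Reduce side: read every map output and group values by key.
        grouped = defaultdict(list)
        for path in sorted(Path(local_dir).glob("map-*.json")):
            for key, value in json.loads(path.read_text()):
                grouped[key].append(value)

    # Only the final result is kept (standing in for the write to HDFS).
    out_path = Path(output_dir, "part-r-00000")
    out_path.write_text(
        "".join(f"{key}\t{sum(values)}\n" for key, values in sorted(grouped.items()))
    )
    return out_path


splits = [["a b"], ["b c"]]
with tempfile.TemporaryDirectory() as final_dir:
    final_text = run_job(splits, final_dir).read_text()
print(final_text)
```

Replication of the final file is not modelled here; in HDFS it is handled by the file system, not by the job.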

MR: References
• https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
• https://www.guru99.com/introduction-to-mapreduce.html
