2. Session Objectives
Introduction to Big Data and Hadoop
Understanding Hadoop 2.0 and its features
Understanding the differences between Hadoop 1.x and Hadoop 2.x
Understanding YARN
Working of the Application Master
Scheduling In YARN
Scheduling Mechanisms in YARN
Q & A
3. Introduction to Big Data and Hadoop
Big data is the term for a collection of data sets so large and complex
that it becomes difficult to process using on-hand database
management tools or traditional data processing applications.
Systems and enterprises generate huge amounts of data, from terabytes to
petabytes or even zettabytes of information.
It’s very difficult to manage such huge data…
4. Big Data and its challenges
The challenges of processing Big Data are the three V's: Volume, Velocity, and Variety.
VOLUME: Modern systems have much more data (terabytes+ a day, petabytes+ in total). We need a new approach.
VELOCITY: To process such a huge volume of data within a specified time period, we need a new approach.
VARIETY: We have to process different sorts of data, such as structured, semi-structured, and unstructured data. We need a new approach.
5. What is Hadoop?
Apache Hadoop is a framework that allows the distributed processing of large data
sets across clusters of commodity computers using a simple programming model.
It is an open-source data management technology with scale-out storage and
distributed processing.
7. Background: Hadoop + HDFS
Every node contributes part of its local file system to HDFS.
Tasks can only depend on the local file system (the JVM class path does not
understand the HDFS protocol).
[Diagram: the HDFS distributed file system, with a NameNode and DataNodes, each DataNode backed by its local file system.]
9. Challenges for Hadoop 1.x
Problem and description:
NameNode – no horizontal scalability: a single NameNode and a single namespace, limited by the NameNode's RAM.
NameNode – no High Availability (HA): the NameNode is a single point of failure and needs manual recovery using the Secondary NameNode in case of failure.
JobTracker – overburdened: spends a significant amount of time and effort managing the life cycle of applications.
MRv1 – only Map and Reduce tasks: the humongous amount of data stored in HDFS remains unutilized and cannot be used for other workloads such as graph processing.
10. Hadoop 2.x Features
Federation: one NameNode and namespace in Hadoop 1.0; multiple NameNodes and namespaces in Hadoop 2.0.
High Availability: not present in Hadoop 1.0; highly available in Hadoop 2.0.
YARN (processing control and multi-tenancy): JobTracker and TaskTracker in Hadoop 1.0; Resource Manager, Node Manager, App Master, and Capacity Scheduler in Hadoop 2.0.
Other Important Hadoop 2.0 Features
Hadoop Snapshots
NFSv3 access to data in HDFS
Support for running Hadoop on MS Windows
Binary compatibility for MapReduce applications built on Hadoop 1.0
14. YARN
Yet Another Resource Negotiator (also read as the recursive acronym YARN
Application Resource Negotiator).
Remedies the scalability shortcomings of "classic" MapReduce, which runs into
scalability issues at around 4,000 nodes and higher.
YARN is a more general-purpose framework, of which classic MapReduce is just
one application.
15. Classic MapReduce vs. YARN
Fault Tolerance and Availability
Resource Manager: no single point of failure, since its state is saved in ZooKeeper; Application Masters are restarted automatically on RM restart (a configuration sketch follows below).
Application Master: optional failover via application-specific checkpoints; MapReduce applications pick up where they left off via state saved in HDFS.
Wire Compatibility: the protocols are wire-compatible, so old clients can talk to new servers, enabling rolling upgrades.
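A minimal, hedged yarn-site.xml sketch of enabling the ZooKeeper-backed RM state recovery described above; the ZooKeeper quorum address is a placeholder, and exact property names may vary slightly between 2.x releases:

  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>zk1.example.com:2181,zk2.example.com:2181</value>
  </property>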
16. Classic MapReduce vs. YARN
Support for programming paradigms other than MapReduce (multi-tenancy):
• Tez – a generic framework to run a complex DAG
• HBase on YARN (HOYA)
• Machine learning: Spark
• Graph processing: Giraph
• Real-time processing: Storm
• Enabled by allowing the use of a paradigm-specific Application Master
• Run all of them on the same Hadoop cluster!
17. YARN Architectural Overview
Scalability: clusters of 6,000 to 10,000 machines, each machine with 16 cores,
48 GB/96 GB of RAM, and 24 TB/36 TB of hard disk.
100,000+ concurrent tasks and 10,000 concurrent jobs.
18. YARN Architectural Overview (contd.)
Splits up the two major functions of the JobTracker:
Global Resource Manager – cluster resource management.
Application Master – job scheduling and monitoring (one per application). The
Application Master negotiates resource containers from the Scheduler, tracking
their status and monitoring progress. The Application Master itself runs as a
normal container.
The TaskTracker is replaced by the NodeManager (NM), a new per-node slave
responsible for launching the applications' containers, monitoring their
resource usage (CPU, memory, disk, network), and reporting to the Resource
Manager.
YARN maintains compatibility with existing MapReduce applications and users.
19. YARN Flow
YARN = Yet Another Resource Negotiator
Resource Manager: cluster-level resource manager; long-lived, running on high-quality hardware.
Node Manager: one per data node; monitors resources on the data node.
Application Master: one per application; short-lived; manages the application's tasks and their scheduling.
20. YARN – How It Works
Protocols:
1) Client – RM: submit the Application Master
2) RM – NM: start the Application Master
3) AM – RM: request and release containers
4) RM – NM: start tasks in containers
[Diagram: a YARN client, the Resource Manager, and several Node Managers hosting the AM and task containers; the numbered arrows correspond to the four protocol steps above. A client-side sketch of step 1 follows below.]
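To make step 1 concrete, here is a minimal, hedged Java sketch of submitting an application (and hence its Application Master) to the Resource Manager via the YarnClient API; the application name, queue, resource sizes, and the /bin/sleep launch command are illustrative assumptions, and error handling is omitted:

  import java.util.Collections;

  import org.apache.hadoop.yarn.api.records.ApplicationId;
  import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
  import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.YarnClient;
  import org.apache.hadoop.yarn.client.api.YarnClientApplication;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;
  import org.apache.hadoop.yarn.util.Records;

  public class SubmitApp {
    public static void main(String[] args) throws Exception {
      YarnClient yarnClient = YarnClient.createYarnClient();
      yarnClient.init(new YarnConfiguration());
      yarnClient.start();

      // Step 1: Client -> RM. Ask the RM for a new application id.
      YarnClientApplication app = yarnClient.createApplication();
      ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
      ctx.setApplicationName("demo-app");

      // Describe how to launch the Application Master container.
      ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
      amContainer.setCommands(Collections.singletonList("/bin/sleep 60")); // placeholder AM command
      ctx.setAMContainerSpec(amContainer);

      // Resources the AM container itself needs (illustrative sizes).
      ctx.setResource(Resource.newInstance(1024 /* MB */, 1 /* vcore */));
      ctx.setQueue("default");

      // The RM will then ask an NM to start the AM (step 2 in the diagram).
      ApplicationId appId = yarnClient.submitApplication(ctx);
      System.out.println("Submitted application " + appId);
    }
  }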
21. YARN – Application Master
Once we have an Application Master running in a container, the master starts
the application's tasks in containers.
For each task, the procedure is similar to starting the master itself.
22. YARN – Application Master (contd.)
1) Connect to the Resource Manager and register.
2) Loop:
   Send a request:
   a) send a heartbeat
   b) request containers
   c) release containers
   Receive a response from the Resource Manager:
   a) receive containers
   b) receive notification of containers that terminated
For each container received, send a request to the Node Manager to start a task. (A code sketch of this loop follows below.)
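A minimal, hedged Java sketch of this register/heartbeat/allocate loop using the AMRMClient and NMClient libraries (this code would run inside the AM's own container); the container sizes, priority, and the /bin/sleep task command are illustrative assumptions, and error handling is omitted:

  import java.util.Collections;

  import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
  import org.apache.hadoop.yarn.api.records.Container;
  import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
  import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
  import org.apache.hadoop.yarn.api.records.Priority;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.AMRMClient;
  import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
  import org.apache.hadoop.yarn.client.api.NMClient;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;
  import org.apache.hadoop.yarn.util.Records;

  public class SimpleAppMaster {
    public static void main(String[] args) throws Exception {
      YarnConfiguration conf = new YarnConfiguration();

      // 1) Connect to the RM and register this Application Master.
      AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
      rmClient.init(conf);
      rmClient.start();
      rmClient.registerApplicationMaster("", 0, "");

      NMClient nmClient = NMClient.createNMClient();
      nmClient.init(conf);
      nmClient.start();

      // Ask for a handful of worker containers (illustrative sizes).
      int requested = 3;
      Priority priority = Priority.newInstance(0);
      Resource capability = Resource.newInstance(512 /* MB */, 1 /* vcore */);
      for (int i = 0; i < requested; i++) {
        rmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));
      }

      // 2) Loop: allocate() doubles as the heartbeat and as the
      //    request/response exchange with the Resource Manager.
      int completed = 0;
      while (completed < requested) {
        AllocateResponse response = rmClient.allocate(0.1f);
        for (Container container : response.getAllocatedContainers()) {
          // For each container received, ask the NM to start a task.
          ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
          ctx.setCommands(Collections.singletonList("/bin/sleep 10")); // placeholder task
          nmClient.startContainer(container, ctx);
        }
        completed += response.getCompletedContainersStatuses().size();
        Thread.sleep(1000);
      }

      // The master should terminate after all containers have terminated.
      rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    }
  }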
23. YARN – Application Master (contd.)
The master should terminate after all containers have terminated.
• If the master crashes (or fails to send a heartbeat), all of its containers are killed.
• In the future, YARN will support restarting the master.
Starting a task in a container:
• Very similar to the YARN client starting the master.
• The Node Manager starts and monitors the task.
• If the task crashes, the Node Manager informs the Resource Manager.
• The Resource Manager informs the master in its next response.
• The master must still release the container.
24. YARN – Application Master (contd.)
The Resource Manager assigns containers asynchronously:
• Requested containers are returned at the earliest in the next allocate call.
• The master must keep sending (possibly empty) requests until it has received all of its containers.
• Subsequent requests are incremental and can ask for additional containers.
• The master must keep track of what it has requested and received.
25. YARN – Shortcomings
Complexity:
• The protocols are very low-level and verbose.
• The client must stage all dependent JARs on HDFS.
• The client must set up the environment, class path, etc.
Logs are only collected after the application terminates:
• What about long-running apps?
Applications don't survive a master crash.
No built-in communication between containers and masters.
Hard to debug.
26. Scheduling In YARN
Ideally, the requests that a YARN application makes would be granted
immediately.
In the real world, resources are limited, and on a busy cluster an application
will often need to wait to have some of its requests fulfilled.
YARN takes responsibility for providing resources to applications according to
defined policies.
Scheduling is a difficult problem, and there is no one "best" policy, which is
why YARN provides a choice of schedulers and configurable policies.
28. Scheduling In YARN – FIFO Scheduler
FIFO Scheduler: places applications in a queue and runs them in the order of
submission (first in, first out). Requests for the first application in the
queue are allocated first. It is a simple implementation, but it is not
suitable for a shared cluster, because a large application can hold all the
resources while other applications wait in the queue.
29. Scheduling In YARN – FIFO Scheduler
The FIFO queue scheduler runs jobs based on the order in which the jobs were submitted.
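As a hedged sketch, the active scheduler is selected in yarn-site.xml; explicitly choosing the FIFO scheduler would look roughly like this:

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
  </property>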
30. Scheduling In YARN - Capacity Scheduler
Capacity Scheduler:
In a shared cluster, each organization is allocated a certain share of the overall cluster capacity.
Each organization is set up with a dedicated queue that is configured to use a given fraction of the cluster capacity.
Queues may be further divided in a hierarchical fashion, allowing each organization to share its cluster allowance between different groups of users within the organization.
If there is more than one job in a queue and there are idle resources available, a separate dedicated queue lets a small job start as soon as it is submitted, although at the cost of overall cluster utilization.
Sometimes a queue can be allotted resources beyond its specified capacity, but not beyond the maximum capacity of its parent queue. This is called queue elasticity (a configuration sketch follows below).
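A minimal, hedged capacity-scheduler.xml sketch with two hypothetical organizational queues, prod and dev; the names and percentages are illustrative. The dev queue can elastically grow beyond its 40% share up to a 75% maximum:

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>prod,dev</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>40</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
    <value>75</value>
  </property>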
32. Scheduling In YARN - Capacity Scheduler
The Capacity Scheduler is a scheduler for Hadoop that allows multiple tenants to securely share a
large cluster. Resources are allocated to each tenant's applications in a way that fully utilizes the
cluster. Free resources can be allocated to any queue beyond its capacity allocation.
33. Scheduling In YARN – Fair Scheduler
Fair Scheduler:
The Fair Scheduler is also used in shared-cluster environments.
With the Fair Scheduler there is no need to reserve a set amount of capacity,
since it dynamically balances resources between all running jobs.
Just after the first (large) job starts, it is the only job running, so it gets
all the resources in the cluster. When the second (small) job starts, it is
allocated half of the cluster resources, so that each job uses its fair share.
After the small job completes and no longer requires resources, the large job
goes back to using the full cluster capacity again.
It is a rule/policy-based scheduler: queue configurations, rules, and policies
are defined in an XML configuration file.
Preemption, min/max limits, and minimum guaranteed shares can be configured
there, as sketched below.
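A minimal, hedged sketch of a Fair Scheduler allocation file (commonly fair-scheduler.xml, pointed to by yarn.scheduler.fair.allocation.file); the queue names, weights, and minimum shares are illustrative assumptions:

  <?xml version="1.0"?>
  <allocations>
    <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
    <queue name="prod">
      <weight>2.0</weight>
      <minResources>10000 mb,10 vcores</minResources>
    </queue>
    <queue name="dev">
      <weight>1.0</weight>
    </queue>
  </allocations>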
35. Scheduling In YARN – Fair Scheduler
The design goal of the Fair Scheduler is to assign resources to jobs so that each job receives its
fair share of resources over time. The Fair Scheduler enforces fair sharing within each queue.
36. Scheduling Mechanisms In YARN
Delay Scheduling:
All YARN schedulers try to honor locality requests.
On a busy cluster, if an application requests a particular node, there is a good chance that other containers are already running on that node and the request cannot be satisfied immediately.
The obvious course of action would be to immediately loosen the locality requirement and allocate a container on some other node (same rack, different rack, different data center).
However, waiting a short time can significantly increase the chance of being allocated the requested node. This is called delay scheduling, and it is supported by both the Capacity and the Fair Scheduler.
When using delay scheduling, the scheduler does not simply use the first scheduling opportunity it receives, but waits for up to a given maximum number of scheduling opportunities before loosening the locality constraint and taking the next scheduling opportunity.
For the Capacity Scheduler, delay scheduling is configured by setting yarn.scheduler.capacity.node-locality-delay.
The Fair Scheduler also uses the number of scheduling opportunities to determine the delay, although it is expressed as a proportion of the cluster size. For example, setting yarn.scheduler.fair.locality.threshold.node to 0.5 means that the scheduler should wait until half of the nodes in the cluster have presented scheduling opportunities before accepting a different node. Both settings are sketched below.
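A hedged sketch of the two delay-scheduling knobs mentioned above; the values are illustrative, the first property belongs in capacity-scheduler.xml and the second in yarn-site.xml:

  <!-- capacity-scheduler.xml: wait up to 40 scheduling opportunities
       before relaxing the node-locality constraint -->
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>40</value>
  </property>

  <!-- yarn-site.xml: the Fair Scheduler waits until half of the cluster's
       nodes have offered scheduling opportunities -->
  <property>
    <name>yarn.scheduler.fair.locality.threshold.node</name>
    <value>0.5</value>
  </property>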
37. Scheduling Mechanisms In YARN
Dominant Resource Fairness:
When there is only a single resource type being scheduled, such as memory, the concept of capacity or fairness is easy to determine. However, when there are multiple resources in play, things get more complicated.
The way that the schedulers in YARN address this problem is to look at each user's dominant resource and use it as a measure of cluster usage. This approach is called Dominant Resource Fairness (DRF).
By default, DRF is not used, but it can be enabled for the different schedulers. For example, the Capacity Scheduler can be configured to use DRF by setting yarn.scheduler.capacity.resource-calculator to org.apache.hadoop.yarn.util.resource.DominantResourceCalculator in capacity-scheduler.xml (see the sketch below).
Example:
Imagine a cluster with 100 CPUs and 10 TB of memory. Application A requests containers of (2 CPUs, 300 GB) and application B requests containers of (6 CPUs, 100 GB). A's request is (2%, 3%) of the cluster, so its dominant resource is memory (3%); B's request is (6%, 1%), so its dominant resource is CPU (6%). Since B's dominant-resource ask is twice as large as A's (6% versus 3%), under dominant-resource fair sharing B will be allocated half as many containers as A.
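For completeness, a hedged capacity-scheduler.xml sketch enabling DRF, using exactly the property and class named above:

  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>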