YARN - Presented At Dallas Hadoop User Group
1. Hadoop 2.0 – YARN
Yet Another Resource Negotiator
Rommel Garcia
Solutions Engineer
© Hortonworks Inc. 2013
Page 1
2. Agenda
• Hadoop 1.X & 2.X – Concepts Recap
• YARN Architecture – How does this affect MRv1?
• Slots be gone – What does this mean for MapReduce?
• Building YARN Applications
• Q & A
© Hortonworks Inc. 2013
3. Hadoop 1.X vs. 2.X
Recap over the differences
© Hortonworks Inc. 2013
4. The 1st Generation of Hadoop: Batch
HADOOP 1.0
Built for Web-Scale Batch Apps
[Diagram: separate silos, each a single app on its own HDFS cluster – one Single App for INTERACTIVE, one for ONLINE, and several for BATCH]
© Hortonworks Inc. 2013
• All other usage patterns must leverage that same infrastructure
• Forces the creation of silos for managing mixed workloads
5. Hadoop MapReduce Classic
• JobTracker
–Manages cluster resources and job scheduling
• TaskTracker
–Per-node agent
–Manage tasks
© Hortonworks Inc. 2013
Page 5
6. Hadoop 1
• Limited to about 4,000 nodes per cluster
• O(# of tasks in a cluster)
• JobTracker bottleneck - resource management, job
scheduling and monitoring
• Only has one namespace for managing HDFS
• Map and Reduce slots are static
• MapReduce is the only type of job that can run
© Hortonworks Inc. 2013
7. Hadoop 1.X Stack
[Stack diagram: HORTONWORKS DATA PLATFORM (HDP) –
OPERATIONAL SERVICES: AMBARI, OOZIE
DATA SERVICES: FLUME, SQOOP (LOAD & EXTRACT), HIVE & PIG, HCATALOG, HBASE
HADOOP CORE: MAP REDUCE, HDFS (NFS, WebHDFS)
PLATFORM SERVICES: Enterprise Readiness – High Availability, Disaster Recovery, Security and Snapshots
Deployable on OS, VM, Cloud, or Appliance]
© Hortonworks Inc. 2013
Page 7
8. Our Vision: Hadoop as Next-Gen Platform
[Diagram:
HADOOP 1.0 – Single Use System, Batch Apps: MapReduce (cluster resource management & data processing) on HDFS (redundant, reliable storage)
HADOOP 2.0 – Multi Purpose Platform, Batch, Interactive, Online, Streaming, …: MapReduce and others (data processing) on YARN (cluster resource management) on HDFS2 (redundant, reliable storage)]
© Hortonworks Inc. 2013
Page 8
9. YARN: Taking Hadoop Beyond Batch
Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS
with Predictable Performance and Quality of Service
Applications Run Natively IN Hadoop
[Diagram: BATCH (MapReduce), INTERACTIVE (Tez), ONLINE (HBase), STREAMING (Storm, S4, …), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI) and OTHER (Search, Weave, …) applications all running natively on YARN (Cluster Resource Management) over HDFS2 (Redundant, Reliable Storage)]
© Hortonworks Inc. 2013
Page 9
10. Hadoop 2
• Potentially up to 10,000 nodes per cluster
• O(cluster size)
• Supports multiple namespaces for managing HDFS
• Efficient cluster utilization (YARN)
• MRv1 backward and forward compatible
• Any application can integrate with Hadoop
• Beyond Java
© Hortonworks Inc. 2013
13. A Brief History of YARN
• Originally conceived & architected by the team at Yahoo!
– Arun Murthy created the original JIRA in 2008, led the PMC
– Currently Arun is the lead for MapReduce/YARN/Tez at Hortonworks and was formerly the architect of Hadoop MapReduce at Yahoo!
• The team at Hortonworks has been working on YARN for 4 years
• YARN based architecture running at scale at Yahoo!
– Deployed on 35,000 nodes for about a year
– Implemented Storm-on-Yarn that processes 133,000 events per second.
© Hortonworks Inc. 2013
Page 13
14. Concepts
• Application
–Application is a job submitted to the framework
–Example – Map Reduce Job
• Container
–Basic unit of allocation
–Fine-grained resource allocation across multiple resource
types (memory, cpu, disk, network, gpu etc.)
– container_0 = 2GB, 1CPU
– container_1 = 1GB, 6 CPU
–Replaces the fixed map/reduce slots
© Hortonworks Inc. 2013
14
15. Architecture
• Resource Manager
–Global resource scheduler
–Hierarchical queues
–Application management
• Node Manager
–Per-machine agent
–Manages the life-cycle of containers
–Container resource monitoring
• Application Master
–Per-application
–Manages application scheduling and task execution
–E.g. MapReduce Application Master
© Hortonworks Inc. 2013
15
16. YARN – Running Apps
[Diagram: two Hadoop clients each create and submit an application (app1, app2) to the ResourceManager. Inside the ResourceManager, the ApplicationsManager (ASM) tracks the applications and the Scheduler partitions resources into queues. Each application's ApplicationMaster (AM1 on Rack2, AM2 on Rack1) negotiates containers from the ResourceManager and reports back to the ASM; the containers (C1.1–C1.4, C2.1–C2.3) run on NodeManagers spread across Rack1, Rack2, … RackN, which send status reports back.]
© Hortonworks Inc. 2012
18. Apache Hadoop MapReduce on YARN
• Original use-case
• Most complex application to build
– Data-locality
– Fault tolerance
– ApplicationMaster recovery: Check point to HDFS
– Intra-application Priorities: Maps vs. Reduces
– Needed complex feedback mechanism from ResourceManager
– Security
– Isolation
• Binary compatible with Apache Hadoop 1.x
© Hortonworks Inc. 2013
Page 18
19. Apache Hadoop MapReduce on YARN
[Diagram: the ResourceManager (with its Scheduler) oversees a cluster of NodeManagers. Two MapReduce ApplicationMasters (MR AM 1 and MR AM 2) each run in their own container and manage their own tasks on other NodeManagers: map 1.1, map 1.2 and reduce 1.1 for job 1; map 2.1, map 2.2, reduce 2.1 and reduce 2.2 for job 2.]
© Hortonworks Inc. 2012
20. Efficiency Gains of YARN
• Key Optimizations
– No hard segmentation of resource into map and reduce slots
– The YARN scheduler is more efficient
– All resources are fungible
• Yahoo has over 30,000 nodes running YARN across over
365 PB of data.
• They calculate running about 400,000 jobs per day for
about 10 million hours of compute time.
• They also have estimated a 60% – 150% improvement on
node usage per day.
• Yahoo got rid of a whole colo (10,000 node datacenter)
because of their increased utilization.
© Hortonworks Inc. 2013
21. An Example Calculating Node Capacity
• Important Parameters
– mapreduce.[map|reduce].memory.mb
– The physical RAM hard-limit enforced by Hadoop on the task's container
– mapreduce.[map|reduce].java.opts
– The heap size of the JVM (-Xmx)
– yarn.scheduler.minimum-allocation-mb
– The smallest container YARN will allocate
– yarn.nodemanager.resource.memory-mb
– The amount of physical RAM on the node available for containers
– yarn.nodemanager.vmem-pmem-ratio
– The amount of virtual memory each container is allowed
– Calculated as containerMemoryRequest * vmem-pmem-ratio
© Hortonworks Inc. 2013
22. Calculating Node Capacity Continued
• Let's say we need a 1 GB map heap and a 2 GB reduce heap
• mapreduce.[map|reduce].java.opts = [-Xmx1g | -Xmx2g]
• Remember a container has more overhead than just your heap! Add
512 MB to the container limit for overhead
• mapreduce.[map|reduce].memory.mb = [1536 | 2560]
• We have 36 GB per node and a minimum allocation of 512 MB
• yarn.nodemanager.resource.memory-mb=36864
• yarn.scheduler.minimum-allocation-mb=512
• Virtual memory for each container is
• Map: 1536 MB * vmem-pmem-ratio (default is 2.1) = 3225.6 MB
• Reduce: 2560 MB * vmem-pmem-ratio = 5376 MB
• Our 36 GB node can support
• 24 Maps OR 14 Reducers OR any combination allowed by the
resources on the node (see the configuration sketch below)
© Hortonworks Inc. 2013
24. YARN – Implementing Applications
• What APIs do I need to use?
–Only three protocols
– Client to ResourceManager
– Application submission
– ApplicationMaster to ResourceManager
– Container allocation
– ApplicationMaster to NodeManager
– Container launch
–Use client libraries for all 3 actions
–Module yarn-client
–Provides both synchronous and asynchronous libraries
–Or use a 3rd-party framework like Weave
– http://continuuity.github.io/weave/
© Hortonworks Inc. 2013
24
25. YARN – Implementing Applications
• What do I need to do?
–Write a submission Client
–Write an ApplicationMaster (well, copy-paste one)
–DistributedShell is the new WordCount
–Get containers, run whatever you want!
© Hortonworks Inc. 2013
25
26. YARN – Implementing Applications
• What else do I need to know?
–Resource Allocation & Usage
–ResourceRequest
–Container
–ContainerLaunchContext
–LocalResource
–ApplicationMaster
–ApplicationId
–ApplicationAttemptId
–ApplicationSubmissionContext
© Hortonworks Inc. 2013
26
27. YARN – Resource Allocation & Usage
• ResourceRequest
– Fine-grained resource ask to the ResourceManager
– Ask for a specific amount of resources (memory, cpu etc.) on a
specific machine or rack
– Use the special resource name * to ask for any machine
ResourceRequest
priority
resourceName
capability
numContainers
© Hortonworks Inc. 2013
Page 27
28. YARN – Resource Allocation & Usage
• ResourceRequest
priority | resourceName | capability    | numContainers
---------+--------------+---------------+--------------
0        | host01       | <2gb, 1 core> | 1
0        | rack0        | <2gb, 1 core> | 1
0        | *            | <2gb, 1 core> | 1
1        | *            | <4gb, 1 core> | 1
© Hortonworks Inc. 2013
Page 28
29. YARN – Resource Allocation & Usage
• Container
– The basic unit of allocation in YARN
– The result of the ResourceRequest provided by ResourceManager
to the ApplicationMaster
– A specific amount of resources (cpu, memory etc.) on a specific
machine
Container
containerId
resourceName
capability
tokens
© Hortonworks Inc. 2013
Page 29
30. YARN – Resource Allocation & Usage
• ContainerLaunchContext
– The context provided by ApplicationMaster to NodeManager to
launch the Container
– Complete specification for a process
– LocalResource used to specify container binary and
dependencies
– NodeManager responsible for downloading from shared namespace
(typically HDFS)
ContainerLaunchContext
container
commands
environment
localResources
LocalResource
uri
type
© Hortonworks Inc. 2013
Page 30
31. YARN - ApplicationMaster
• ApplicationMaster
– Per-application controller aka container_0
– Parent for all containers of the application
– ApplicationMaster negotiates all its containers from
ResourceManager
– ApplicationMaster container is child of ResourceManager
– Think init process in Unix
– RM restarts the ApplicationMaster attempt if required (unique
ApplicationAttemptId)
– Code for application is submitted along with Application itself
© Hortonworks Inc. 2013
Page 31
32. YARN - ApplicationMaster
• ApplicationMaster
– ApplicationSubmissionContext is the complete specification of the
ApplicationMaster, provided by Client
– ResourceManager responsible for allocating and launching
ApplicationMaster container
ApplicationSubmissionContext
resourceRequest
containerLaunchContext
appName
queue
© Hortonworks Inc. 2013
Page 32
33. YARN Application API - Overview
• YarnClient is the submission client API
• Both synchronous & asynchronous APIs for resource
allocation and container start/stop
• Synchronous API
– AMRMClient
– AMNMClient
• Asynchronous API
– AMRMClientAsync
– AMNMClientAsync
© Hortonworks Inc. 2013
Page 33
34. YARN Application API – The Client
[Diagram: (1) a client calls YarnClient.createApplication to send a New Application Request to the ResourceManager, then (2) calls YarnClient.submitApplication. The ResourceManager's Scheduler launches an ApplicationMaster (AM 1, AM 2) for each submitted application on a NodeManager, and each AM's containers (Container 1.1–1.3, Container 2.1–2.4) run on NodeManagers across the cluster.]
© Hortonworks Inc. 2012
35. YARN Application API – The Client
• YarnClient
– createApplication to create application
– submitApplication to start application
– Application developer needs to provide ApplicationSubmissionContext
– APIs to get other information from ResourceManager
– getAllQueues
– getApplications
– getNodeReports
– APIs to manipulate submitted application e.g. killApplication
© Hortonworks Inc. 2013
Page 35
36. YARN Application API – Resource Allocation
[Diagram: (1) the ApplicationMaster calls registerApplicationMaster against the ResourceManager, (2) asks for resources with AMRMClient.allocate, (3) the Scheduler grants Containers on NodeManagers, and (4) the AM calls unregisterApplicationMaster when it is done.]
© Hortonworks Inc. 2012
37. YARN Application API – Resource Allocation
• AMRMClient - Synchronous API for ApplicationMaster
to interact with ResourceManager
– Prologue / epilogue – registerApplicationMaster /
unregisterApplicationMaster
– Resource negotiation with ResourceManager
– Internal book-keeping - addContainerRequest / removeContainerRequest /
releaseAssignedContainer
– Main API – allocate
– Helper APIs for cluster information
– getAvailableResources
– getClusterNodeCount
© Hortonworks Inc. 2013
Page 37
38. YARN Application API – Resource Allocation
• AMRMClientAsync - Asynchronous API for
ApplicationMaster
– Extension of AMRMClient to provide asynchronous
CallbackHandler
– Callbacks make it easier for the application developer to build a
mental model of the interaction with the ResourceManager
– onContainersAllocated
– onContainersCompleted
– onNodesUpdated
– onError
– onShutdownRequest
© Hortonworks Inc. 2013
Page 38
39. YARN Application API – Using Resources
[Diagram: once the ResourceManager's Scheduler has granted Container 1.1 on a NodeManager, the ApplicationMaster (AM 1) launches it with AMNMClient.startContainer and monitors it with AMNMClient.getContainerStatus, talking directly to that NodeManager.]
© Hortonworks Inc. 2012
40. YARN Application API – Using Resources
• AMNMClient - Synchronous API for ApplicationMaster
to launch / stop containers at NodeManager
– Simple (trivial) APIs
– startContainer
– stopContainer
– getContainerStatus
© Hortonworks Inc. 2013
Page 40
41. YARN Application API – Using Resources
• AMNMClientAsync - Asynchronous API for ApplicationMaster to launch / stop containers at NodeManager
– Simple (trivial) APIs
– startContainerAsync
– stopContainerAsync
– getContainerStatusAsync
– CallbackHandler makes it easier for the application developer to build a
mental model of the interaction with the NodeManager
– onContainerStarted
– onContainerStopped
– onStartContainerError
– onContainerStatusReceived
© Hortonworks Inc. 2013
Page 41
Speaker notes
• Traditional batch vs. next-gen: the first Hadoop use case was to map the whole internet, a graph with millions of nodes and trillions of edges. We are still back to the same problem of silos when manipulating data for interactive or online applications. Long story short, there was no support for alternative processing models; iterative tasks can take 10x longer because of I/O barriers.
• In classic Hadoop, MapReduce was part of the JobTracker and TaskTracker, so everything had to be built on MapReduce first. It has scalability limits (around 4k nodes), iterative processes can take forever on MapReduce, and JobTracker failures kill jobs and everything in the queue.
• Typical open-source stack: as we can see, everything sits on top of MapReduce; applications like Pig, Hive and HBase are also on top of MapReduce.
• While Hadoop 1.x had its uses, this is really about turning Hadoop into the next-generation platform. A platform should be able to do multiple things, i.e. more than just batch processing. We need batch, interactive, online and streaming capabilities to really turn Hadoop into a next-gen platform. It scales: Yahoo plans to move to a 10k-node cluster.
• So what does this really do? It provides a distributed application framework. Hadoop now provides a platform where we can store all data in a reliable way and, on the same platform, process that data without having to move it (data locality). New additions to the family include Falcon for data lifecycle management, Tez, a new way of processing that avoids some of the I/O barriers MapReduce experienced, and Knox for security and other enterprise features. Most importantly YARN, which, as you can see, everything now sits on top of.
• A container spawned by an AM can itself be a client and ask for another application to start, which in turn can do the same thing. We now have the concept of deploying applications into the Hadoop cluster; these applications run in containers of set resources. The ResourceManager takes the place of the JobTracker and still has scheduling queues such as the fair, capacity and hierarchical queues.
• Data locality: attempts to find a local host; if that fails, moves to the nearest rack. Fault tolerance: robust in terms of managing containers. Recovery: if the AM dies, the MapReduce ApplicationMaster has written a checkpoint to HDFS, so the restarted attempt reads the checkpoint and continues. Intra-application priorities: maps have to be completed before reduces, so there is a complex process in the ApplicationMaster to balance mappers and reducers. Complex feedback from the RM: the ApplicationMaster can now look ahead and find out how many resources it can get in the next 20 minutes. Migrate directly to YARN without changing a single line of code; just recompile.
• Haven't used Weave yet, but it's on the to-do list. Sadly my tasks keep growing and I can't do them in parallel.
• ApplicationAttemptId (combination of the application id and the attempt/fail count). ApplicationSubmissionContext: submitted by the client. getAllQueues: metrics related to the queue such as max capacity, current capacity, application count. getApplications: list of applications. getNodeReports: id, rack, host, number of containers. The ApplicationSubmissionContext needs a ContainerLaunchContext as well, plus resources, priority, queue, etc.