Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Developing YARN Native Applications
Arun Murthy – Architect / Founder
Bob Page – VP Partner Products
Topics
Hadoop 2 and YARN: Beyond Batch
YARN: The Hadoop Resource Manager
• YARN Concepts and Terminology
• The YARN APIs
• A Simple YARN application
• The Application Timeline Server
Next Steps
Hadoop 2 and YARN: Beyond Batch
Hadoop 2.0: From Batch-only to Multi-Workload
HADOOP 1.0: Single Use System (Batch Apps)
• MapReduce (cluster resource management & data processing)
• HDFS (redundant, reliable storage)
HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming, …)
• MapReduce (data processing)
• Others (data processing)
• YARN (cluster resource management)
• HDFS2 (redundant, reliable storage)
Key Driver Of Hadoop Adoption: Enterprise Data Lake
Flexible: Enables other purpose-built data processing models beyond MapReduce (batch), such as interactive and streaming
Efficient: Double processing IN Hadoop on the same hardware while providing predictable performance & quality of service
Shared: Provides a stable, reliable, secure foundation and shared operational services across multiple workloads
Data Processing Engines Run Natively IN Hadoop
• BATCH: MapReduce
• INTERACTIVE: Tez
• STREAMING: Storm
• IN-MEMORY: Spark
• GRAPH: Giraph
• ONLINE: HBase, Accumulo
• OTHERS
HDFS: Redundant, Reliable Storage
YARN: Cluster Resource Management
5 Key Benefits of YARN
1. Scale
2. New Programming Models & Services
3. Improved Cluster Utilization
4. Agility
5. Beyond Java
YARN Platform Benefits
Deployment
YARN provides a seamless vehicle to deploy your software to an enterprise Hadoop cluster
Fault Tolerance
YARN handles failure tolerance for HW, OS, and JVM failures: it detects them, notifies the application, and provides default actions
YARN provides plugins for the app to define failure behavior
Scheduling (incorporating Data Locality)
YARN utilizes HDFS to schedule app processing where the data lives
YARN scheduling helps your apps finish within the SLA expected by your customers
A Brief History of YARN
Originally conceived & architected at Yahoo!
Arun Murthy created the original JIRA in 2008 and led the PMC
The team at Hortonworks has been working on YARN for 4 years
90% of code from Hortonworks & Yahoo!
YARN battle-tested at scale with Yahoo!
In production on 32,000+ nodes
YARN Released October 2013 with Apache Hadoop 2
YARN Development Framework
[Diagram: layered platform stack, bottom to top]
• HDFS (Hadoop Distributed File System), spanning nodes 1 … N
• YARN: Data Operating System
• Engines: Batch (MapReduce), Interactive (Tez), Real-Time (Slider), Direct (ISV Apps)
• Applications: Scripting (Pig), SQL (Hive), Java/Scala (Cascading), NoSQL (HBase, Accumulo), Stream (Storm), Others (Spark, ISV Apps), API (ISV Apps)
YARN Concepts
Apps on YARN: Categories
Framework / Engine
• Definition: Provides platform capabilities to enable data services and applications
• Examples: Twill, Reef, Tez, MapReduce, Spark
Service
• Definition: An application that runs continuously
• Examples: Storm, HBase, Memcached, etc.
Job
• Definition: A batch/iterative data processing job that runs on a Service or a Framework
• Examples: XML Parsing MR job; Mahout K-means algorithm
YARN App
• Definition: A temporal job or a service submitted to YARN
• Examples: HBase Cluster (service); MapReduce job
YARN Concepts: Container
Basic unit of allocation
Fine-grained resource allocation: memory, CPU, disk, network, GPU, etc.
• container_0 = 2GB, 1 CPU
• container_1 = 1GB, 6 CPU
Replaces the fixed map/reduce slots from Hadoop 1
Capability
• Memory, CPU
Container Request
• Capability, Host, Rack, Priority, relaxLocality
Container Launch Context
• LocalResources: resources needed to execute the container application
• Environment variables, for example the classpath
• Command to execute
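To illustrate why fine-grained containers can use a node better than fixed slots, here is a minimal sketch in Python. The node size, slot size, and task mix are made-up numbers, and the slot model is deliberately simplified (Hadoop 1 slots were also typed as map or reduce, which this toy ignores). It is not YARN code, only an arithmetic illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    memory_mb: int


def run_with_fixed_slots(tasks, node_memory_mb, slot_memory_mb):
    """Slot model: the node is pre-cut into equal slots; a task occupies whole slots."""
    free_slots = node_memory_mb // slot_memory_mb
    placed = []
    for task in tasks:
        needed = -(-task.memory_mb // slot_memory_mb)  # ceiling division
        if needed <= free_slots:
            free_slots -= needed
            placed.append(task.name)
    return placed


def run_with_containers(tasks, node_memory_mb):
    """Container model: each task gets exactly the memory it asks for."""
    free_mb = node_memory_mb
    placed = []
    for task in tasks:
        if task.memory_mb <= free_mb:
            free_mb -= task.memory_mb
            placed.append(task.name)
    return placed


# Hypothetical workload: two small and two large tasks on one 8 GB node.
tasks = [Task("a", 1024), Task("b", 3072), Task("c", 1024), Task("d", 3072)]
print(run_with_fixed_slots(tasks, 8192, 2048))
print(run_with_containers(tasks, 8192))
```

With 2 GB slots, the 1 GB tasks each waste half a slot and the 3 GB tasks round up to two slots, so the last task does not fit; with exact-size containers all four tasks fit on the same node.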
YARN Terminology
ResourceManager (RM): central agent
• Allocates & manages cluster resources
• Hierarchical queues
NodeManager (NM): per-node agent
• Manages, monitors and enforces node resource allocations
• Manages lifecycle of containers
User Application
• ApplicationMaster (AM): manages application lifecycle and task scheduling
• Container: executes application logic
• Client: submits the application
Launching the app
1. Client requests ResourceManager to launch ApplicationMaster Container
2. ApplicationMaster requests NodeManager to launch Application Containers
YARN Process Flow - Walkthrough
[Diagram: a ResourceManager with its Scheduler above a grid of twelve NodeManagers. Application 1 runs AM 1 with Containers 1.1, 1.2 and 1.3; application 2, submitted by Client2, runs AM2 with Containers 2.1, 2.2, 2.3 and 2.4. Each AM and container runs on a NodeManager.]
The YARN APIs
APIs Needed
Only three protocols
Client to ResourceManager
• Application submission
ApplicationMaster to ResourceManager
• Container allocation
ApplicationMaster to NodeManager
• Container launch
Use client libraries for all 3 actions
Package org.apache.hadoop.yarn.client.api
provides both synchronous and asynchronous libraries
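[Diagram: client libraries and the protocol each one wraps]
• Client to ResourceManager: YarnClient (Application Client Protocol)
• ApplicationMaster to ResourceManager: AMRMClient (Application Master Protocol)
• ApplicationMaster to NodeManager: NMClient (Container Management Protocol), which launches the App Container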
YARN – Implementation Outline
1. Write a Client to submit the application
2. Write an ApplicationMaster (well, copy & paste)
“DistributedShell is the new WordCount”
3. Get containers, run whatever you want!
YARN – Implementing Applications
What else do I need to know?
Resource Allocation & Usage
• ResourceRequest
• Container
• ContainerLaunchContext & LocalResource
ApplicationMaster
• ApplicationId
• ApplicationAttemptId
• ApplicationSubmissionContext
YARN – Resource Allocation & Usage
ResourceRequest
Fine-grained resource ask to the ResourceManager
Ask for a specific amount of resources (memory, CPU etc.) on a specific machine or rack
Use special value of * for resource name for any machine
ResourceRequest fields: priority, resourceName, capability, numContainers
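The shape of a ResourceRequest can be sketched as a small record. The following is a toy model in Python, not the YARN Java class: the field names follow the slide, while `Capability`, `matches`, and the host and rack names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

ANY = "*"  # special resource name meaning "any machine", as described on the slide


@dataclass(frozen=True)
class Capability:
    memory_mb: int
    vcores: int


@dataclass(frozen=True)
class ResourceRequest:
    priority: int
    resource_name: str      # a host name, a rack name, or ANY
    capability: Capability
    num_containers: int


def matches(request, host, rack):
    """True if a container on (host, rack) satisfies the request's location."""
    return request.resource_name in (ANY, host, rack)


# Hypothetical asks: three 2 GB containers on one host, and one anywhere.
on_host = ResourceRequest(1, "host-17", Capability(2048, 1), 3)
anywhere = ResourceRequest(1, ANY, Capability(2048, 1), 1)
print(matches(on_host, "host-17", "/rack-2"))   # location satisfied
print(matches(on_host, "host-18", "/rack-2"))   # different host
print(matches(anywhere, "host-18", "/rack-9"))  # "*" accepts any machine
```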
YARN – Resource Allocation & Usage
Container
The basic unit of allocation in YARN
The result of the ResourceRequest provided by ResourceManager to the ApplicationMaster
A specific amount of resources (CPU, memory etc.) on a specific machine
Container fields: containerId, resourceName, capability, tokens
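How a request turns into containers can be sketched with a toy scheduler. This is an assumption-laden illustration in Python: `ToyScheduler`, `Node`, and the first-fit placement are invented here, and a real scheduler also weighs queues, priorities, and locality. The point it shows is that each granted container is a fixed amount of resources on one specific machine, and that a single call may grant fewer containers than were asked for.

```python
from dataclasses import dataclass


@dataclass
class Node:
    host: str
    free_memory_mb: int
    free_vcores: int


@dataclass(frozen=True)
class Container:
    container_id: int
    host: str
    memory_mb: int
    vcores: int


class ToyScheduler:
    """First-fit placement over a list of nodes (hypothetical, for illustration)."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._next_id = 1

    def allocate(self, resource_name, memory_mb, vcores, num_containers):
        granted = []
        for node in self.nodes:
            if resource_name not in ("*", node.host):
                continue  # request is pinned to a different host
            while (len(granted) < num_containers
                   and node.free_memory_mb >= memory_mb
                   and node.free_vcores >= vcores):
                node.free_memory_mb -= memory_mb
                node.free_vcores -= vcores
                granted.append(Container(self._next_id, node.host, memory_mb, vcores))
                self._next_id += 1
        return granted


scheduler = ToyScheduler([Node("n1", 4096, 4), Node("n2", 2048, 2)])
granted = scheduler.allocate("*", 2048, 1, 4)  # ask for four, cluster fits three
print([(c.container_id, c.host) for c in granted])
```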
YARN – Resource Allocation & Usage
ContainerLaunchContext & LocalResource
The context provided by ApplicationMaster to NodeManager to launch the Container
Complete specification for a process
LocalResource is used to specify container binary and dependencies
• NodeManager is responsible for downloading from shared namespace (typically HDFS)
ContainerLaunchContext fields: container, commands, environment, localResources
LocalResource fields: uri, type
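A launch context is essentially a complete recipe for one process. The sketch below models it in Python with field names taken from the slide; `plan_launch`, the HDFS path, and the worker command are hypothetical, and the real NodeManager does considerably more (permissions, caching, log setup).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LocalResource:
    uri: str            # where the file is fetched from, typically HDFS
    resource_type: str  # for example "FILE" or "ARCHIVE"


@dataclass
class ContainerLaunchContext:
    commands: list
    environment: dict = field(default_factory=dict)
    local_resources: dict = field(default_factory=dict)  # link name -> LocalResource


def plan_launch(ctx):
    """List, in order, what a NodeManager-like agent would do with the context."""
    steps = [f"download {res.uri} as {name} ({res.resource_type})"
             for name, res in sorted(ctx.local_resources.items())]
    steps += [f"export {key}={value}" for key, value in sorted(ctx.environment.items())]
    steps += [f"exec {command}" for command in ctx.commands]
    return steps


# Hypothetical worker: one jar from HDFS, a classpath, and a command.
ctx = ContainerLaunchContext(
    commands=["java MyWorker"],
    environment={"CLASSPATH": "./*"},
    local_resources={"worker.jar": LocalResource("hdfs:///apps/demo/worker.jar", "FILE")},
)
for step in plan_launch(ctx):
    print(step)
```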
The ApplicationMaster
The per-application controller aka container_0
The parent for all containers of the application
ApplicationMaster negotiates its containers from ResourceManager
ApplicationMaster container is child of ResourceManager
Think init process in Unix
RM restarts the ApplicationMaster attempt if required (unique ApplicationAttemptId)
Code for application is submitted along with Application itself
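The restart behaviour can be sketched as a retry loop in which every attempt gets its own identifier. This is a toy model in Python: `run_with_restarts` and `launch_am` are hypothetical names, the attempt limit is a parameter chosen by the caller, and the identifier layout only imitates the style of attempt IDs seen in YARN logs rather than reproducing the real class.

```python
def attempt_id(cluster_timestamp, app_number, attempt_number):
    """Build an illustrative attempt identifier; the layout is an assumption."""
    return f"appattempt_{cluster_timestamp}_{app_number:04d}_{attempt_number:06d}"


def run_with_restarts(launch_am, cluster_timestamp, app_number, max_attempts):
    """Call launch_am(attempt_id) until it returns True or attempts run out.

    Returns (succeeded, list of attempt ids that were used).
    """
    used = []
    for number in range(1, max_attempts + 1):
        current = attempt_id(cluster_timestamp, app_number, number)
        used.append(current)
        if launch_am(current):
            return True, used
    return False, used


# Hypothetical run: the first ApplicationMaster attempt fails, the second succeeds.
outcomes = iter([False, True])
succeeded, attempts = run_with_restarts(lambda _id: next(outcomes), 1406190000000, 7, 3)
print(succeeded, attempts)
```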
ApplicationSubmissionContext
ApplicationSubmissionContext is the complete specification of the ApplicationMaster
Provided by the Client
ResourceManager responsible for allocating and launching the ApplicationMaster container
ApplicationSubmissionContext fields: resourceRequest, containerLaunchContext, appName, queue
YARN Application API - Overview
hadoop-yarn-client module
YarnClient is submission client API
Both synchronous & asynchronous APIs for resource allocation and container start/stop
• Synchronous: AMRMClient & NMClient
• Asynchronous: AMRMClientAsync & NMClientAsync
YARN Application API – YarnClient
createApplication to create application
submitApplication to start application
Application developer provides ApplicationSubmissionContext
APIs to get other information from ResourceManager
getAllQueues
getApplications
getNodeReports
APIs to manipulate submitted application e.g. killApplication
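The two-step submission flow (create, then submit) can be sketched with an in-memory stand-in for the ResourceManager. Everything below is a toy in Python: `ToyResourceManager`, its method names, the ID layout, and the state strings are assumptions that only mirror the operations named on the slide, not the YarnClient API itself.

```python
class ToyResourceManager:
    """In-memory stand-in for the ResourceManager (hypothetical, for illustration)."""

    def __init__(self):
        self._next = 1
        self.apps = {}  # application id -> details

    def create_application(self):
        """Step 1: reserve a new application id."""
        app_id = f"application_{self._next:04d}"
        self._next += 1
        self.apps[app_id] = {"name": None, "queue": None, "state": "NEW"}
        return app_id

    def submit_application(self, app_id, app_name, queue, am_command):
        """Step 2: hand over the submission context describing the ApplicationMaster."""
        self.apps[app_id].update(
            name=app_name, queue=queue, am_command=am_command, state="SUBMITTED")

    def kill_application(self, app_id):
        self.apps[app_id]["state"] = "KILLED"

    def get_applications(self, state=None):
        """Return application ids, optionally filtered by state."""
        return sorted(a for a, info in self.apps.items()
                      if state in (None, info["state"]))


rm = ToyResourceManager()
first = rm.create_application()
rm.submit_application(first, "demo", "default", "java MyApplicationMaster")
second = rm.create_application()
rm.kill_application(second)
print(rm.get_applications("SUBMITTED"), rm.get_applications())
```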
YARN Application API – The Client
[Diagram: the cluster from the process flow walkthrough, with Client2 talking to the ResourceManager (Scheduler) in two steps]
1. New Application Request: YarnClient.createApplication
2. Submit Application: YarnClient.submitApplication
AppMaster-ResourceManager API
AMRMClient - Synchronous API
registerApplicationMaster
unregisterApplicationMaster
Resource negotiation
addContainerRequest
removeContainerRequest
releaseAssignedContainer
Main API – allocate
Helper APIs for cluster information
getAvailableResources
getClusterNodeCount
AMRMClientAsync – Asynchronous
Extension of AMRMClient to provide asynchronous CallbackHandler
Callback interaction model with ResourceManager
• onContainersAllocated
• onContainersCompleted
• onNodesUpdated
• onError
• onShutdownRequest
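The callback interaction model can be sketched as a handler object that an asynchronous client invokes whenever a heartbeat response arrives. This is a toy in Python: the callback names are snake_case versions of those on the slide, while `RecordingHandler`, `heartbeat`, and the dictionary-shaped response are invented for illustration.

```python
class RecordingHandler:
    """Plays the role of the callback handler; tracks running and finished containers."""

    def __init__(self):
        self.running = set()
        self.finished = {}  # container id -> exit code

    def on_containers_allocated(self, container_ids):
        self.running.update(container_ids)

    def on_containers_completed(self, statuses):
        for container_id, exit_code in statuses:
            self.running.discard(container_id)
            self.finished[container_id] = exit_code

    def on_error(self, error):
        raise error


def heartbeat(handler, response):
    """Dispatch one allocate-style response to the handler, as an async client would."""
    if response.get("allocated"):
        handler.on_containers_allocated(response["allocated"])
    if response.get("completed"):
        handler.on_containers_completed(response["completed"])


handler = RecordingHandler()
heartbeat(handler, {"allocated": ["c1", "c2"]})
heartbeat(handler, {"completed": [("c1", 0)]})
print(handler.running, handler.finished)
```

The application code never polls; it only reacts to events, which is the main design difference from the synchronous allocate loop.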
AppMaster-ResourceManager flow
[Diagram: the AM talks to the ResourceManager (Scheduler); containers run on the NodeManagers]
1. registerApplicationMaster
2. AMRMClient.allocate
3. Container (returned by the ResourceManager)
4. unregisterApplicationMaster
AppMaster-NodeManager API
For AM to launch/stop containers at NodeManager
NMClient - Synchronous API
Simple (trivial) APIs
• startContainer
• stopContainer
• getContainerStatus
NMClientAsync – Asynchronous
Simple (trivial) APIs
• startContainerAsync
• stopContainerAsync
• getContainerStatusAsync
Callback interaction model with NodeManager
• onContainerStarted
• onContainerStopped
• onStartContainerError
• onContainerStatusReceived
YARN Application API - Development
Un-Managed Mode for ApplicationMaster
Run the ApplicationMaster on your development machine rather than in-cluster
• No submission client needed
Use hadoop-yarn-applications-unmanaged-am-launcher
Easier to step through debugger, browse logs etc.
$ bin/hadoop jar hadoop-yarn-applications-unmanaged-am-launcher.jar \
    Client \
    -jar my-application-master.jar \
    -cmd 'java MyApplicationMaster <args>'
A Simple YARN Application
Simplest example of a YARN application: get n containers, and run a specific Unix command on each. Minimal error handling, etc.
Control Flow
1. User submits application to the Resource Manager
• Client provides ApplicationSubmissionContext to the Resource Manager
2. App Master negotiates with Resource Manager for n containers
3. App Master launches containers with the user-specified command as ContainerLaunchContext.commands
Code: https://github.com/hortonworks/simple-yarn-app
Simple YARN Application – Client
[Code listing not captured; its callouts, in order:]
• Command to launch ApplicationMaster process
• Resources required for ApplicationMaster container
• ApplicationSubmissionContext for ApplicationMaster
• Submit application to ResourceManager
Simple YARN Application – AppMaster
Steps:
1. AMRMClient.registerApplicationMaster
2. Negotiate containers from ResourceManager by providing ContainerRequest to AMRMClient.addContainerRequest
3. Take the resultant Container returned via subsequent call to AMRMClient.allocate, build ContainerLaunchContext with Container and commands, then launch them using NMClient.startContainer
• Use LocalResources to specify software/configuration dependencies for each worker container
4. Wait till done: check AllocateResponse.getCompletedContainersStatuses from subsequent calls to AMRMClient.allocate
5. AMRMClient.unregisterApplicationMaster
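The five steps above can be sketched as a single loop. The following is a self-contained simulation in Python, not the real client libraries: `ToyCluster` stands in for both the ResourceManager and the NodeManagers, it grants at most one container per allocate call, and it pretends every launched container finishes by the next call. All names are hypothetical.

```python
class ToyCluster:
    """Stand-in for the ResourceManager and NodeManagers (illustrative only)."""

    def __init__(self):
        self.pending = 0
        self.running = []
        self.log = []
        self._next_id = 1

    def register(self):
        self.log.append("register")

    def unregister(self):
        self.log.append("unregister")

    def add_container_request(self):
        self.pending += 1

    def allocate(self):
        """Return (newly allocated ids, completed ids) for this heartbeat."""
        completed, self.running = self.running, []  # last round's containers finish
        allocated = []
        if self.pending:
            self.pending -= 1
            allocated.append(self._next_id)
            self._next_id += 1
        return allocated, completed

    def start_container(self, container_id, command):
        self.log.append(f"start {container_id}: {command}")
        self.running.append(container_id)


def run_app_master(cluster, command, n):
    cluster.register()                          # step 1: register
    for _ in range(n):                          # step 2: ask for n containers
        cluster.add_container_request()
    done = 0
    while done < n:
        allocated, completed = cluster.allocate()
        for container_id in allocated:          # step 3: launch what was granted
            cluster.start_container(container_id, command)
        done += len(completed)                  # step 4: count completions
    cluster.unregister()                        # step 5: unregister
    return done


cluster = ToyCluster()
finished = run_app_master(cluster, "date", 2)
print(finished, cluster.log)
```

Note that allocation and completion arrive through the same repeated allocate call, which is why the real ApplicationMaster keeps calling it even after all requests have been granted.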
Simple YARN Application – AppMaster
[Code listings not captured; their callouts, in order:]
• Initialize clients to ResourceManager and NodeManagers
• Register with ResourceManager
• Set up requirements for worker containers
• Make resource requests to ResourceManager
• Get containers from ResourceManager
• Launch containers on NodeManagers
• Wait for containers to complete successfully
• Un-register with ResourceManager
Graduating from simple-yarn-app
DistributedShell. Same functionality but less simple
e.g. error checking, use of timeline server
For a complex YARN app, see Tez
Pre-warmed containers, sessions, etc.
Look at MapReduce for even more excitement
Data locality, fault tolerance, checkpoint to HDFS, security, isolation, etc
Intra-application priorities (maps vs reduces) need complex feedback from ResourceManager
(all at apache.org)
Application Timeline Server
Maintains historical state & provides metrics visibility for YARN apps
Similar to MapReduce Job History Server
Information can be queried via REST APIs
ATS in HDP 2.1 is considered a Tech Preview
Generic information
• queue name
• user information
• information about application attempts
• a list of Containers that were run under each application attempt
• information about each Container
Per-framework/application info
Developers can publish information to the Timeline Server via the TimelineClient (from within a client), the ApplicationMaster, or the application's Containers.
Application Timeline Server
[Diagram: the App Timeline Server connected to AMBARI, Custom App Monitoring, and a Client]
Next Steps
hortonworks.com/get-started/YARN
Set up HDP 2.1 environment
Leverage Sandbox
Review Sample Code & Execute Simple YARN Application
https://github.com/hortonworks/simple-yarn-app
Graduate to more complex code examples
BUILD FLEXIBLE, SCALABLE, RESILIENT & POWERFUL APPLICATIONS TO RUN IN HADOOP
Hortonworks YARN Resources
Hortonworks Web Site
hortonworks.com/hadoop/yarn
Includes links to blog posts
YARN Forum
Community of Hadoop YARN developers – collaboration and Q&A
hortonworks.com/community/forums/forum/yarn
YARN Office Hours
Dial in and chat with YARN experts
Next Office Hour: Thursday August 14 @ 10-11am PDT. Register:
https://hortonworks.webex.com/hortonworks/onstage/g.php?t=a&d=628190636
And from Hortonworks University
Hortonworks Course: Developing Custom YARN Applications
Format: Online
Duration: 2 Days
When: Aug 18th & 19th (Mon & Tues)
Cost: No Charge to Hortonworks Technical Partners
Space: Very Limited
Interested? Please contact lsensmeier@hortonworks.com
Stay in Touch!
Join us for the full series of YARN development webinars:
YARN Native July 24 @ 9am PT (recording link)
Slider August 7 @ 9am PT (registration link)
Tez August 21 @ 9am PT (registration link)
Additional webinar topics are being added – watch the blog or visit
Hortonworks.com/webinars
http://hortonworks.com/hadoop/yarn

More Related Content

What's hot

Azure App Service
Azure App ServiceAzure App Service
Azure App ServiceBizTalk360
 
2- Configuration des référentiels ODI 11
2- Configuration des référentiels ODI 112- Configuration des référentiels ODI 11
2- Configuration des référentiels ODI 11samr
 
Microsoft Azure Traffic Manager
Microsoft Azure Traffic ManagerMicrosoft Azure Traffic Manager
Microsoft Azure Traffic ManagerIdo Katz
 
Learn Oracle WebLogic Server 12c Administration
Learn Oracle WebLogic Server 12c AdministrationLearn Oracle WebLogic Server 12c Administration
Learn Oracle WebLogic Server 12c AdministrationRevelation Technologies
 
Azure IAAS architecture with High Availability for beginners and developers -...
Azure IAAS architecture with High Availability for beginners and developers -...Azure IAAS architecture with High Availability for beginners and developers -...
Azure IAAS architecture with High Availability for beginners and developers -...Malleswar Reddy
 
Websphere Application Server V8.5
Websphere Application Server V8.5Websphere Application Server V8.5
Websphere Application Server V8.5IBM WebSphereIndia
 
Working with PowerVC via its REST APIs
Working with PowerVC via its REST APIsWorking with PowerVC via its REST APIs
Working with PowerVC via its REST APIsJoe Cropper
 
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaPig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaEdureka!
 
PUC SE Day 2019 - SpringBoot
PUC SE Day 2019 - SpringBootPUC SE Day 2019 - SpringBoot
PUC SE Day 2019 - SpringBootJosué Neis
 
SQLite in Flutter.pptx
SQLite in Flutter.pptxSQLite in Flutter.pptx
SQLite in Flutter.pptxNabin Dhakal
 
Building large scale applications in yarn with apache twill
Building large scale applications in yarn with apache twillBuilding large scale applications in yarn with apache twill
Building large scale applications in yarn with apache twillHenry Saputra
 
Asp.Net Core MVC with Entity Framework
Asp.Net Core MVC with Entity FrameworkAsp.Net Core MVC with Entity Framework
Asp.Net Core MVC with Entity FrameworkShravan A
 
Sap Netweaver Portal
Sap Netweaver PortalSap Netweaver Portal
Sap Netweaver PortalSaba Ameer
 
Springboot introduction
Springboot introductionSpringboot introduction
Springboot introductionSagar Verma
 

What's hot (20)

Azure Backup Simplifies
Azure Backup SimplifiesAzure Backup Simplifies
Azure Backup Simplifies
 
Introduction to Apache Synapse
Introduction to Apache SynapseIntroduction to Apache Synapse
Introduction to Apache Synapse
 
Azure App Service
Azure App ServiceAzure App Service
Azure App Service
 
2- Configuration des référentiels ODI 11
2- Configuration des référentiels ODI 112- Configuration des référentiels ODI 11
2- Configuration des référentiels ODI 11
 
Microsoft Azure Traffic Manager
Microsoft Azure Traffic ManagerMicrosoft Azure Traffic Manager
Microsoft Azure Traffic Manager
 
Learn Oracle WebLogic Server 12c Administration
Learn Oracle WebLogic Server 12c AdministrationLearn Oracle WebLogic Server 12c Administration
Learn Oracle WebLogic Server 12c Administration
 
Azure IAAS architecture with High Availability for beginners and developers -...
Azure IAAS architecture with High Availability for beginners and developers -...Azure IAAS architecture with High Availability for beginners and developers -...
Azure IAAS architecture with High Availability for beginners and developers -...
 
Websphere Application Server V8.5
Websphere Application Server V8.5Websphere Application Server V8.5
Websphere Application Server V8.5
 
Working with PowerVC via its REST APIs
Working with PowerVC via its REST APIsWorking with PowerVC via its REST APIs
Working with PowerVC via its REST APIs
 
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaPig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
 
Nutanix basic
Nutanix basicNutanix basic
Nutanix basic
 
PUC SE Day 2019 - SpringBoot
PUC SE Day 2019 - SpringBootPUC SE Day 2019 - SpringBoot
PUC SE Day 2019 - SpringBoot
 
Apache tomcat
Apache tomcatApache tomcat
Apache tomcat
 
SQLite in Flutter.pptx
SQLite in Flutter.pptxSQLite in Flutter.pptx
SQLite in Flutter.pptx
 
Building large scale applications in yarn with apache twill
Building large scale applications in yarn with apache twillBuilding large scale applications in yarn with apache twill
Building large scale applications in yarn with apache twill
 
Asp.Net Core MVC with Entity Framework
Asp.Net Core MVC with Entity FrameworkAsp.Net Core MVC with Entity Framework
Asp.Net Core MVC with Entity Framework
 
Spring Security 5
Spring Security 5Spring Security 5
Spring Security 5
 
Spring boot
Spring bootSpring boot
Spring boot
 
Sap Netweaver Portal
Sap Netweaver PortalSap Netweaver Portal
Sap Netweaver Portal
 
Springboot introduction
Springboot introductionSpringboot introduction
Springboot introduction
 

Viewers also liked

Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN ApplicationsHortonworks
 
Harnessing the power of YARN with Apache Twill
Harnessing the power of YARN with Apache TwillHarnessing the power of YARN with Apache Twill
Harnessing the power of YARN with Apache TwillTerence Yim
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsHortonworks
 
Writing app framworks for hadoop on yarn
Writing app framworks for hadoop on yarnWriting app framworks for hadoop on yarn
Writing app framworks for hadoop on yarnDataWorks Summit
 
Apache REEF - stdlib for big data
Apache REEF - stdlib for big dataApache REEF - stdlib for big data
Apache REEF - stdlib for big dataSergiy Matusevych
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
YARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarYARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarHortonworks
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextHortonworks
 
Hortonworks Technical Workshop - build a yarn ready application with apache ...
Hortonworks Technical Workshop -  build a yarn ready application with apache ...Hortonworks Technical Workshop -  build a yarn ready application with apache ...
Hortonworks Technical Workshop - build a yarn ready application with apache ...Hortonworks
 
Dynamic Allocation in Spark
Dynamic Allocation in SparkDynamic Allocation in Spark
Dynamic Allocation in SparkDatabricks
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSHortonworks
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopHortonworks
 
Dynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARNDynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARNTsuyoshi OZAWA
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopHortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Hortonworks
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchHortonworks
 
Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Hortonworks
 

Viewers also liked (20)

Get Started Building YARN Applications
Get Started Building YARN ApplicationsGet Started Building YARN Applications
Get Started Building YARN Applications
 
Harnessing the power of YARN with Apache Twill
Harnessing the power of YARN with Apache TwillHarnessing the power of YARN with Apache Twill
Harnessing the power of YARN with Apache Twill
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
 
A Multi Colored YARN
A Multi Colored YARNA Multi Colored YARN
A Multi Colored YARN
 
Writing app framworks for hadoop on yarn
Writing app framworks for hadoop on yarnWriting app framworks for hadoop on yarn
Writing app framworks for hadoop on yarn
 
Apache REEF - stdlib for big data
Apache REEF - stdlib for big dataApache REEF - stdlib for big data
Apache REEF - stdlib for big data
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
YARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarYARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider Webinar
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
Hortonworks Technical Workshop - build a yarn ready application with apache ...
Hortonworks Technical Workshop -  build a yarn ready application with apache ...Hortonworks Technical Workshop -  build a yarn ready application with apache ...
Hortonworks Technical Workshop - build a yarn ready application with apache ...
 
Dynamic Allocation in Spark
Dynamic Allocation in SparkDynamic Allocation in Spark
Dynamic Allocation in Spark
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
 
Dynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARNDynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARN
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25
 







Developing YARN Applications - Integrating natively to YARN July 24 2014

  • 1. Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Developing YARN Native Applications Arun Murthy – Architect / Founder Bob Page – VP Partner Products
  • 2. Topics
      – Hadoop 2 and YARN: Beyond Batch
      – YARN: The Hadoop Resource Manager
          • YARN Concepts and Terminology
          • The YARN APIs
          • A Simple YARN application
          • The Application Timeline Server
      – Next Steps
  • 3. Hadoop 2 and YARN: Beyond Batch
  • 4. Hadoop 2.0: From Batch-only to Multi-Workload
      – Hadoop 1.0 (single-use system, batch apps): MapReduce (cluster resource management & data processing) on HDFS (redundant, reliable storage)
      – Hadoop 2.0 (multi-purpose platform: batch, interactive, online, streaming, …): MapReduce (data processing) and others (data processing) on YARN (cluster resource management) on HDFS2 (redundant, reliable storage)
  • 5. Key Driver of Hadoop Adoption: Enterprise Data Lake
      – Flexible: Enables other purpose-built data processing models beyond MapReduce (batch), such as interactive and streaming
      – Efficient: Double processing IN Hadoop on the same hardware while providing predictable performance & quality of service
      – Shared: Provides a stable, reliable, secure foundation and shared operational services across multiple workloads
      – Data processing engines run natively IN Hadoop: batch (MapReduce), interactive (Tez), streaming (Storm), in-memory (Spark), graph (Giraph), online (HBase, Accumulo), others
      – HDFS: Redundant, Reliable Storage; YARN: Cluster Resource Management
  • 6. 5 Key Benefits of YARN
      1. Scale
      2. New Programming Models & Services
      3. Improved Cluster Utilization
      4. Agility
      5. Beyond Java
  • 7. YARN Platform Benefits
      – Deployment: YARN provides a seamless vehicle to deploy your software to an enterprise Hadoop cluster
      – Fault Tolerance: YARN ‘handles’ HW, OS, and JVM failures (detects them, notifies the app, and provides default actions); YARN provides plugins for the app to define failure behavior
      – Scheduling (incorporating Data Locality): YARN utilizes HDFS to schedule app processing where the data lives; YARN ensures that your apps finish in the SLA expected by your customers
  • 8. A Brief History of YARN
      – Originally conceived & architected at Yahoo!
      – Arun Murthy created the original JIRA in 2008 and led the PMC
      – The team at Hortonworks has been working on YARN for 4 years
      – 90% of code from Hortonworks & Yahoo!
      – YARN battle-tested at scale with Yahoo!: in production on 32,000+ nodes
      – YARN released October 2013 with Apache Hadoop 2
  • 9. YARN Development Framework [diagram: layered stack over cluster nodes 1 … N]
      – Layer labels: System, Engine, API, Applications
      – System: HDFS (Hadoop Distributed File System); YARN: Data Operating System
      – Diagram labels above YARN: Batch (MapReduce), Interactive (Tez), Real-Time (Slider), Direct, Scripting (Pig), SQL (Hive), Cascading (Java, Scala), NoSQL (HBase, Accumulo), Stream (Storm), Others (Spark), ISV Apps
  • 10. YARN Concepts
  • 11. Apps on YARN: Categories (Type: Definition. Examples)
      – Framework / Engine: Provides platform capabilities to enable data services and applications. Examples: Twill, Reef, Tez, MapReduce, Spark
      – Service: An application that runs continuously. Examples: Storm, HBase, Memcached, etc.
      – Job: A batch/iterative data processing job that runs on a Service or a Framework. Examples: XML Parsing MR job, Mahout K-means algorithm
      – YARN App: A temporal job or a service submitted to YARN. Examples: HBase Cluster (service), MapReduce job
  • 12. YARN Concepts: Container
      – Basic unit of allocation
      – Fine-grained resource allocation: memory, CPU, disk, network, GPU, etc.
          • container_0 = 2GB, 1 CPU
          • container_1 = 1GB, 6 CPU
      – Replaces the fixed map/reduce slots from Hadoop 1
      – Capability: Memory, CPU
      – Container Request: Capability, Host, Rack, Priority, relaxLocality
      – Container Launch Context
          • LocalResources: resources needed to execute the container application
          • Environment variables (example: classpath)
          • Command to execute
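To make the contrast with fixed map/reduce slots concrete, here is a minimal Java sketch of a node handing out containers of different shapes until memory or CPU runs out. The classes `Capability` and `NodeSketch` are hypothetical stand-ins written for this illustration, not Hadoop's own `Resource` or NodeManager classes, and the 8 GB / 8 core node size is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a resource capability; NOT Hadoop's Resource class.
final class Capability {
    final int memoryMb;
    final int vcores;

    Capability(int memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }
}

// Hypothetical node that grants containers while both memory and CPU remain.
final class NodeSketch {
    private int freeMemoryMb;
    private int freeVcores;
    private final List<Capability> containers = new ArrayList<>();

    NodeSketch(int memoryMb, int vcores) {
        this.freeMemoryMb = memoryMb;
        this.freeVcores = vcores;
    }

    // Reserves the requested capability if it fits; otherwise changes nothing.
    boolean tryAllocate(Capability request) {
        if (request.memoryMb > freeMemoryMb || request.vcores > freeVcores) {
            return false;
        }
        freeMemoryMb -= request.memoryMb;
        freeVcores -= request.vcores;
        containers.add(request);
        return true;
    }

    int freeMemoryMb() { return freeMemoryMb; }
    int freeVcores() { return freeVcores; }
    int containerCount() { return containers.size(); }
}
```

On an assumed 8192 MB, 8-core node, the slide's container_0 (2 GB, 1 CPU) and container_1 (1 GB, 6 CPU) both fit, leaving 5120 MB and 1 core; a further request for 2 cores would be refused even though memory is still free.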
  • 13. YARN Terminology
      – ResourceManager (RM), the central agent
          • Allocates & manages cluster resources
          • Hierarchical queues
      – NodeManager (NM), the per-node agent
          • Manages, monitors and enforces node resource allocations
          • Manages lifecycle of containers
      – User Application
          • ApplicationMaster (AM): manages application lifecycle and task scheduling
          • Container: executes application logic
          • Client: submits the application
      – Launching the app
          1. Client requests ResourceManager to launch ApplicationMaster Container
          2. ApplicationMaster requests NodeManager to launch Application Containers
  • 14. YARN Process Flow - Walkthrough [diagram: a ResourceManager with its Scheduler and a grid of NodeManagers; Client2 talks to the ResourceManager; AM 1 with Containers 1.1–1.3 and AM2 with Containers 2.1–2.4 run on the NodeManagers]
  • 15. The YARN APIs
  • 16. APIs Needed
      – Only three protocols
          • Client to ResourceManager: application submission
          • ApplicationMaster to ResourceManager: container allocation
          • ApplicationMaster to NodeManager: container launch
      – Use client libraries for all 3 actions
          • Package org.apache.hadoop.yarn.client.api provides both synchronous and asynchronous libraries
      – [diagram] Client to ResourceManager: YarnClient (Application Client Protocol); ApplicationMaster to ResourceManager: AMRMClient (Application Master Protocol); ApplicationMaster to NodeManager: NMClient (Container Management Protocol); App Containers run on the NodeManagers
  • 17. YARN – Implementation Outline
      1. Write a Client to submit the application
      2. Write an ApplicationMaster (well, copy & paste): “DistributedShell is the new WordCount”
      3. Get containers, run whatever you want!
  • 18. YARN – Implementing Applications: What else do I need to know?
      – Resource Allocation & Usage
          • ResourceRequest
          • Container
          • ContainerLaunchContext & LocalResource
      – ApplicationMaster
          • ApplicationId
          • ApplicationAttemptId
          • ApplicationSubmissionContext
  • 19. YARN – Resource Allocation & Usage: ResourceRequest
      Fine-grained resource ask to the ResourceManager
      Ask for a specific amount of resources (memory, CPU etc.) on a specific machine or rack
      Use the special value * as the resource name to accept any machine
      ResourceRequest fields: priority, resourceName, capability, numContainers
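A minimal Java sketch of the four fields named above, assuming plain stand-in types rather than the Hadoop classes. `ResourceAsk` and `acceptsPlacement` are hypothetical names invented for illustration, and the check models only the meaning of resourceName (host, rack, or `*`), not the scheduler's actual matching logic.

```java
/** Hypothetical stand-in for the ResourceRequest fields on the slide; not the YARN class. */
final class ResourceAsk {
    static final String ANY = "*"; // special resource name meaning "any machine"

    final int priority;
    final String resourceName; // a host name, a rack name, or ANY
    final int memoryMb;        // capability: memory
    final int vcores;          // capability: CPU
    final int numContainers;

    ResourceAsk(int priority, String resourceName, int memoryMb, int vcores, int numContainers) {
        this.priority = priority;
        this.resourceName = resourceName;
        this.memoryMb = memoryMb;
        this.vcores = vcores;
        this.numContainers = numContainers;
    }

    /** True if a container on the given host and rack would satisfy the placement part of the ask. */
    boolean acceptsPlacement(String host, String rack) {
        return ANY.equals(resourceName)
            || resourceName.equals(host)
            || resourceName.equals(rack);
    }
}
```

An ask built with `ResourceAsk.ANY` accepts every placement, while one naming a host or rack accepts only that host or rack.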
  • 20. YARN – Resource Allocation & Usage: Container
      The basic unit of allocation in YARN
      The result of a ResourceRequest, provided by the ResourceManager to the ApplicationMaster
      A specific amount of resources (CPU, memory etc.) on a specific machine
      Container fields: containerId, resourceName, capability, tokens
  • 21. YARN – Resource Allocation & Usage: ContainerLaunchContext & LocalResource
      The context provided by the ApplicationMaster to the NodeManager to launch the Container
      Complete specification for a process
      LocalResource is used to specify the container binary and dependencies
        – NodeManager is responsible for downloading them from a shared namespace (typically HDFS)
      ContainerLaunchContext fields: container, commands, environment, localResources
      LocalResource fields: uri, type
  • 22. The ApplicationMaster
      The per-application controller, aka container_0
      The parent for all containers of the application
        – ApplicationMaster negotiates its containers from the ResourceManager
      The ApplicationMaster container is a child of the ResourceManager
        – Think of the init process in Unix
        – RM restarts the ApplicationMaster attempt if required (unique ApplicationAttemptId)
      Code for the application is submitted along with the application itself
  • 23. ApplicationSubmissionContext
      ApplicationSubmissionContext is the complete specification of the ApplicationMaster
        – Provided by the Client
      ResourceManager is responsible for allocating and launching the ApplicationMaster container
      ApplicationSubmissionContext fields: resourceRequest, containerLaunchContext, appName, queue
  • 24. YARN Application API - Overview
      hadoop-yarn-client module
      YarnClient is the submission client API
      Both synchronous and asynchronous APIs for resource allocation and container start/stop
        – Synchronous: AMRMClient & NMClient
        – Asynchronous: AMRMClientAsync & NMClientAsync
  • 25. YARN Application API – YarnClient
      createApplication to create an application
      submitApplication to start an application
        – Application developer provides the ApplicationSubmissionContext
      APIs to get other information from the ResourceManager
        – getAllQueues
        – getApplications
        – getNodeReports
      APIs to manipulate a submitted application, e.g. killApplication
  • 26. YARN Application API – The Client
      [Diagram: Client2 talking to the ResourceManager (Scheduler) above the grid of NodeManagers, AMs and Containers]
        1. New Application Request: YarnClient.createApplication
        2. Submit Application: YarnClient.submitApplication
  • 27. AppMaster-ResourceManager API
      AMRMClient: synchronous API
        – registerApplicationMaster, unregisterApplicationMaster
        – Resource negotiation: addContainerRequest, removeContainerRequest, releaseAssignedContainer
        – Main API: allocate
        – Helper APIs for cluster information: getAvailableResources, getClusterNodeCount
      AMRMClientAsync: asynchronous
        – Extension of AMRMClient to provide an asynchronous CallbackHandler
        – Callback interaction model with the ResourceManager: onContainersAllocated, onContainersCompleted, onNodesUpdated, onError, onShutdownRequest
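The callback model can be pictured with a small stand-alone Java sketch. The interface below is a hypothetical two-method stand-in that borrows two callback names from the slide; it takes plain string IDs and is not the AMRMClientAsync.CallbackHandler interface itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified callback interface; the real handler receives YARN record types. */
interface AllocationCallbacks {
    void onContainersAllocated(List<String> containerIds);
    void onContainersCompleted(List<String> containerIds);
}

/** Tracks running and completed containers as callbacks arrive. */
final class TrackingHandler implements AllocationCallbacks {
    final List<String> running = new ArrayList<>();
    int completed = 0;

    @Override
    public void onContainersAllocated(List<String> containerIds) {
        // In a real ApplicationMaster this is where containers would be launched.
        running.addAll(containerIds);
    }

    @Override
    public void onContainersCompleted(List<String> containerIds) {
        running.removeAll(containerIds);
        completed += containerIds.size();
    }

    /** True once the expected number of containers has finished. */
    boolean isDone(int expected) {
        return completed >= expected;
    }
}
```

The design point is that the ApplicationMaster reacts to events instead of polling allocate itself; the asynchronous client drives the heartbeat and invokes the handler.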
  • 28. AppMaster-ResourceManager flow
      [Diagram: the AM on a NodeManager exchanging calls with the ResourceManager (Scheduler)]
        1. registerApplicationMaster
        2. AMRMClient.allocate
        3. Container
        4. unregisterApplicationMaster
  • 29. AppMaster-NodeManager API
      For the AM to launch/stop containers at a NodeManager
      NMClient: synchronous API
        – Simple (trivial) APIs: startContainer, stopContainer, getContainerStatus
      NMClientAsync: asynchronous
        – Simple (trivial) APIs: startContainerAsync, stopContainerAsync, getContainerStatusAsync
        – Callback interaction model with the NodeManager: onContainerStarted, onContainerStopped, onStartContainerError, onContainerStatusReceived
  • 30. YARN Application API - Development
      Un-Managed Mode for ApplicationMaster
      Run the ApplicationMaster on your development machine rather than in-cluster
        – No submission client needed
      Use hadoop-yarn-applications-unmanaged-am-launcher
      Easier to step through in a debugger, browse logs etc.
      $ bin/hadoop jar hadoop-yarn-applications-unmanaged-am-launcher.jar Client -jar my-application-master.jar -cmd 'java MyApplicationMaster <args>'
  • 31. A Simple YARN Application
  • 32. A Simple YARN Application
      Simplest example of a YARN application: get n containers, and run a specific Unix command on each. Minimal error handling, etc.
      Control flow
        1. User submits the application to the ResourceManager
           – Client provides the ApplicationSubmissionContext to the ResourceManager
        2. App Master negotiates with the ResourceManager for n containers
        3. App Master launches containers with the user-specified command as ContainerLaunchContext.commands
      Code: https://github.com/hortonworks/simple-yarn-app
  • 33. Simple YARN Application – Client
      Command to launch the ApplicationMaster process
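A sketch of how such a launch command might be assembled, modelled on the pattern used in simple-yarn-app. `AmCommand`, the heap size, and the class name passed in are illustrative assumptions; `<LOG_DIR>` is the placeholder that the NodeManager is expected to expand to the container's log directory (exposed in YARN as ApplicationConstants.LOG_DIR_EXPANSION_VAR).

```java
/** Hypothetical helper that builds the shell command for the ApplicationMaster container. */
final class AmCommand {
    // Placeholder assumed to be replaced by the NodeManager at launch time.
    static final String LOG_DIR = "<LOG_DIR>";

    /**
     * @param amClass      fully qualified ApplicationMaster class (illustrative)
     * @param shellCommand Unix command each worker container should run
     * @param n            number of worker containers to request
     */
    static String build(String amClass, String shellCommand, int n) {
        return "$JAVA_HOME/bin/java -Xmx256M " + amClass
            + " " + shellCommand
            + " " + n
            + " 1>" + LOG_DIR + "/stdout"   // redirect stdout into the container log dir
            + " 2>" + LOG_DIR + "/stderr";  // redirect stderr likewise
    }
}
```

The resulting string would be set as the single entry of ContainerLaunchContext.commands for the ApplicationMaster container.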
  • 34. Simple YARN Application – Client
      Resources required for the ApplicationMaster container
      ApplicationSubmissionContext for the ApplicationMaster
      Submit the application to the ResourceManager
  • 35. Simple YARN Application – AppMaster
      Steps:
        1. AMRMClient.registerApplicationMaster
        2. Negotiate containers from the ResourceManager by providing a ContainerRequest to AMRMClient.addContainerRequest
        3. Take each Container returned by a subsequent call to AMRMClient.allocate, build a ContainerLaunchContext with the Container and commands, then launch it using NMClient.startContainer
           – Use LocalResources to specify software/configuration dependencies for each worker container
        4. Wait till done: read AllocateResponse.getCompletedContainersStatuses from subsequent calls to AMRMClient.allocate
        5. AMRMClient.unregisterApplicationMaster
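The five steps can be traced with a self-contained Java simulation. Everything here is a hypothetical stand-in: `FakeCluster` plays the ResourceManager and NodeManagers in memory, grants at most one container per heartbeat, and reports a started container as completed on the following heartbeat. Its method names follow the AMRMClient and NMClient methods listed on the API slides, but none of this is YARN code.

```java
import java.util.ArrayList;
import java.util.List;

/** Result of one simulated allocate call. */
final class Heartbeat {
    final List<Integer> allocated;
    final List<Integer> completed;

    Heartbeat(List<Integer> allocated, List<Integer> completed) {
        this.allocated = allocated;
        this.completed = completed;
    }
}

/** Hypothetical in-memory stand-in for the RM and NMs; not a YARN class. */
final class FakeCluster {
    final List<String> events = new ArrayList<>();
    private final List<Integer> running = new ArrayList<>();
    private int pendingAsks = 0;
    private int nextId = 1;

    void registerApplicationMaster() { events.add("register"); }

    void addContainerRequest() { pendingAsks++; }

    /** One heartbeat: report earlier containers as completed, grant at most one ask. */
    Heartbeat allocate() {
        List<Integer> completed = new ArrayList<>(running);
        running.clear();
        List<Integer> allocated = new ArrayList<>();
        if (pendingAsks > 0) {
            pendingAsks--;
            allocated.add(nextId++);
        }
        return new Heartbeat(allocated, completed);
    }

    void startContainer(int id, String command) {
        running.add(id);
        events.add("start " + id + ": " + command);
    }

    void unregisterApplicationMaster() { events.add("unregister"); }
}

final class SimpleAppMaster {
    /** Runs the command on n simulated containers and returns how many completed. */
    static int run(FakeCluster cluster, String command, int n) {
        cluster.registerApplicationMaster();              // step 1
        for (int i = 0; i < n; i++) {
            cluster.addContainerRequest();                // step 2
        }
        int completed = 0;
        while (completed < n) {                           // steps 3 and 4 share one loop
            Heartbeat hb = cluster.allocate();
            for (int id : hb.allocated) {
                cluster.startContainer(id, command);      // step 3: launch what was granted
            }
            completed += hb.completed.size();             // step 4: count finished containers
        }
        cluster.unregisterApplicationMaster();            // step 5
        return completed;
    }
}
```

Note that allocation and completion both arrive through the same allocate call, which is why a real ApplicationMaster keeps heartbeating (typically with a short sleep between calls) until every container has been reported as finished.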
  • 36. Simple YARN Application – AppMaster
      Initialize clients to the ResourceManager and NodeManagers
      Register with the ResourceManager
  • 37. Simple YARN Application – AppMaster
      Set up requirements for worker containers
      Make resource requests to the ResourceManager
  • 38. Simple YARN Application – AppMaster
      Get containers from the ResourceManager
      Launch containers on NodeManagers
  • 39. Simple YARN Application – AppMaster
      Wait for containers to complete successfully
      Un-register with the ResourceManager
  • 40. Graduating from simple-yarn-app
      DistributedShell: same functionality but less simple, e.g. error checking, use of timeline server
      For a complex YARN app, see Tez
        – Pre-warmed containers, sessions, etc.
      Look at MapReduce for even more excitement
        – Data locality, fault tolerance, checkpoint to HDFS, security, isolation, etc.
        – Intra-application priorities (maps vs. reduces) need complex feedback from the ResourceManager
      (all at apache.org)
  • 41. Application Timeline Server
  • 42. Application Timeline Server
      Maintains historical state and provides metrics visibility for YARN apps
        – Similar to the MapReduce Job History Server
      Information can be queried via REST APIs
      ATS in HDP 2.1 is considered a Tech Preview
      Generic information
        – queue name
        – user information
        – information about application attempts
        – a list of Containers that were run under each application attempt
        – information about each Container
      Per-framework/application info
        – Developers can publish information to the Timeline Server via the TimelineClient (from within a client), the ApplicationMaster, or the application's Containers.
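As a rough illustration of the REST access mentioned above, the sketch below only builds a query URL and makes no network call. The `/ws/v1/timeline/{entityType}` path and port 8188 are assumed here to match the Timeline Server's documented defaults and should be checked against your cluster's yarn.timeline-service.webapp.address setting; the class name, host name and entity type are made up.

```java
import java.net.URI;

/** Hypothetical helper for building Timeline Server REST URLs; not part of YARN. */
final class TimelineUrls {
    // Assumed default web port of the Timeline Server; verify for your cluster.
    static final int ASSUMED_DEFAULT_PORT = 8188;

    /** URL listing entities of one type; entityType must already be URL-safe. */
    static URI entitiesOfType(String host, int port, String entityType) {
        return URI.create("http://" + host + ":" + port + "/ws/v1/timeline/" + entityType);
    }
}
```

For example, `TimelineUrls.entitiesOfType("ats.example.com", TimelineUrls.ASSUMED_DEFAULT_PORT, "MY_APP_EVENT")` yields a URL that could be fetched with any HTTP client to retrieve the entities an application published under that type.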
  • 43. Application Timeline Server
      [Diagram labels: App Timeline Server, AMBARI, Custom App Monitoring, Client]
  • 44. Next Steps
  • 45. hortonworks.com/get-started/YARN
      Set up an HDP 2.1 environment
        – Leverage the Sandbox
      Review sample code and execute the Simple YARN Application
        – https://github.com/hortonworks/simple-yarn-app
      Graduate to more complex code examples
      BUILD FLEXIBLE, SCALABLE, RESILIENT & POWERFUL APPLICATIONS TO RUN IN HADOOP
  • 46. Hortonworks YARN Resources
      Hortonworks web site: hortonworks.com/hadoop/yarn
        – Includes links to blog posts
      YARN Forum: hortonworks.com/community/forums/forum/yarn
        – Community of Hadoop YARN developers: collaboration and Q&A
      YARN Office Hours
        – Dial in and chat with YARN experts
        – Next Office Hour: Thursday August 14 @ 10-11am PDT. Register: https://hortonworks.webex.com/hortonworks/onstage/g.php?t=a&d=628190636
  • 47. And from Hortonworks University
      Hortonworks Course: Developing Custom YARN Applications
        – Format: Online
        – Duration: 2 Days
        – When: Aug 18th & 19th (Mon & Tues)
        – Cost: No charge to Hortonworks Technical Partners
        – Space: Very limited
      Interested? Please contact lsensmeier@hortonworks.com
  • 48. Stay in Touch!
      Join us for the full series of YARN development webinars:
        – YARN Native: July 24 @ 9am PT (recording link)
        – Slider: August 7 @ 9am PT (registration link)
        – Tez: August 21 @ 9am PT (registration link)
      Additional webinar topics are being added: watch the blog or visit Hortonworks.com/webinars
      http://hortonworks.com/hadoop/yarn