1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache Hadoop YARN: Past, Present and Future
Melbourne, Aug. 31, 2016
Junping Du
Who.JSON
{
"name" : "Junping Du" ,
"job_title" : "Lead Software Engineer @ Hortonworks YARN core team",
"experiences" : [ {
"software_industry_years" : 10,
"hadoop_experience" : "Hadoop contributor before YARN came out, Apache Hadoop committer & PMC, Release Manager for Apache Hadoop 2.6",
"non_hadoop_experience" : "Architect in cloud computing and enterprise software"
}],
"email" : "junping_du@apache.org"
}
What is Apache Hadoop YARN?
⬢ YARN is short for “Yet Another Resource Negotiator”
⬢ Big Data Operating System
–Resource Management and Scheduling
–Support for “colorful” applications, like: Batch, Interactive, Real-Time, etc.
⬢ Enterprise adoption accelerating
–Secure mode becoming more widespread
–Multi-tenant support
–Diverse workloads
⬢ SLAs
–Tolerance for slow running jobs decreasing
–Consistent performance desired
Past
A brief Timeline
June-July 2010: 1st line of code
August 2011: Open sourced
May 2012: First 2.0 alpha
August 2013: First 2.0 beta
⬢ Sub-project of Apache Hadoop
⬢ Releases tied to Hadoop releases
⬢ Alphas and betas
GA Releases
2.2 (Oct. 2013)
• 1st GA
• MR binary compatibility
• YARN API cleanup
• Testing!
2.3 (Feb. 2014)
• 1st Post GA
• Bug fixes
• Alpha features
- Load simulator
- LCE enhancements
2.4 (Apr. 2014)
• RM Fail-over
• CS Preemption
• Timeline Service V1
2.5 (Aug. 2014)
• Writable REST APIs
• Timeline Service V1 security
GA Releases (Recent + Planning)
2.6 (Nov. 2014)
• KMS
• Long running service support
• Rolling Upgrade
• Node Label Support
• Docker Container
2.7 (Apr. 2015)
• Pluggable Authorization
• Shared Resource Cache
2.8/2.9 (2nd H 2016, estimated)
• Timeline Service V1.5
• Graceful Decommission
• Log CLI Enhancement
3.0 (TBD)
• Timeline Service V2
Outstanding YARN Features released in 2.6/2.7
⬢ Rolling Upgrade
⬢ Node Label
⬢ Pluggable ACLs
[Figure labels: Default Partition, Partition B, GPUs, Partition C, Windows; JDK 8, JDK 7, JDK 7]
Recent Maintenance Releases Updates
⬢ 2.6 and 2.7 maintenance releases are carried out
–Only blockers and critical fixes are added
⬢ Apache Hadoop 2.6
–2.6.4 released in Feb. 2016
–2.6.3 released in Dec. 2015
–2.6.2 released in Oct. 2015
⬢ Apache Hadoop 2.7
–2.7.3 released in Aug. 2016
–2.7.2 released in Jan. 2016
–2.7.1 released in Jul. 2015
Present
YARN in Modern Data Architecture
⬢ Modern Data Architecture
–Enable applications to access all your enterprise data through an efficient centralized platform
–Supported by a centralized approach to governance, security and operations
–Versatile enough to handle any application and dataset, no matter the size or type
⬢ YARN’s Evolution
–The “CORE” of Modern Data Architecture
–Centralized resource management, highly efficient scheduling, a flexible resource model, isolation in security and performance, support for “colorful” applications, etc.
Apache Hadoop YARN
[Figure: ResourceManager (active) and ResourceManager (standby) with NodeManager1, NodeManager2, NodeManager3 and NodeManager4]
⬢ Resources: 128G, 16 vcores
⬢ Label: SAS
⬢ Auto-calculate node resources
⬢ Dynamically update node resources
NodeManager Resource Management
⬢ Options to report NM resources based on node hardware
–YARN-160
–Restart of the NM required to enable feature
⬢ Alternatively, admins can use the rmadmin command to update the node’s resources
–YARN-291
–Reads dynamic-resources.xml
–No restart of the NM or the RM required
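The two options above can be pictured with a small model. This is only a sketch under stated assumptions: `NodeResource`, `resources_from_hardware`, `ClusterView`, the 4096 MB reserve and the node id `nm1:45454` are hypothetical names and values chosen for illustration, not YARN classes or the rule YARN-160 actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResource:
    memory_mb: int
    vcores: int


def resources_from_hardware(physical_mb, physical_cores,
                            reserved_mb=4096, vcores_per_core=1.0):
    """Derive advertised resources from detected hardware.

    The reserve and the multiplier are illustrative assumptions.
    """
    usable_mb = max(physical_mb - reserved_mb, 0)
    return NodeResource(usable_mb, int(physical_cores * vcores_per_core))


class ClusterView:
    """Hypothetical stand-in for the RM's per-node capacity table."""

    def __init__(self):
        self.nodes = {}

    def register(self, node_id, resource):
        self.nodes[node_id] = resource

    def update_node_resource(self, node_id, memory_mb, vcores):
        # Overwrite the capacity in place; no NM or RM restart is modeled.
        if node_id not in self.nodes:
            raise KeyError(node_id)
        self.nodes[node_id] = NodeResource(memory_mb, vcores)


view = ClusterView()
# A 128G, 16-core node reports resources derived from its hardware.
view.register("nm1:45454", resources_from_hardware(128 * 1024, 16))
# An admin later shrinks what the scheduler may hand out on that node.
view.update_node_resource("nm1:45454", 96 * 1024, 12)
```

The point of the second call is that only the scheduler's bookkeeping changes; nothing on the node is restarted.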
Apache Hadoop YARN Scheduler
[Figure: a user submits an application (Queue A, 4G, 1 vcore) to the active ResourceManager]
⬢ Queues: Queue A – 50%, Queue B – 25%, Queue C – 25%
⬢ Label: SAS (non-exclusive)
⬢ Queue allocation policy: Priority/FIFO, Fair
⬢ Inter queue pre-emption, with improvements to pre-emption
⬢ Support for application priority
⬢ Reservation for application, with support for cost based placement agent
Capacity scheduler
⬢ Support for application priority within a queue
–YARN-1963
–Users can specify application priority
–Specified as an integer, higher number is higher priority
–Application priority can be updated while it’s running
⬢ Improvements to reservations
–YARN-2572
–Support for cost based placement agent added in addition to greedy
⬢ Queue allocation policy can be switched to fair sharing
–YARN-3319
–Containers allocated on a fair share basis instead of FIFO
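The difference between priority/FIFO ordering and fair sharing inside one queue can be sketched as two sort orders. This is a simplified illustration, not the Capacity Scheduler's code: the `App` record, its fields and both function names are assumptions made for the example, and fairness is reduced to "least memory in use goes first".

```python
from dataclasses import dataclass


@dataclass
class App:
    app_id: str
    priority: int       # higher integer means higher priority
    submit_order: int   # lower value means submitted earlier
    used_mb: int = 0    # memory currently held by the app's containers


def fifo_with_priority(apps):
    """Highest priority first; ties broken by submission order (FIFO)."""
    return sorted(apps, key=lambda a: (-a.priority, a.submit_order))


def fair_order(apps):
    """App holding the least memory first; ties broken by submission order."""
    return sorted(apps, key=lambda a: (a.used_mb, a.submit_order))


apps = [
    App("a", priority=1, submit_order=0, used_mb=4096),
    App("b", priority=5, submit_order=1, used_mb=0),
    App("c", priority=1, submit_order=2, used_mb=1024),
]
next_under_fifo = [a.app_id for a in fifo_with_priority(apps)]
next_under_fair = [a.app_id for a in fair_order(apps)]
```

Because priority can be updated while an application runs, re-sorting after a priority change is enough to move it in the first ordering.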
Capacity scheduler
⬢ Support for non-exclusive node labels
–YARN-3214
–Improvement over partition that existed earlier
–Better for cluster utilization
⬢ Improvements to pre-emption
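Why non-exclusive labels help utilization can be shown with a single placement rule. The sketch below assumes a hypothetical helper `can_allocate` and uses the empty string for the default partition; it ignores queue capacities and the preemption that reclaims lent resources once labeled requests arrive.

```python
def can_allocate(request_label, node_label, exclusive, node_idle):
    """Decide whether a request may be placed on a labeled node.

    request_label: label the application asked for ("" = default partition)
    node_label:    label assigned to the node
    exclusive:     whether the node's partition is exclusive
    node_idle:     whether the node has resources no labeled request is using
    """
    if request_label == node_label:
        return True
    if exclusive:
        # Exclusive partitions serve only requests that name the label.
        return False
    # Non-exclusive partitions lend idle resources to default-partition requests.
    return request_label == "" and node_idle
```

With an exclusive `SAS` partition, idle `SAS` nodes stay empty while default-partition applications wait; with a non-exclusive one they can be borrowed.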
Apache Hadoop YARN Application Lifecycle
[Figure: application lifecycle across the ResourceManager (active), the NodeManager on Node 1 (128G, 16 vcores), the History Server and HDFS]
⬢ NodeManager
–Support added for graceful decommissioning
⬢ Launch Application 1 AM as a process/Docker container (alpha)
–Launch AM process via ContainerExecutor – DCE, LCE, WSCE
–Monitor/isolate memory and CPU
–Support added for disk and network isolation via CGroups (alpha)
⬢ Request containers / allocate containers
–Support added to resize containers
⬢ Container 1 and Container 2 run as processes/Docker containers (alpha)
–Launch containers on the node using DCE, LCE, WSCE
–Monitor/isolate memory and CPU
–Support added for disk and network isolation using CGroups (alpha)
⬢ History Server (ATS 1.5 – leveldb + HDFS, JHS – HDFS)
⬢ Log aggregation to HDFS
Apache Hadoop YARN
⬢ Graceful decommissioning of NodeManagers
–YARN-914
–Drains a node that’s being decommissioned to allow running containers to finish
⬢ Resource isolation support for disk and network
–YARN-2619, YARN-2140
–Containers get a fair share of disk and network resources using CGroups
–Alpha feature
⬢ Docker support in LinuxContainerExecutor
–YARN-3853
–Support to launch Docker containers alongside process containers
–Alpha feature
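The draining behaviour of graceful decommissioning can be sketched as a small state machine. This is an illustrative model only: the `Node` class and its methods are hypothetical, and the timeout after which remaining containers would be stopped is left out.

```python
class Node:
    """Toy model of a node that drains before it is decommissioned."""

    def __init__(self, running_containers):
        self.state = "RUNNING"
        self.running = set(running_containers)

    def start_decommission(self):
        # A node with nothing running can leave immediately.
        self.state = "DECOMMISSIONING" if self.running else "DECOMMISSIONED"

    def accepts_new_containers(self):
        # A draining node keeps its containers but takes no new ones.
        return self.state == "RUNNING"

    def container_finished(self, container_id):
        self.running.discard(container_id)
        if self.state == "DECOMMISSIONING" and not self.running:
            self.state = "DECOMMISSIONED"


node = Node({"c1", "c2"})
node.start_decommission()
state_while_draining = node.state
node.container_finished("c1")
node.container_finished("c2")
state_after_drain = node.state
```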
Apache Hadoop YARN
⬢ Support for container resizing
–YARN-1197
–Allows applications to change the size of an existing container
⬢ ATS 1.5
–YARN-4233
–Store timeline events on HDFS
–Better scalability and reliability
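Container resizing comes down to a bookkeeping check on the node that hosts the container. The sketch below is a simplification under assumptions: `resize_container` is a hypothetical helper, only memory is tracked, and the negotiation between the application, the RM and the NM is not modeled.

```python
def resize_container(node_free_mb, current_mb, target_mb):
    """Return (new_container_mb, new_node_free_mb) after a resize attempt."""
    if target_mb <= 0:
        raise ValueError("target size must be positive")
    delta = target_mb - current_mb
    if delta > node_free_mb:
        # Not enough headroom on the node: the container keeps its size.
        return current_mb, node_free_mb
    # Increases consume free memory; decreases (negative delta) return it.
    return target_mb, node_free_mb - delta
```

Shrinking always succeeds in this model because it only frees memory, while growing depends on what the node has left.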
Operational support
⬢ Improvements to existing tools (like yarn logs)
⬢ New tools added (yarn top)
⬢ Improvements to the RM UI to expose more details about running applications
Future
Packaging
⬢ Containers
–Lightweight mechanism for packaging and resource isolation
–Popularized and made accessible by Docker
–Can replace VMs in some cases
–Or more accurately, VMs got used in places where they didn’t need to be
⬢ Native integration ++ in YARN
–Support for “Container Runtimes” in LCE: YARN-3611
–Process runtime
–Docker runtime
APIs
⬢ Applications need simple APIs
⬢ Need to be deployable “easily”
⬢ Simple REST API layer fronting YARN
–https://issues.apache.org/jira/browse/YARN-4793
–[Umbrella] Simplified API layer for services and beyond
⬢ Spawn services & manage them
YARN as a Platform
⬢ YARN itself is evolving to support services and complex apps
–https://issues.apache.org/jira/browse/YARN-4692
–[Umbrella] Simplified and first-class support for services in YARN
⬢ Scheduling
–Application priorities: YARN-1963
–Affinity / anti-affinity: YARN-1042
–Services as first-class citizens: preemption, reservations, etc.
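Affinity and anti-affinity can be read as filters over candidate nodes. The sketch below is an assumption-laden illustration: `candidate_nodes`, the tag names and the node names are hypothetical, and real placement would also weigh capacity, queues and locality.

```python
def candidate_nodes(nodes, tag, policy):
    """Filter nodes for a new container.

    nodes:  dict mapping node name -> set of tags of containers running there
    tag:    tag the new container wants to be near (or away from)
    policy: "affinity" or "anti-affinity"
    """
    if policy == "affinity":
        return sorted(n for n, tags in nodes.items() if tag in tags)
    if policy == "anti-affinity":
        return sorted(n for n, tags in nodes.items() if tag not in tags)
    raise ValueError(f"unknown policy: {policy}")


cluster = {
    "node1": {"hbase-regionserver"},
    "node2": {"kafka-broker"},
    "node3": set(),
}
# A second region server should avoid nodes that already host one.
spread = candidate_nodes(cluster, "hbase-regionserver", "anti-affinity")
# A co-located helper should land next to the existing region server.
colocate = candidate_nodes(cluster, "hbase-regionserver", "affinity")
```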
YARN as a Platform (Contd)
⬢ Application & Services upgrades
–“Do an upgrade of my Spark / HBase apps with minimal impact to end-users”
–YARN-4726
⬢ Simplified discovery of services via DNS mechanisms: YARN-4757
⬢ YARN Federation – to infinity and beyond: YARN-2915
YARN Service Framework
⬢ Platform is only as good as the tools
⬢ A native YARN framework
–https://issues.apache.org/jira/browse/YARN-4692
–[Umbrella] Native YARN framework layer for services and beyond
⬢ Slider supporting a DAG of apps:
–https://issues.apache.org/jira/browse/SLIDER-875
Operational and User Experience
⬢ Modern YARN web UI – YARN-3368
⬢ Enhanced shell interfaces
⬢ Metrics: Timeline Service V2 – YARN-2928
⬢ Application & Services monitoring, integration with other systems
⬢ First class support for YARN hosted services in Ambari
–https://issues.apache.org/jira/browse/AMBARI-17353
Use-cases.. Assemble!
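[Figure: application assemblies running on top of YARN and other platform services]
⬢ YARN and Other Platform Services: Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts, Governance
⬢ Holiday Assembly: HBase, Web Server
⬢ IOT Assembly: Kafka, Storm, HBase, Solr
⬢ MR, Tez, Spark, …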
Future Work List (I)
⬢ Arbitrary resource types
–YARN-3926
–Admins can decide what resource types to support
–Resource types read via a config file
⬢ New scheduler features
–YARN-4902
–Support richer placement strategies such as affinity, anti-affinity
⬢ Distributed scheduling
–YARN-2877, YARN-4742
–NMs run a local scheduler
–Allows faster scheduling turnaround
⬢ YARN federation
–YARN-2915
–Allows YARN to scale out to tens of thousands of nodes
–Cluster of clusters which appear as a single cluster to an end user
⬢ Better support for disk and network isolation
–Tied to supporting arbitrary resource types
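Arbitrary resource types generalize the fixed (memory, vcores) pair into a named vector. The sketch below is a minimal illustration under assumptions: resources are plain dictionaries, and `fits`, `allocate` and the `gpu` / `fpga` type names are hypothetical, not part of the YARN-3926 design.

```python
def fits(request, available):
    """True if every requested resource type is available in sufficient amount."""
    return all(available.get(name, 0) >= amount
               for name, amount in request.items())


def allocate(request, available):
    """Return the node's remaining resources, or None if the request does not fit."""
    if not fits(request, available):
        return None
    return {name: amount - request.get(name, 0)
            for name, amount in available.items()}


# Resource types an admin chose to support on this node.
node = {"memory_mb": 131072, "vcores": 16, "gpu": 2}
remaining = allocate({"memory_mb": 4096, "vcores": 1, "gpu": 1}, node)
rejected = allocate({"fpga": 1}, node)  # type not offered by this node
```

Treating every type uniformly is what lets disk or network be added later as just another key in the vector.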
Future Work List (II)
⬢ Simplified and first-class support for services in YARN
–YARN-4692
–Container restart (YARN-3988)
•Allow container restart without losing allocation
–Service discovery via DNS (YARN-4757)
•Running services can be discovered via DNS
–Allocation re-use (YARN-4726)
•Allow AMs to stop a container but not lose resources on the node
⬢ Enhance Docker support
–YARN-3611
–Support to mount volumes
–Isolate containers using CGroups
⬢ ATS v2 Phase 2
–YARN-2928 (Phase 1), YARN-5355 (Phase 2)
–Run timeline service on HBase
–Support for more data, better performance
⬢ Also in the pipeline
–Switch to Java 8 with Hadoop 3.0
–Add support for GPU isolation
–Better tools to detect limping nodes
–New RM UI – YARN-3368
HDP Evolution with Apache Hadoop YARN
Thank you!

Contenu connexe

Tendances

Debugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in ProductionDebugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in ProductionXuan Gong
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017alanfgates
 
An Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present FutureAn Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present FutureDataWorks Summit/Hadoop Summit
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics OptimizationHortonworks
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseDataWorks Summit
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionDataWorks Summit/Hadoop Summit
 
Benefits of an Agile Data Fabric for Business Intelligence
Benefits of an Agile Data Fabric for Business IntelligenceBenefits of an Agile Data Fabric for Business Intelligence
Benefits of an Agile Data Fabric for Business IntelligenceDataWorks Summit/Hadoop Summit
 
Manage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in HadoopManage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in HadoopDataWorks Summit
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies DataWorks Summit/Hadoop Summit
 
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3DataWorks Summit
 
Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeDataWorks Summit
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureDataWorks Summit
 

Tendances (20)

Running Services on YARN
Running Services on YARNRunning Services on YARN
Running Services on YARN
 
The state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the CloudThe state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the Cloud
 
Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?
 
Debugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in ProductionDebugging Apache Hadoop YARN Cluster in Production
Debugging Apache Hadoop YARN Cluster in Production
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017
 
An Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present FutureAn Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present Future
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
 
A Multi Colored YARN
A Multi Colored YARNA Multi Colored YARN
A Multi Colored YARN
 
Benefits of an Agile Data Fabric for Business Intelligence
Benefits of an Agile Data Fabric for Business IntelligenceBenefits of an Agile Data Fabric for Business Intelligence
Benefits of an Agile Data Fabric for Business Intelligence
 
Apache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in NutshellApache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in Nutshell
 
Manage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in HadoopManage democratization of the data - Data Replication in Hadoop
Manage democratization of the data - Data Replication in Hadoop
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
 
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3Deep learning on yarn  running distributed tensorflow etc on hadoop cluster v3
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
 
From Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFiFrom Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFi
 
Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data Free
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Why is my Hadoop* job slow?
Why is my Hadoop* job slow?Why is my Hadoop* job slow?
Why is my Hadoop* job slow?
 
Scalable OCR with NiFi and Tesseract
Scalable OCR with NiFi and TesseractScalable OCR with NiFi and Tesseract
Scalable OCR with NiFi and Tesseract
 

En vedette

Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...
Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...
Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...DataWorks Summit/Hadoop Summit
 
Best Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentBest Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentDataWorks Summit/Hadoop Summit
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...DataWorks Summit/Hadoop Summit
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron
MaaS (Model as a Service): Modern Streaming Data Science with Apache MetronMaaS (Model as a Service): Modern Streaming Data Science with Apache Metron
MaaS (Model as a Service): Modern Streaming Data Science with Apache MetronDataWorks Summit
 
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...DataWorks Summit
 

En vedette (11)

Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...
Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...
Dancing Elephants - Efficiently Working with Object Stories from Apache Spark...
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Big Data in Azure
 
Solving Cyber at Scale
Solving Cyber at ScaleSolving Cyber at Scale
Solving Cyber at Scale
 
Best Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentBest Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop Environment
 
File Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetFile Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and Parquet
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron
MaaS (Model as a Service): Modern Streaming Data Science with Apache MetronMaaS (Model as a Service): Modern Streaming Data Science with Apache Metron
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron
 
Apache Metron: Community Driven Cyber Security
Apache Metron: Community Driven Cyber Security Apache Metron: Community Driven Cyber Security
Apache Metron: Community Driven Cyber Security
 
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod...
 
Hadoop 3 in a Nutshell
Hadoop 3 in a NutshellHadoop 3 in a Nutshell
Hadoop 3 in a Nutshell
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 

Similaire à Apache Hadoop YARN: Past, Present and Future

Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionWangda Tan
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Accumulo Summit 2016: Apache Accumulo on Docker with YARN Native Services
Accumulo Summit 2016: Apache Accumulo on Docker with YARN Native ServicesAccumulo Summit 2016: Apache Accumulo on Docker with YARN Native Services
Accumulo Summit 2016: Apache Accumulo on Docker with YARN Native ServicesAccumulo Summit
 
Apache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionApache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionDataWorks Summit
 
Apache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration storyApache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration storySunil Govindan
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & FutureDataWorks Summit
 
Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo DataWorks Summit
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARNDataWorks Summit
 
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...Big Data Spain
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureVinod Kumar Vavilapalli
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016alanfgates
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3DataWorks Summit
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnhdhappy001
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformBikas Saha
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopHortonworks
 

Similaire à Apache Hadoop YARN: Past, Present and Future (20)

Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
 
Apache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduceApache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduce
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Accumulo Summit 2016: Apache Accumulo on Docker with YARN Native Services
Accumulo Summit 2016: Apache Accumulo on Docker with YARN Native ServicesAccumulo Summit 2016: Apache Accumulo on Docker with YARN Native Services
Accumulo Summit 2016: Apache Accumulo on Docker with YARN Native Services
 
Apache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionApache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the Union
 
Apache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration storyApache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration story
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & Future
 
Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARN
 
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
 
YARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute PlatformYARN - Hadoop Next Generation Compute Platform
YARN - Hadoop Next Generation Compute Platform
 
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of HadoopApache Hadoop YARN: Understanding the Data Operating System of Hadoop
Apache Hadoop YARN: Understanding the Data Operating System of Hadoop
 

Apache Hadoop YARN: Past, Present and Future

  • 1. Apache Hadoop YARN: Past, Present and Future. Melbourne, Aug. 31, 2016. Junping Du. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  • 2. Who.JSON
      {
        "name" : "Junping Du",
        "job_title" : "Lead Software Engineer @ Hortonworks YARN core team",
        "experiences" : [ {
          "software_industry_years" : 10,
          "hadoop_experience" : "Hadoop contributor before YARN came out, Apache Hadoop committer & PMC, Release Manager for Apache Hadoop 2.6",
          "non_hadoop_experience" : "Architect in cloud computing and enterprise software"
        } ],
        "email" : "junping_du@apache.org"
      }
  • 3. What is Apache Hadoop YARN?
      ⬢ YARN is short for “Yet Another Resource Negotiator”
      ⬢ Big Data Operating System
        – Resource Management and Scheduling
        – Support for “colorful” applications, like: Batch, Interactive, Real-Time, etc.
      ⬢ Enterprise adoption accelerating
        – Secure mode becoming more widespread
        – Multi-tenant support
        – Diverse workloads
      ⬢ SLAs
        – Tolerance for slow running jobs decreasing
        – Consistent performance desired
  • 4. Past
  • 5. A brief Timeline
      ⬢ 1st line of code: June-July 2010
      ⬢ Open sourced: August 2011
      ⬢ First 2.0 alpha: May 2012
      ⬢ First 2.0 beta: August 2013
      ⬢ Sub-project of Apache Hadoop
      ⬢ Releases tied to Hadoop releases
      ⬢ Alphas and betas
  • 6. GA Releases
      ⬢ 2.2 (Oct. 2013)
        – 1st GA
        – MR binary compatibility
        – YARN API cleanup
        – Testing!
      ⬢ 2.3 (Feb. 2014)
        – 1st Post GA
        – Bug fixes
        – Alpha features: Load simulator, LCE enhancements
      ⬢ 2.4 (Apr. 2014)
        – RM Fail-over
        – CS Preemption
        – Timeline Service V1
      ⬢ 2.5 (Aug. 2014)
        – Writable REST APIs
        – Timeline Service V1 security
  • 7. GA Releases (Recent + Planning)
      ⬢ Releases: 2.6 (Nov. 2014), 2.7 (Apr. 2015), 2.8/2.9 (2nd half of 2016, estimated), 3.0 (TBD)
      ⬢ Features, in slide order across these releases:
        – KMS
        – Long running service support
        – Rolling Upgrade
        – Node Label Support
        – Docker Container
        – Pluggable Authorization
        – Shared Resource Cache
        – Timeline Service V1.5
        – Graceful Decommission
        – Log CLI Enhancement
        – Timeline Service V2
  • 8. Outstanding YARN Features released in 2.6/2.7
      ⬢ Rolling Upgrade
      ⬢ Node Label
      ⬢ Pluggable ACLs
      [Figure labels: Default Partition; Partition B (GPUs); Partition C (Windows); JDK 8, JDK 7, JDK 7]
  • 9. Recent Maintenance Release Updates
      ⬢ 2.6 and 2.7 maintenance releases are being carried out
        – Only blockers and critical fixes are added
      ⬢ Apache Hadoop 2.6
        – 2.6.4 released in Feb. 2016
        – 2.6.3 released in Dec. 2015
        – 2.6.2 released in Oct. 2015
      ⬢ Apache Hadoop 2.7
        – 2.7.3 released in Aug. 2016
        – 2.7.2 released in Jan. 2016
        – 2.7.1 released in Jul. 2015
  • 10. Present
  • 11. YARN in Modern Data Architecture
      ⬢ Modern Data Architecture
        – Enables applications to access all your enterprise data through an efficient centralized platform
        – Supported by a centralized approach to governance, security and operations
        – Versatile enough to handle any application and dataset, no matter the size or type
      ⬢ YARN’s Evolution
        – The “CORE” of Modern Data Architecture
        – Centralized resource management, highly efficient scheduling, a flexible resource model, isolation in security and performance, “colorful” applications support, etc.
  • 12. Apache Hadoop YARN
      [Diagram: ResourceManager (active) and ResourceManager (standby) with NodeManager1, NodeManager2, NodeManager3 and NodeManager4. Annotations: Resources: 128G, 16 vcores; Auto-calculate node resources; Label: SAS; Dynamically update node resources]
  • 13. NodeManager Resource Management
      ⬢ Options to report NM resources based on node hardware
        – YARN-160
        – Restart of the NM required to enable the feature
      ⬢ Alternatively, admins can use the rmadmin command to update the node’s resources
        – YARN-291
        – Looks at the dynamic-resource.xml
        – No restart of the NM or the RM required
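The rmadmin route mentioned above can be sketched as a small helper that builds the command line for `yarn rmadmin -updateNodeResource`. This is only an illustration: the node address `nm1.example.com:45454` is a hypothetical NodeID, the helper name is invented for this sketch, and the memory value is assumed to be in megabytes. The figures 128G and 16 vcores are taken from the preceding slide.

```python
import shlex
from typing import List


def update_node_resource_cmd(node_id: str, memory_mb: int, vcores: int) -> List[str]:
    """Build the argv for `yarn rmadmin -updateNodeResource` (hypothetical helper)."""
    if memory_mb <= 0 or vcores <= 0:
        raise ValueError("memory_mb and vcores must be positive")
    # Argument order: NodeID, memory size (assumed MB), number of vcores.
    return ["yarn", "rmadmin", "-updateNodeResource",
            node_id, str(memory_mb), str(vcores)]


# Hypothetical node; 128G and 16 vcores as shown on the slide.
cmd = update_node_resource_cmd("nm1.example.com:45454", 128 * 1024, 16)
print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the sketch free of side effects; on a real cluster the list could be handed to `subprocess.run`.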
  • 14. Apache Hadoop YARN Scheduler
      [Diagram: a User submits an Application (Queue A, 4G, 1 vcore) to the ResourceManager (active). Queues: Queue A – 50%, Queue B – 25%, Queue C – 25%; Label: SAS (non-exclusive); ordering within a queue: Priority/FIFO, Fair. Annotations: Inter queue pre-emption; Improvements to pre-emption; Support for application priority; Reservation for application; Support for cost based placement agent]
  • 15. Capacity scheduler
      ⬢ Support for application priority within a queue
        – YARN-1963
        – Users can specify application priority
        – Specified as an integer; a higher number means higher priority
        – Application priority can be updated while the application is running
      ⬢ Improvements to reservations
        – YARN-2572
        – Support for a cost based placement agent added in addition to greedy
      ⬢ Queue allocation policy can be switched to fair sharing
        – YARN-3319
        – Containers allocated on a fair share basis instead of FIFO
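As a rough illustration of the priority rule above (higher integer first), the following sketch orders applications within one queue by priority and falls back to submission order for ties. It is a toy model, not the Capacity Scheduler's code; the `App` class, its fields and the sample applications are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class App:
    app_id: str
    priority: int      # higher integer = higher priority, as stated on the slide
    submit_order: int  # lower = submitted earlier (FIFO tie-break, an assumption)


def scheduling_order(apps: List[App]) -> List[str]:
    """Return app ids sorted by descending priority, then by submission order."""
    ranked = sorted(apps, key=lambda a: (-a.priority, a.submit_order))
    return [a.app_id for a in ranked]


apps = [App("etl", 0, 1), App("report", 5, 2), App("adhoc", 5, 3)]
print(scheduling_order(apps))
```

Because priority can be updated while an application runs, a model like this would simply be re-sorted after each update.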
  • 16. Capacity scheduler
      ⬢ Support for non-exclusive node labels
        – YARN-3214
        – Improvement over partition that existed earlier
        – Better for cluster utilization
      ⬢ Improvements to pre-emption
  • 17. Apache Hadoop YARN Application Lifecycle
      [Diagram: ResourceManager (active); Node 1 NodeManager (128G, 16 vcores); Application 1 AM; Container 1 and Container 2; History Server; HDFS]
      ⬢ NodeManager: support added for graceful decommissioning
      ⬢ Launch Application 1 AM: AM process/Docker container (alpha)
        – Launch AM process via ContainerExecutor: DCE, LCE, WSCE
        – Monitor/isolate memory and CPU
        – Support added for disk and network isolation via CGroups (alpha)
      ⬢ Request containers, allocate containers
        – Support added to resize containers
      ⬢ Container 1 and Container 2: process/Docker container (alpha)
        – Launch containers on node using DCE, LCE, WSCE
        – Monitor/isolate memory and CPU
        – Support added for disk and network isolation using CGroups (alpha)
      ⬢ History Server (ATS 1.5 – leveldb + HDFS, JHS – HDFS)
      ⬢ HDFS: log aggregation
  • 18. Apache Hadoop YARN
      ⬢ Graceful decommissioning of NodeManagers
        – YARN-914
        – Drains a node that’s being decommissioned to allow running containers to finish
      ⬢ Resource isolation support for disk and network
        – YARN-2619, YARN-2140
        – Containers get a fair share of disk and network resources using CGroups
        – Alpha feature
      ⬢ Docker support in LinuxContainerExecutor
        – YARN-3853
        – Support to launch Docker containers alongside process containers
        – Alpha feature
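The draining idea behind graceful decommissioning can be modelled in a few lines. This is a simplified simulation under stated assumptions, not NodeManager code: the function name, the result fields and the notion of a drain timeout after which remaining containers are killed are assumptions of this sketch; the slide itself only says that running containers are allowed to finish.

```python
from typing import Dict, List


def drain_node(remaining_secs: List[int], timeout_secs: int) -> Dict[str, int]:
    """Simulate draining a node: no new containers are accepted, running ones
    may finish, and anything still running at the (assumed) timeout is killed."""
    finished = [r for r in remaining_secs if r <= timeout_secs]
    killed = [r for r in remaining_secs if r > timeout_secs]
    # The node leaves service at the timeout if anything had to be killed,
    # otherwise as soon as its last container completes.
    waited = timeout_secs if killed else max(finished, default=0)
    return {"finished": len(finished), "killed": len(killed), "waited_secs": waited}


# Three containers with 30 s, 120 s and 600 s of work left; 300 s drain window.
print(drain_node([30, 120, 600], timeout_secs=300))
```

A longer drain window trades slower node removal for fewer killed containers, which is the point of decommissioning gracefully rather than immediately.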
  • 19. Apache Hadoop YARN
      ⬢ Support for container resizing
        – YARN-1197
        – Allows applications to change the size of an existing container
      ⬢ ATS 1.5
        – YARN-4233
        – Store timeline events on HDFS
        – Better scalability and reliability
  • 20. Operational support
      ⬢ Improvements to existing tools (like yarn logs)
      ⬢ New tools added (yarn top)
      ⬢ Improvements to the RM UI to expose more details about running applications
  • 21. Future
  • 22. Packaging
      ⬢ Containers
        – Lightweight mechanism for packaging and resource isolation
        – Popularized and made accessible by Docker
        – Can replace VMs in some cases
        – Or more accurately, VMs got used in places where they didn’t need to be
      ⬢ Native integration ++ in YARN
        – Support for “Container Runtimes” in LCE: YARN-3611
        – Process runtime
        – Docker runtime
  • 23. APIs
      ⬢ Applications need simple APIs
      ⬢ Need to be deployable “easily”
      ⬢ Simple REST API layer fronting YARN
        – https://issues.apache.org/jira/browse/YARN-4793
        – [Umbrella] Simplified API layer for services and beyond
      ⬢ Spawn services & manage them
  • 24. YARN as a Platform
      ⬢ YARN itself is evolving to support services and complex apps
        – https://issues.apache.org/jira/browse/YARN-4692
        – [Umbrella] Simplified and first-class support for services in YARN
      ⬢ Scheduling
        – Application priorities: YARN-1963
        – Affinity / anti-affinity: YARN-1042
        – Services as first-class citizens: preemption, reservations, etc.
  • 25. YARN as a Platform (Contd)
      ⬢ Application & Services upgrades
        – “Do an upgrade of my Spark / HBase apps with minimal impact to end-users”
        – YARN-4726
      ⬢ Simplified discovery of services via DNS mechanisms: YARN-4757
      ⬢ YARN Federation, to infinity and beyond: YARN-2915
  • 26. YARN Service Framework
      ⬢ Platform is only as good as the tools
      ⬢ A native YARN framework
        – https://issues.apache.org/jira/browse/YARN-4692
        – [Umbrella] Native YARN framework layer for services and beyond
      ⬢ Slider supporting a DAG of apps:
        – https://issues.apache.org/jira/browse/SLIDER-875
  • 27. Operational and User Experience
      ⬢ Modern YARN web UI: YARN-3368
      ⬢ Enhanced shell interfaces
      ⬢ Metrics: Timeline Service V2 (YARN-2928)
      ⬢ Application & Services monitoring, integration with other systems
      ⬢ First class support for YARN hosted services in Ambari
        – https://issues.apache.org/jira/browse/AMBARI-17353
  • 28. Use-cases.. Assemble!
      [Diagram: YARN and Other Platform Services (Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts, Governance) supporting assemblies and engines: Holiday Assembly (HBase, Web Server); IOT Assembly (Kafka, Storm, HBase, Solr); MR, Tez, Spark, …]
  • 29. Future Work List (I)
      ⬢ Arbitrary resource types
        – YARN-3926
        – Admins can decide what resource types to support
        – Resource types read via a config file
      ⬢ New scheduler features
        – YARN-4902
        – Support richer placement strategies such as affinity, anti-affinity
      ⬢ Distributed scheduling
        – YARN-2877, YARN-4742
        – NMs run a local scheduler
        – Allows faster scheduling turnaround
      ⬢ YARN federation
        – YARN-2915
        – Allows YARN to scale out to tens of thousands of nodes
        – Cluster of clusters which appear as a single cluster to an end user
      ⬢ Better support for disk and network isolation
        – Tied to supporting arbitrary resource types
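To make the idea of arbitrary resource types more concrete, here is a minimal sketch that treats a resource as a name-to-amount mapping and checks whether a request fits on a node. It is an illustrative model only, not YARN's Resource API; the resource names (`memory_mb`, `vcores`, `gpu`) and both helper functions are hypothetical.

```python
from typing import Dict

Resource = Dict[str, int]  # resource type name -> amount (names are hypothetical)


def fits(request: Resource, available: Resource) -> bool:
    """A request fits if every requested type is available in sufficient amount.
    Types the node does not declare count as zero."""
    return all(available.get(name, 0) >= amount for name, amount in request.items())


def allocate(request: Resource, available: Resource) -> Resource:
    """Return what is left on the node after granting the request."""
    if not fits(request, available):
        raise ValueError("request does not fit on this node")
    return {name: amount - request.get(name, 0) for name, amount in available.items()}


node = {"memory_mb": 131072, "vcores": 16, "gpu": 2}
print(fits({"memory_mb": 4096, "vcores": 1}, node))  # memory and CPU only
print(fits({"gpu": 4}, node))                        # more GPUs than the node has
print(allocate({"memory_mb": 4096, "vcores": 1, "gpu": 1}, node))
```

With a mapping instead of a fixed memory/vcores pair, adding a new type such as disk or network bandwidth needs no change to the fit check, which is why the slide ties better isolation support to this feature.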
  • 30. Future Work List (II)
      ⬢ Simplified and first-class support for services in YARN
        – YARN-4692
        – Container restart (YARN-3988)
          • Allow container restart without losing allocation
        – Service discovery via DNS (YARN-4757)
          • Running services can be discovered via DNS
        – Allocation re-use (YARN-4726)
          • Allow AMs to stop a container but not lose resources on the node
      ⬢ Enhance Docker support
        – YARN-3611
        – Support to mount volumes
        – Isolate containers using CGroups
      ⬢ ATS v2 Phase 2
        – YARN-2928 (Phase 1), YARN-5355 (Phase 2)
        – Run timeline service on HBase
        – Support for more data, better performance
      ⬢ Also in the pipeline
        – Switch to Java 8 with Hadoop 3.0
        – Add support for GPU isolation
        – Better tools to detect limping nodes
        – New RM UI: YARN-3368
  • 31. HDP Evolution with Apache Hadoop YARN
  • 32. Thank you!