Apache Hadoop 3 is coming! As the next major milestone for Hadoop and big data, it attracts everyone's attention by showcasing several bleeding-edge technologies and significant features across all components of Apache Hadoop: erasure coding in HDFS, Docker container support, Apache Slider integration and native service support, Application Timeline Service version 2, Hadoop library updates and client-side classpath isolation, and more. In this talk, we first give an update on the Hadoop 3.0 release work in the Apache community and the path through alpha and beta towards GA. We then dive deep into each new feature, including its development progress and maturity in Hadoop 3. Last but not least, as a new major release, Hadoop 3.0 contains some incompatible API and CLI changes that can make upgrades challenging for downstream projects and existing Hadoop users; we walk through these major changes and explore their impact on other projects and users.
2. Andrew Wang
● HDFS @ Cloudera
● Hadoop PMC Member
● Release Manager for Hadoop 3.0
Junping Du
● YARN @ Hortonworks
● Hadoop PMC Member
● Release Manager for Hadoop 2.8
Who We Are
3. An Abbreviated History of Hadoop Releases
Date Release Major Notes
2007-11-04 0.14.1 First release at the ASF
2011-12-27 1.0.0 Security, HBase support
2012-05-23 2.0.0 YARN, NameNode HA, wire compat
2014-11-18 2.6.0 HDFS encryption, rolling upgrade, node labels
2015-04-21 2.7.0 Most recent production-quality release line
4. Motivation for Hadoop 3
● Upgrade minimum Java version to Java 8
○ Java 7 end-of-life in April 2015
○ Many Java libraries now only support Java 8
● HDFS erasure coding
○ Major feature that refactored core pieces of HDFS
○ Too big to backport to 2.x
● YARN as a data/container cloud
○ Significant changes to support Docker and native services in YARN
● Other miscellaneous incompatible bugfixes and improvements
○ Hadoop 2.x was branched in 2011
○ 6 years of changes waiting for 3.0
5. Hadoop 3 status and release plan
● A series of alphas and betas leading up to GA
● Alpha4: feature complete
● Beta1: compatibility freeze
● GA by the end of the year
Release / Date
3.0.0-alpha1 2016-09-03 ✔
3.0.0-alpha2 2017-01-25 ✔
3.0.0-alpha3 2017-05-16 ✔
3.0.0-alpha4 2017 Q2
3.0.0-beta1 2017 Q3
3.0.0 GA 2017 Q4
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release
17. EC Reconstruction
[Diagram: /foo.csv is a 3-block file (b1, b2, b3) stored with two parity blocks (p1, p2) under Reed-Solomon (3,2). When b3 is lost, the 3 remaining blocks are read, Reed-Solomon decoding is run, and a new copy of b3 is recovered.]
18. EC implications
● File data is striped across multiple nodes and racks
● Reads and writes are remote and cross-rack
● Reconstruction is network-intensive: rebuilding one block reads a data stripe’s worth of surviving blocks cross-rack
● Important to use Intel’s optimized ISA-L for performance
○ 1+ GB/s encode/decode speed, much faster than the pure-Java implementation
○ CPU is no longer a bottleneck
● Need to combine data into larger files to avoid an explosion in replica count (worked example below)
○ Bad: 1x1GB file -> RS(10,4) -> 14x100MB EC blocks (4.6x # replicas)
○ Good: 10x1GB file -> RS(10,4) -> 14x1GB EC blocks (0.46x # replicas)
● Works best for archival / cold data use cases
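To make the replica-count arithmetic in the bullets above concrete, here is a minimal standalone sketch of the same calculation; the constants and helper names are illustrative only, not Hadoop APIs.

```java
// Minimal sketch of the replica-count arithmetic from the bullets above.
// Assumes, as the slide does, that a replicated file of this size is stored
// as 1 GB blocks with replication factor 3, while RS(10,4) striping splits
// the same data across 10 data blocks plus 4 parity blocks per stripe.
public class EcReplicaMath {
  static final int DATA = 10, PARITY = 4;   // RS(10,4)
  static final int REPLICATION = 3;

  // Physical block count when data is striped with RS(10,4): every stripe
  // carries 10 data blocks plus 4 parity blocks, even if the data blocks
  // end up very small.
  static int ecBlocks(int stripes) {
    return stripes * (DATA + PARITY);
  }

  public static void main(String[] args) {
    // Bad: a single 1 GB file -> one stripe of 14 x 100 MB EC blocks,
    // versus 1 block x 3 replicas with plain replication.
    int bad = ecBlocks(1);
    System.out.printf("1 GB file:  EC=%d blocks vs replicated=%d (%.1fx)%n",
        bad, 1 * REPLICATION, (double) bad / (1 * REPLICATION));

    // Good: ten 1 GB files combined into one 10 GB file -> one stripe of
    // 14 x 1 GB EC blocks, versus 10 blocks x 3 replicas with replication.
    int good = ecBlocks(1);
    System.out.printf("10 GB file: EC=%d blocks vs replicated=%d (%.2fx)%n",
        good, 10 * REPLICATION, (double) good / (10 * REPLICATION));
  }
}
```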
22. Erasure coding status
● Massive development effort by the Hadoop community
○ 20+ contributors from many companies (Cloudera, Intel, Hortonworks, Huawei, Y! JP, …)
○ 100s of commits over three years (started in 2014)
● Erasure coding is feature complete!
● Solidifying some user APIs in preparation for beta1 (usage sketch below)
● Current focus is on testing and integration efforts
○ Want the complete Hadoop stack to work with HDFS erasure coding enabled
○ Stress / endurance testing to ensure stability
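As a taste of those user APIs, below is a rough sketch of enabling a built-in EC policy on a directory via the HDFS client. The directory path is hypothetical, and the method names reflect the client API as it was being stabilized for beta1, so verify them against the release you run.

```java
// Rough sketch: enable a built-in erasure coding policy on a directory via
// the HDFS client API. The path and policy choice are examples only; the
// setErasureCodingPolicy/getErasureCodingPolicy calls reflect the API being
// stabilized for beta1 and may differ in the final release.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class EnableEcPolicy {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      Path coldData = new Path("/archive/cold");   // hypothetical directory
      dfs.mkdirs(coldData);
      // Apply one of the built-in Reed-Solomon policies to the directory;
      // new files written under it are striped instead of replicated.
      dfs.setErasureCodingPolicy(coldData, "RS-10-4-1024k");
      System.out.println("Policy now: " + dfs.getErasureCodingPolicy(coldData));
    }
  }
}
```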
23. Classpath isolation (HADOOP-11656)
● Hadoop leaks many dependencies onto the application’s classpath
○ Known offenders: Guava, Protobuf, Jackson, Jetty, …
● No separate HDFS client jar means server jars are leaked
● YARN / MR clients are not shaded
● HDFS-6200: Split the HDFS client into a separate JAR
● HADOOP-11804: Shaded hadoop-client dependency (see the sketch below)
● YARN-6466: Shade the task umbilical for a clean YARN container environment (ongoing)
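To illustrate what the shaded client buys downstream applications, the sketch below is plain client code that touches only public Hadoop classes; the intent of HADOOP-11804 is that it compiles against the shaded hadoop-client-api artifact and runs with hadoop-client-runtime, keeping Hadoop’s own Guava, Jetty, etc. off the application’s classpath. The artifact usage described here is our reading of HADOOP-11804, not something stated on the slide.

```java
// Sketch of application code meant to depend only on the shaded client
// artifacts from HADOOP-11804 (hadoop-client-api at compile time,
// hadoop-client-runtime at run time), so Hadoop's internal Guava/Jetty/etc.
// never leak onto this application's classpath.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShadedClientExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      // Only public, stable Hadoop classes are referenced here; everything
      // else Hadoop needs stays relocated inside the shaded runtime jar.
      for (FileStatus status : fs.listStatus(new Path("/"))) {
        System.out.println(status.getPath() + "\t" + status.getLen());
      }
    }
  }
}
```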
24. Miscellaneous
● Shell script rewrite
● Support for multiple Standby NameNodes
● Intra-DataNode balancer
● Support for Microsoft Azure Data Lake and Aliyun OSS
● Move default ports out of the ephemeral range
● S3 consistency and performance improvements (ongoing)
● Tightening the Hadoop compatibility policy (ongoing)
26. Apache Hadoop 3.0 - YARN Enhancements
● Built-in support for Long Running Services
● Better resource isolation and Docker!!
● YARN Scheduling Enhancements
● Re-architecture for YARN Timeline Service - ATS v2
● Better User Experiences
● Other Enhancements
27. YARN Native Service - Key Drivers
● Consolidation of Infrastructure
○ Hadoop clusters have a lot of compute and storage resources (some unused)
○ Can I use Hadoop’s resources for non-Hadoop workloads?
○ Other open-source infrastructure/cloud stacks are hard to run; can I use YARN instead?
○ But does it support Docker? Yes, we heard you
○ Can we run Hadoop ecosystem services (Hive, HBase, etc.) on YARN?
■ Benefit from YARN’s elasticity and resource management
28. Built-in support for long-running services in YARN
● A native YARN framework - YARN-4692
○ A common framework abstraction for long-running services
■ Similar to Slider
○ A more simplified API
● Recognition of long-running services
○ Affects the policies for preemption, container reservation, etc.
○ Auto-restart of containers
○ Long-running containers are retried on the same node when they have local state
● Service/application upgrade support - YARN-4726
○ Services are expected to run long enough to cross versions
● Dynamic container configuration
● Service Discovery
○ Expose existing service information in YARN registry via DNS (YARN-4757)
29. Docker on YARN
● Why Docker?
○ Lightweight mechanism for packaging, distributing, and isolating processes
○ Most popular containerization framework
○ Makes packaging new apps for YARN easier
■ TensorFlow, etc.
○ Lets YARN focus on integration instead of container primitives
○ Mostly fits into the YARN container model
30. Docker on YARN (Contd.)
● Docker support in LinuxContainerExecutor
○ YARN-3611 (umbrella)
○ Multiple container types are supported in the same executor
○ A new Docker container runtime is introduced that manages Docker containers
○ LinuxContainerExecutor can delegate to either runtime on a per-application basis
○ Clients specify which container type they want to use
■ Currently via environment variables, eventually through well-defined client APIs (see the sketch below)
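A rough sketch of the client side of that choice: the two environment variables below are the ones documented for the Docker runtime, while the image, command, and the rest of the launch context are placeholders.

```java
// Rough sketch: ask the LinuxContainerExecutor's Docker runtime to run a
// container by setting runtime-selection environment variables on the
// ContainerLaunchContext. The image and command are placeholders; the
// variable names follow the Docker-on-YARN documentation.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

public class DockerContainerRequest {
  public static ContainerLaunchContext dockerLaunchContext() {
    Map<String, String> env = new HashMap<>();
    env.put("YARN_CONTAINER_RUNTIME_TYPE", "docker");
    env.put("YARN_CONTAINER_RUNTIME_DOCKER_IMAGE", "library/ubuntu:16.04");

    return ContainerLaunchContext.newInstance(
        Collections.emptyMap(),                        // no local resources
        env,                                           // runtime selection via env
        Collections.singletonList("sleep 3600"),       // placeholder command
        null, null, null);                             // service data, tokens, ACLs
  }
}
```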
32. Scheduling Enhancements
● Generic Resource Types
○ Abstract resource types to allow new resources such as GPU, network, etc.
○ Resource profiles for containers
■ small, medium, large, etc., similar to EC2 instance types
● Global Scheduling: YARN-5139
○ Replace heartbeat-triggered scheduling with a global scheduler that uses parallel threads
○ Globally optimal placement
■ Critical for long-running services: they stick to their allocation, so it had better be a good one
■ Enhanced container scheduling throughput (6-8x)
33. Scheduling Enhancements (Contd.)
● Other CapacityScheduler improvements
○ Queue Management Improvements
■ More Dynamic Queue reconfiguration
■ REST API support for queue management
○ Absolute resource configuration support
○ Priority support for applications and queues
○ Preemption improvements
■ Inter-Queue preemption support
● FairScheduler improvements
○ Preemption improvements
■ Preemption considers resource requirements of starved applications
○ Better defaults:
■ Assign multiple containers in a heartbeat based on resource availability
34. Application Timeline Service v2
● ATS captures system/application events and metrics
● v2 improvements:
○ Enhanced data model: flows, configs, etc. are first-class citizens (entity sketch below)
○ Scalable backend: HBase
○ Distributed readers/writers
○ Others
■ Captures system metrics, e.g. memory/CPU usage per container over time
■ Efficient updates: just write a new version to the appropriate HBase cell
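For a feel of the enhanced data model, here is a rough sketch that builds an ATSv2 entity carrying a time-series metric, assuming the org.apache.hadoop.yarn.api.records.timelineservice classes in Hadoop 3; the entity type, id, and metric name are made up, and the TimelineV2Client plumbing an application master would use to publish it is omitted.

```java
// Rough sketch: build an ATSv2 entity with a time-series metric. Assumes the
// timelineservice record classes shipped with Hadoop 3; entity type, id and
// metric name are made up, and publishing via TimelineV2Client is omitted.
import org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timelineservice.TimelineMetric;

public class AtsV2EntitySketch {
  public static void main(String[] args) {
    TimelineEntity entity = new TimelineEntity();
    entity.setType("MY_APP_METRICS");            // hypothetical entity type
    entity.setId("container_metrics_0001");      // hypothetical entity id
    entity.setCreatedTime(System.currentTimeMillis());

    // Each (timestamp, value) pair becomes a new version of the same HBase
    // cell on the backend, rather than rewriting the whole record.
    TimelineMetric memoryMb = new TimelineMetric(TimelineMetric.Type.TIME_SERIES);
    memoryMb.setId("MEMORY_MB");                 // hypothetical metric name
    long now = System.currentTimeMillis();
    memoryMb.addValue(now - 1000, 1024L);
    memoryMb.addValue(now, 2048L);
    entity.addMetric(memoryMb);

    System.out.println(entity.getType() + "/" + entity.getId()
        + " metrics=" + entity.getMetrics().size());
  }
}
```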
36. YARN New WebUI
● Improved visibility into cluster usage
○ Memory, CPU
○ By queues and applications
○ Sunburst graphs for hierarchical queues
○ NodeManager heatmap
● ATSv2 integration
○ Plot container start/stop events
○ Easy to capture delays in app execution
37. Misc. YARN/MR improvements
● Opportunistic containers (YARN-2877 & YARN-5542)
○ Motivation: resource utilization is typically low in most clusters
○ Solution: run some containers at a lower priority, preempting them as and when needed for guaranteed containers
● YARN Federation (YARN-2915 & YARN-5597)
○ Allows YARN to scale to 100k nodes and beyond
● HA improvements
○ Better handling of transient network issues
○ ZK-store scalability: Limit number of children under a znode
● MapReduce Native Collector (MAPREDUCE-2841)
○ Native implementation of the map output collector (config sketch below)
○ Up to 30% faster for shuffle-intensive jobs
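A minimal sketch of opting a job into the native collector; the property name and delegator class are our understanding of MAPREDUCE-2841 and should be checked against the shipped documentation.

```java
// Minimal sketch: opt a MapReduce job into the native map output collector
// from MAPREDUCE-2841. The property name and delegator class reflect our
// understanding of that feature; verify them against the release docs.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class NativeCollectorJob {
  public static Job configure() throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapreduce.job.map.output.collector.class",
        "org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator");
    // The native collector only replaces the map-side sort/spill path;
    // mappers, reducers, and the shuffle itself are unchanged.
    return Job.getInstance(conf, "shuffle-heavy-job"); // placeholder job name
  }
}
```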
38. Summary: What’s new in Hadoop 3.0?
● Storage Optimization
○ HDFS: Erasure codes
● Improved Utilization
○ YARN: Long Running Services
○ YARN: Scheduling Enhancements
● Additional Workloads
○ YARN: Docker & Isolation
● Easier to Use
○ New User Interface
● Refactor Base
○ Lots of Trunk content
○ JDK8 and newer dependent libraries
40. Compatibility
● Strong feedback from large users on the need for compatibility
● Preserves wire-compatibility with Hadoop 2 clients
○ Impossible to coordinate upgrading off-cluster Hadoop clients
● Will support rolling upgrade from Hadoop 2 to Hadoop 3
○ Can’t take downtime to upgrade a business-critical cluster
● Not fully preserving API compatibility!
○ Dependency version bumps
○ Removal of deprecated APIs and tools
○ Shell script rewrite, rework of Hadoop tools scripts
○ Incompatible bug fixes
41. Testing and validation
● Extended alpha → beta → GA plan designed for stabilization
● EC already has some users in production (700 nodes at Y! JP)
● Cloudera is rebasing CDH against upstream and running full test suite
○ Integration of Hadoop 3 with all components in CDH stack
○ Same integration tests used to validate CDH5
● Hortonworks is also integrating and testing Hadoop 3
● Plans for extensive HDFS EC testing by Cloudera and Intel
● Happy synergy between 2.8.x and 3.0.x lines
○ Shares much of the same code, fixes flow into both
○ Yahoo! doing scale testing of 2.8.0
42. Conclusion
● Expect Hadoop 3.0.0 GA by the end of the year
● Shiny new features
○ HDFS Erasure Coding
○ YARN Docker and Native Service Support
○ YARN ATSv2
○ Client classpath isolation
● Great time to get involved in testing and validation
● Come to the BoFs on Thursday at 5PM
○ YARN BoF: Room 211
○ HDFS BoF: Room 212