Hadoop 0.23 contains major architectural changes in both the HDFS and Map-Reduce frameworks. The fundamental changes include HDFS (Hadoop Distributed File System) Federation and YARN (Yet Another Resource Negotiator), which overcome the current scalability limitations of HDFS and the JobTracker. Despite the major architectural changes, the impact on user applications and the programming model has been kept to a minimum, to ensure that existing user Hadoop applications written against Hadoop 20 will continue to function with minimal changes. In this talk we will discuss the architectural changes that Hadoop 23 introduces and compare it to Hadoop 20. Since this is the biggest major release of Hadoop adopted at Yahoo! in 3 years (after Hadoop 20), we will talk about the customer impact and potential deployment issues of Hadoop 23 and its ecosystem. The deployment of Hadoop 23 at Yahoo! is an ongoing process and is being conducted in a phased manner on our clusters.
Presenter: Viraj Bhat, Principal Engineer, Yahoo!
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
1. Hadoop 23 (dotNext): Experiences, Customer Impact & Deployment
Hadoop User Group Sunnyvale Meetup – 17 October 2012
Viraj Bhat: viraj@yahoo-inc.com
2. About Me
• Principal Engineer in the Yahoo! Grid Team since May 2008
• PhD from Rutgers University, NJ
– Specialization in Data Streaming, Grid, Autonomic Computing
• Worked on streaming data from live simulations executing at NERSC (CA) and ORNL (TN) to the Princeton Plasma Physics Lab (PPPL - NJ)
– Library introduced less than 5% overhead on computation
• PhD thesis on in-transit data processing for peta-scale simulation workflows
• Developed the CorbaCoG kit for Globus
• Active contributor to Apache Hadoop, Pig, HCat and developer of Hadoop Vaidya
4. Hadoop Technology Stack at Yahoo!
• HDFS – Distributed File System
• Map/Reduce – Data Processing Paradigm
• HBase and HFile – columnar storage
• PIG – Data Processing Language
• HIVE – SQL-like query processing language
• HCatalog – Table abstraction on top of big data; allows interaction with Pig and Hive
• Oozie – Workflow Management System
[Stack diagram, top to bottom: Oozie; HCatalog; Hive / PIG; Map Reduce; HBase; File Format (HFile); HDFS]
5. Hadoop 0.23 (dotNext) Highlights
• First major Hadoop release adopted by Yahoo! in over 2 years (after Hadoop 0.20)
– Built and stabilized by the Yahoo! Champaign Hadoop team
• Primary focus is scalability
– YARN aka MRv2 – job run reliability
• Agility & Evolution
– HDFS Federation – larger namespace & scalability
• Larger aggregated namespace
• Helps better storage consolidation at Yahoo!
• Undergoing customer testing
• The Hadoop 23 release does not target availability
– Addressed in Hadoop 2.0 and beyond
6. Hadoop 23 Story at Yahoo!
• Extra effort was taken at Yahoo! to certify applications with Hadoop 23
• Sufficient time was provided for users to test their applications on Hadoop 23
• Users are encouraged to get accounts to test whether their applications run on a sandbox cluster with Hadoop 23 installed
• Roll-out plan – in progress
– Q4 2012 through Q1 2013: Hadoop 23 will be installed in a phased manner on 50k nodes at Yahoo!
– 3 large customer Grids were successfully upgraded to Hadoop 23
9. Paradigm shift with Hadoop 23
• Split up the two major functions of JobTracker
– Cluster resource management
– Application life-cycle management
• MapReduce becomes a user-land library
10. Components of YARN
• Resource Manager
– Global resource scheduler
– Hierarchical queues
• Node Manager
– Per-machine agent
– Manages the life-cycle of containers
– Container resource monitoring
• Application Master
– Per-application
– Manages application scheduling and task execution
14. Experiences of YARN – High Points
• Scalable
– Largest YARN cluster in the world built at Yahoo!, running Hadoop 0.23.3, with no scalability issues so far
– Ran tests to validate that YARN should scale to 10,000 nodes
• Surprisingly Stable
• Web Services
• Better Utilization of Resources at Yahoo!
– No fixed partitioning between Map and Reduce tasks
– Latency from resource available to resource re-assigned is far better than 1.x on big clusters
15. Performance (0.23.3 vs. 1.0.2)
• HDFS
– Read (Throughput 5.37% higher)
• MapReduce
– Sort (Runtime 4.59% smaller, Throughput 3.98% higher)
– Shuffle (Shuffle Time 13.25% smaller)
– Gridmix (Runtime 5.29% smaller)
– Small Jobs – Uber AM (Word Count 3.5x faster, 27.7x fewer resources)
17. The Not So Good
• Oozie on YARN can have potential deadlocks (MAPREDUCE-4304)
– UberAM can mitigate this
• Some UI scalability issues (YARN-151, MAPREDUCE-4720)
– Some pages download very large tables and paginate in JavaScript
• Minor incompatibilities in the distributed cache
• No generic history server (MAPREDUCE-3061)
• AM failures hard to debug (MAPREDUCE-4428, MAPREDUCE-3688)
20. Non Federated HDFS Architecture
• Single Namespace Volume
– Namespace Volume = Namespace + Block Storage
• Single Namenode with a namespace
– Entire namespace is in memory
– Provides Block Management
• Datanodes store block replicas
– Block files stored on local file system
[Diagram: one Namenode holding the namespace (NS) and Block Management; Datanodes beneath it providing block storage]
21. Limitation - Single Namespace
• Scalability
– Storage scales horizontally – the namespace doesn't
– Limited number of files, dirs and blocks
• 250 million files and blocks at 64GB Namenode heap size
• Performance
– File system operations throughput limited by a single node
• 120K read ops/sec and 6000 write ops/sec
• Poor Isolation
– All the tenants share a single namespace
• A separate volume per tenant is not possible
– Lacks separate namespaces for different categories of applications
• Experimental apps can affect production apps
• Example – HBase could use its own namespace
• Isolation is a problem, even in a small cluster
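The slide's sizing numbers imply a per-object heap cost, which is worth making explicit. A quick back-of-the-envelope check, using only the figures quoted above (these are the slide's rough capacity numbers, not official sizing guidance):

```python
# Rough Namenode heap cost per namespace object, derived from the
# slide's figures: ~250 million files+blocks fit in a 64 GB heap.
# These are illustrative numbers from the talk, not official sizing.

heap_bytes = 64 * 1024**3          # 64 GB Namenode heap
objects = 250_000_000              # files + blocks

bytes_per_object = heap_bytes / objects
print(f"~{bytes_per_object:.0f} bytes of heap per file/block object")
# -> ~275 bytes of heap per file/block object

# At the same per-object cost, a hypothetical 128 GB heap would hold:
print(f"~{128 * 1024**3 / bytes_per_object / 1e6:.0f} million objects")
# -> ~500 million objects
```

This linear scaling is exactly why the namespace, unlike block storage, cannot grow horizontally on a single Namenode.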
22. HDFS Federation
[Diagram: Namenodes NN-1 … NN-k … NN-n, each owning a Namespace Volume (NS1 … NSk … NSn, including a foreign NS) with its Block Pool (Pool 1 … Pool k … Pool n); Datanodes 1 … m form the common block storage shared by all pools]
• An administrative/operational feature for better managing the resources required at Yahoo!
• Multiple independent Namenodes and Namespace Volumes in a cluster
› Namespace Volume = Namespace + Block Pool
• Block Storage as a generic storage service
› The set of blocks for a Namespace Volume is called a Block Pool
› DNs store blocks for all the Namespace Volumes – no partitioning
23. Managing Namespaces
• Federation has multiple namespaces
• Client-side implementation of mount tables
– No single point of failure
– No hotspot for root and top-level directories
• Applications using Federation should use the viewfs:// schema
– The viewfs:// URI schema can be used as the default file system, replacing the hdfs:// schema
[Diagram: client-side mount table mapping /data, /project, /home and /tmp to namespaces NS1–NS4]
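To make the client-side mount table concrete, here is a minimal sketch of the longest-prefix resolution it performs. The mount points and namenode URIs below are hypothetical, and this is not the actual ViewFs code (real mount tables are configured in core-site.xml under `fs.viewfs.mounttable.*`); it only illustrates how a viewfs path maps to a namenode-specific hdfs URI:

```python
# Sketch of client-side mount-table resolution as used by viewfs://
# in federated HDFS. Mount points and namenode hosts are invented
# for illustration; this is not the ViewFs implementation.

MOUNT_TABLE = {
    "/data":    "hdfs://nn1.example.com:8020/data",
    "/project": "hdfs://nn2.example.com:8020/project",
    "/home":    "hdfs://nn3.example.com:8020/home",
    "/tmp":     "hdfs://nn4.example.com:8020/tmp",
}

def resolve(viewfs_path: str) -> str:
    """Map a viewfs path to a namenode-specific hdfs URI via
    longest-prefix match over the mount table."""
    best = max(
        (m for m in MOUNT_TABLE
         if viewfs_path == m or viewfs_path.startswith(m + "/")),
        key=len,
        default=None,
    )
    if best is None:
        raise ValueError(f"no mount point for {viewfs_path}")
    return MOUNT_TABLE[best] + viewfs_path[len(best):]

print(resolve("/data/feeds/2012/10/17"))
# -> hdfs://nn1.example.com:8020/data/feeds/2012/10/17
```

Because the table is resolved purely on the client, there is no central mount service to become a single point of failure or a hotspot for the root directory.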
24. Hadoop 23 Federation
• Federation testing is underway
– Many ecosystem components such as Pig have completed testing
– Real load testing will only be possible when multiple co-located Grids transition to Hadoop 23
• Adoption of Federation will allow for better consolidation of storage resources
– Many data feeds are duplicated across various Grids
26. Hadoop 23 Command Line
• New environment variables:
– $HADOOP_COMMON_HOME
– $HADOOP_MAPRED_HOME
– $HADOOP_HDFS_HOME
• Using the hadoop command to execute mapred or hdfs sub-commands has been deprecated
– Old usage (will still work):
– hadoop queue -showacls
– hadoop fs -ls
– hadoop job -kill <job_id>
– New usage:
– mapred queue -showacls
– hdfs dfs -ls <path>
– mapred job -kill <job_id>
27. Hadoop 23 Map Reduce
• Applications built against Hadoop 1.0.2 will not run as-is on Hadoop 0.23
• Hadoop 0.23 is API-compatible with Hadoop 0.20.205 and Hadoop 1.0.2
– Not binary compatible
• Hadoop Java programs will not require any code change, but users have to recompile against Hadoop 0.23
– If a code change is required, please let us know
• Streaming applications should work without modifications
• Hadoop Pipes (C/C++ interface) applications will require recompilation against the new libraries
28. Hadoop 23 Compatibility - Pig
• Pig versions 0.9.2, 0.10 and beyond will be fully supported on Hadoop 0.23
– Packaging problem: generating 2 different pig.jar files built against different versions of Hadoop
• No changes in a Pig script if it uses relative paths in HDFS
• Changes in a Pig script are required if an HDFS absolute path (hdfs://) is used
– HDFS Federation, part of Hadoop 23, requires the usage of viewfs:// (HDFS discussion to follow)
– Change the hdfs:// schema to the viewfs:// schema
• Java UDFs must be re-compiled with a Hadoop 23 compatible jar
– Custom Loaders and Storers in Pig are affected
29. Hadoop 23 Compatibility - Oozie
• Oozie 3.1.4 and later versions are compatible with Hadoop 23
• No changes in workflow definitions or job properties
– No need to redeploy the Oozie coordinator jobs
• Java code, streaming and pipes apps need to be recompiled with Hadoop 0.23 jars for binary compatibility
• Existing user workflow and coordinator definitions (XML) should continue to work as expected
• It is the users' responsibility to package the right Hadoop 23 compatible jars
• A Hadoop 23 compatible pig.jar needs to be packaged for the Pig action
30. Hadoop 23 - Oozie Dev Challenges
• Learning curve for Maven builds
– Build iterations, local Maven staging repo staleness
• Queue configurations and container allocations require revisiting the design
• Many iterations of Hadoop 23 deployment
– Overhead to test Oozie compatibility with each new release
• Initial deployment of YARN did not have a view of the Application Master (AM) logs
– Manual ssh to the AM for debugging launcher jobs
31. Hadoop 23 Compatibility - Hive
• Hive versions 0.9 and upwards are fully supported
• Hive SQL/scripts should continue to work without any modification
• Java UDFs in Hive must be re-compiled with a Hadoop 23 compatible hive.jar
32. Hadoop 23 – Hive Dev Challenges
• Code in MiniMRCluster that fetches the stack trace from the JobTracker is deprecated and no longer works
– Extra time spent debugging and rewriting test cases
• Incompatibility of HDFS commands between Hadoop 1.0.2 and 0.23
– -rmr vs. -rm -r
– mkdir vs. mkdir -p
– Results in fixing tests in new ways or inventing workarounds so that they run on both Hadoop 1.0.2 and Hadoop 0.23
• As Hive uses the mapred APIs, more work is required for certification
– Would be good to move to the mapreduce APIs (as, for example, Pig has)
33. Hadoop 23 - HCat
• HCat 0.4 and upwards is certified to work with Hadoop 23
34. Hadoop 23 Job History Log Format
• History API & log format have changed
– Affects all applications and tools that directly use the Hadoop History API
– Logs stored as Avro records serialized in JSON format
• Affected many tools which rely on job logs
– Hadoop Vaidya had to be rewritten with the new JobHistoryParser
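Since the new history logs are JSON-rendered Avro events, tools that used to screen-scrape the old format now parse structured records instead. A minimal sketch of that style of consumption — the two sample event lines are invented for illustration (real .jhist files begin with an Avro schema header, and event/field names depend on the Hadoop version), so this shows the parsing pattern only, not the exact format:

```python
import json

# Sketch: consuming JSON-serialized job-history events of the kind
# Hadoop 23 writes (Avro records rendered as JSON, one event per line).
# The sample lines and their field names are hypothetical.

sample_lines = [
    '{"type":"JOB_SUBMITTED","event":{"jobid":"job_1350000000000_0001","submitTime":1350000000000}}',
    '{"type":"JOB_FINISHED","event":{"jobid":"job_1350000000000_0001","finishTime":1350000600000}}',
]

events = [json.loads(line) for line in sample_lines]

# Pull timestamps out of the structured records instead of regexing text.
submit = events[0]["event"]["submitTime"]
finish = events[1]["event"]["finishTime"]
print(f"job ran for {(finish - submit) / 1000:.0f} s")
# -> job ran for 600 s
```

Tools like Hadoop Vaidya avoid even this much hand-parsing by going through the new JobHistoryParser, which hides the on-disk format entirely.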
35. Hadoop 23 Queues
• Hadoop 23 has support for hierarchical queues
– At Yahoo! it has been configured as a flat queue to limit customer disruption
– Customer testing is being conducted
36. 32/64-bit JDK 1.7
• Currently certifying Hadoop 23 and its ecosystem on the 32-bit 1.7 JDK
• 64-bit 1.7 JDK certification for Hadoop and its ecosystem will be taken up in Q1 2013
37. Hadoop 23 Operations and Services
• Grid Operations at Yahoo! transitioned the Hadoop 1.0.2 Namenode to Hadoop 23 smoothly
– No data was lost
• Matched the container configurations on Hadoop 23 clusters to the old Map Reduce slots
– Map Reduce slots were configured based on memory, hence the transition was smooth
• Scheduling, planning and migration of Hadoop 1.0.2 applications to Hadoop 23 for 100+ customers was a major task for the solutions team
– Many issues caught at the last minute needed emergency fixes (globbing, pig.jar packaging, change in the mkdir command)
– Hadoop 0.23.4 build planned
38. Acknowledgements
• YARN – Robert Evans, Thomas Graves, Jason Lowe
• Pig - Rohini Paliniswamy
• Hive and HCatalog – Chris Drome
• Oozie – Mona Chitnis and Mohammad Islam
• Services and Operations – Rajiv Chittajallu and Kimsukh
Kundu