HUG August 2010: Best practices

Apache Hadoop
Grid Patterns and Anti-Patterns

Arun C Murthy
Yahoo! Grid Team, CCDI
acmurthy@apache.org

Hello!
Who am I?
  Yahoo!
›  Grid Team (CCDI)

›  Lead the Apache Hadoop Map-Reduce Development Team

  Apache
›  Developer on Apache Hadoop since April 2006

›  Committer

›  Member of Apache Hadoop PMC

2 8/18/10

Apache Hadoop
The Software
  Hadoop Distributed File System
  Hadoop Map-Reduce
  Open source from Apache
  Written in Java
  Runs on
›  Linux, Solaris, Mac OS X

›  Commodity hardware


Storage
HDFS
  Designed to store large files
  Stores files as large blocks (64 to 128 MB)
  Each block stored on multiple servers
  Data is automatically re-replicated as needed
  Accessed from command line, Java API or C API


Data Processing
Hadoop Map-Reduce
  Map-Reduce is a programming model for efficient distributed computing
  Efficiency from
›  Streaming through data, reducing seeks
›  Pipelining

  A good fit for a lot of applications
›  Log processing
›  Web index building


Hadoop in the Enterprise
Usage and Importance
  A large number of corporations use Apache Hadoop at scale for several business-critical
applications
›  Large, shared, multi-tenant deployments to minimize fragmentation across organizations

  Millions of dollars at stake!
›  Yahoo
•  Advertising, Search

•  40,000 machines and counting

  http://wiki.apache.org/hadoop/PoweredBy


Hadoop in the Enterprise
… however
  Hadoop isn’t a silver bullet (at least as yet!)
›  Hadoop still depends on users to utilize it effectively
›  Pig/Hive help, but one can still write badly suited queries

  Need to adapt legacy applications to Hadoop, especially the Map-Reduce paradigm
  Efficient usage of Hadoop clusters is critical to getting return on the investment


Best Practices
Input to Applications
  Optimized to process large data-sets
  Pattern: Coalesce processing of multiple small input files into a smaller number of maps,
and use larger HDFS block sizes when processing very large data-sets.
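A minimal sketch of the coalescing idea, in plain Python rather than Hadoop code. It greedily groups small files so that each map gets close to a target amount of input. The function name, the 16 MB file size and the 512 MB target are hypothetical, chosen only for illustration. Hadoop ships a CombineFileInputFormat for this purpose; this sketch does not use it.

```python
def coalesce_into_splits(file_sizes, target_split_bytes):
    """Greedily group files, in the given order, into splits of at most the target size."""
    splits, current, current_bytes = [], [], 0
    for size in file_sizes:
        # Close the current split once adding this file would exceed the target.
        if current and current_bytes + size > target_split_bytes:
            splits.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        splits.append(current)
    return splits


MB = 1024 * 1024
small_files = [16 * MB] * 1000   # hypothetical: 1,000 input files of 16 MB each
splits = coalesce_into_splits(small_files, 512 * MB)
print(len(small_files), "files ->", len(splits), "maps")
```

With one map per file this job would launch 1,000 maps; after coalescing, each map streams through many files instead.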



Best Practices
Map-Reduce - Mappers
  Process multiple-files per map for jobs with very large number of small input files
  Process large chunks of data per-map for large-scale data-processing
›  PetaSort – 66,000 maps with 12.5 GB per map

  The shuffle cross-bar (maps * reduces) is a key performance factor
  Pattern: Applications should use fewer maps to process data in parallel, as few as
possible without having really bad failure recovery cases.
›  Unless the application's maps are heavily CPU bound, there is almost no reason to ever require
more than 60,000-70,000 maps for a single application
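A rough sketch of the arithmetic behind this pattern, assuming each map processes a fixed amount of input. The 100 TB input and the 1,000 reduces are hypothetical numbers; the helper names are not Hadoop APIs.

```python
MB, GB, TB = 1024 ** 2, 1024 ** 3, 1024 ** 4


def map_count(input_bytes, bytes_per_map):
    """Number of maps needed when each map processes bytes_per_map of input."""
    return (input_bytes + bytes_per_map - 1) // bytes_per_map  # ceiling division


def cross_bar(maps, reduces):
    """Shuffle cross-bar size: every reduce fetches output from every map."""
    return maps * reduces


input_bytes = 100 * TB   # hypothetical job input
reduces = 1000           # hypothetical reduce count
for bytes_per_map in (128 * MB, 2 * GB):
    maps = map_count(input_bytes, bytes_per_map)
    print(f"{bytes_per_map // MB} MB per map: {maps} maps, "
          f"cross-bar of {cross_bar(maps, reduces)} map-reduce pairs")
```

Under these assumptions, moving from 128 MB to 2 GB per map brings the job from several hundred thousand maps to within the 60,000-70,000 guideline, and shrinks the cross-bar by the same factor.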


Best Practices
Map-Reduce – Combiner and Shuffle
  Combiner
›  Map-side aggregation to help reduce network traffic for the shuffle
›  Cost of using combiners

  Shuffle
›  Compression of intermediate output

  Pattern: Use combiners judiciously and ensure they really work! Compress intermediate
outputs.
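To make "ensure they really work" concrete, here is a small word-count sketch in plain Python (not the Hadoop API). It compares the number of records a map would send to the shuffle with and without map-side aggregation. The function names and sample lines are hypothetical.

```python
from collections import Counter


def map_without_combiner(lines):
    """Emit one (word, 1) record per word occurrence."""
    return [(word, 1) for line in lines for word in line.split()]


def map_with_combiner(lines):
    """Aggregate counts on the map side: one (word, count) record per distinct word."""
    counts = Counter(word for line in lines for word in line.split())
    return list(counts.items())


def reduction_ratio(lines):
    """Fraction of shuffle records the combiner removes (0.0 means it did not help)."""
    raw = len(map_without_combiner(lines))
    combined = len(map_with_combiner(lines))
    return 1 - combined / raw if raw else 0.0


lines = ["the quick brown fox", "the lazy dog", "the fox"]
print(len(map_without_combiner(lines)), "records without a combiner,",
      len(map_with_combiner(lines)), "with one")
print("all keys unique:", reduction_ratio(["a b c d"]))
```

When nearly every key is unique, the ratio stays close to zero: the combiner then adds cost on the map side without reducing shuffle traffic.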


Best Practices
Map-Reduce – Reducers
  Efficiency depends on shuffle, and the cross-bar
  Configure appropriate number of reduces
›  Too few reduces hurt the nodes
›  Too many hurt the cross-bar

  Pattern: Applications should ensure that each reduce processes at least 1-2 GB of
data, and at most 5-10 GB of data, in most scenarios.
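A rough sizing sketch for this guideline. The 4 GB per-reduce target is an assumption picked from inside the range on the slide, and the 10 TB shuffle size is hypothetical. The helper names are illustrative, not Hadoop APIs.

```python
GB, TB = 1024 ** 3, 1024 ** 4


def reduce_count(shuffle_bytes, target_bytes_per_reduce=4 * GB):
    """Number of reduces so that each handles roughly the target amount of data."""
    return max(1, (shuffle_bytes + target_bytes_per_reduce - 1) // target_bytes_per_reduce)


def within_guideline(shuffle_bytes, reduces, low=1 * GB, high=10 * GB):
    """True if the average data per reduce falls inside the [low, high] range."""
    per_reduce = shuffle_bytes / reduces
    return low <= per_reduce <= high


shuffle_bytes = 10 * TB   # hypothetical amount of data reaching the reduces
reduces = reduce_count(shuffle_bytes)
print(reduces, "reduces; within guideline:", within_guideline(shuffle_bytes, reduces))
print("single reduce within guideline:", within_guideline(shuffle_bytes, 1))
```

The check uses the average per reduce, so it says nothing about skewed keys, where a few reduces receive far more data than the rest.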


Best Practices
Map-Reduce – Output
  Number of output artifacts is linear w.r.t. number of configured reduces
  Compress outputs
  Use appropriate file-formats for the output
›  E.g. compressed text files are not a great idea if you aren’t using a splittable codec

  Think of the consumer of your data-set!
  Consider using larger HDFS block-sizes.
  Pattern: Application outputs should be a few large files, with each file spanning multiple
HDFS blocks and appropriately compressed.


Best Practices
Map-Reduce – Distributed Cache
  Efficient distribution of read-only files for applications
  Designed for small number of mid-sized files
  Pattern: Applications should ensure that artifacts in the distributed-cache do not
require more I/O than the actual input to the application tasks.


Best Practices
Map-Reduce – Counters
  Global (across all tasks) counters, aggregated by the framework
  Expensive!
  Pattern: Applications should not use more than 10, 15 or 25 custom counters.


Best Practices
Map-Reduce – Total Order Outputs
  Sampling Partitioner
›  Do not use a single reducer!
›  E.g. Terasort/Petasort benchmarks

  Joining fully sorted data-sets
›  Do not need same cardinality e.g. number of buckets for the data-sets being joined

  Pattern: Use a sampling partitioner to produce totally ordered output across many reduces, never a single reducer.
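A minimal sketch of how a sampling partitioner yields total order with many reduces, in plain Python. It samples the keys, picks boundary keys from the sorted sample, and routes each key to the reduce that owns its range. All names, the sample size and the four reduces are hypothetical. Hadoop provides TotalOrderPartitioner and InputSampler for this; the sketch does not reproduce their implementation.

```python
import bisect
import random


def split_points(sample, num_reduces):
    """Pick num_reduces - 1 boundary keys at even intervals of the sorted sample."""
    ordered = sorted(sample)
    step = len(ordered) / num_reduces
    return [ordered[int(step * i)] for i in range(1, num_reduces)]


def partition(key, boundaries):
    """Index of the reduce whose key range contains key."""
    return bisect.bisect_right(boundaries, key)


random.seed(0)
num_reduces = 4
keys = [random.randint(0, 1_000_000) for _ in range(10_000)]
boundaries = split_points(random.sample(keys, 500), num_reduces)

# Route every key to its reduce; each reduce then sorts only its own bucket.
buckets = [[] for _ in range(num_reduces)]
for key in keys:
    buckets[partition(key, boundaries)].append(key)

# Concatenating the sorted buckets in reduce order gives a totally ordered output.
output = [key for bucket in buckets for key in sorted(bucket)]
print([len(bucket) for bucket in buckets])
```

Because the boundaries come from a sample of the real keys, the buckets end up roughly balanced, so no single reduce has to sort the whole data-set.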


Best Practices
HDFS – NameNode and JobTracker Operations
  NameNode: Please don’t hurt me!
›  Not yet a silver bullet…
›  Do not perform metadata operations for map/reduce tasks at the backend

  Do not contact the JobTracker for cluster statistics etc. from the backend
  Pattern: Applications should not perform any metadata operations on the file-system
from the backend, they should be confined to the job-client during job-submission.
Furthermore, applications should be careful not to contact the JobTracker from the
backend.


Best Practices
Map-Reduce – Logs and Web-UI
  Tasks’ stdout/stderr stored on TaskTrackers
›  Limit amount of logs

  JobTracker/NameNode Web-UI
›  Do not screen-scrape!


Best Practices
Oozie – Workflows
  Production pipelines are run via Oozie
  Ensure workflows have small number of medium-to-large sized Map-Reduce jobs
›  Collapse smaller jobs

  Pattern: A single Map-Reduce job in a workflow should process at least a few tens of
GB of data.


Anti-Patterns
In a large enough cluster, you see any and all of these…
  Applications not using a higher-level interface such as Pig/Hive
  Processing thousands of small files (sized less than 1 HDFS block, typically 128MB)
with one map processing a single small file.

  Processing very large data-sets with a small HDFS block size, e.g. 128MB, resulting in tens
of thousands of maps.
  Applications with a large number (thousands) of maps with a very small runtime (e.g.
5s).
  Straight-forward aggregations without the use of the Combiner.
  Applications with greater than 60,000-70,000 maps.
  Applications processing large data-sets with very few reduces (e.g. 1).
›  Pig scripts processing large data-sets without using the PARALLEL keyword

›  Applications using a single reduce for total-order among the output records


Anti-Patterns
  Applications processing data with very large number of reduces, such that each reduce
processes less than 1-2GB of data.
  Applications writing out multiple, small, output files from each reduce.
  Applications using the DistributedCache to distribute a large number of artifacts and/or
very large artifacts (hundreds of MBs each).
  Applications using tens or hundreds of counters per task.
  Applications performing metadata operations (e.g. listStatus) on the file-system from
the map/reduce tasks.
  Applications doing screen scraping of JobTracker web-ui for status of queues/jobs or
worse, job-history of completed jobs.
  Workflows comprising hundreds or thousands of small jobs processing small
amounts of data.

Work underway in yahoo-hadoop-0.20.200 to prevent anti-patterns


Blog Post

http://developer.yahoo.net/blogs/hadoop/2010/08/apache_hadoop_best_practices_a.html


Thanks!

Yahoo! Presentation, Confidential 24 8/18/10
