Adaptive MapReduce using Situation-Aware Mappers

Rares Vernica¹ (HP Labs),
Andrey Balmin, Kevin S. Beyer, Vuk Ercegovac (IBM Research)

¹ Work done at IBM Research.

15th International Conference on Extending Database Technology,
March 26-30 2012




 Rares Vernica (HP Labs)            Adaptive MapReduce      EDBT 2012   1 / 25
Outline


1   Motivation

2   Problem Statement

3   Situation-Aware Mappers
       Adaptive Mappers
       Adaptive Combiners
       Adaptive Sampling and Partitioning

4   Summary




MapReduce Review

map    (k,v)       → list(k,v);
reduce (k,list(v)) → list(k,v).

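[Figure: MapReduce data flow. The DFS input is divided into splits (INPUT 1/3, 2/3, 3/3), each read by a MAP task (input: (k,v); output: list(k,v)). Map outputs pass through SHUFFLE and MERGE to the REDUCE tasks (input: (k, list(v)); output: list(k,v)), which write OUTPUT 1/2 and 2/2 to the DFS.]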




combine (k,list(v)) → list(k,v).
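
The three signatures above can be illustrated with a minimal single-process sketch. The word-count logic and the helper names (map_fn, combine_fn, reduce_fn, group_by_key, run_job) are assumptions made for illustration; they are not taken from the slides or from Hadoop's API.

```python
from collections import defaultdict

def map_fn(key, value):
    """map (k,v) -> list(k,v): emit (word, 1) for every word in a line."""
    return [(word, 1) for word in value.split()]

def combine_fn(key, values):
    """combine (k,list(v)) -> list(k,v): pre-aggregate on the map side."""
    return [(key, sum(values))]

def reduce_fn(key, values):
    """reduce (k,list(v)) -> list(k,v): final aggregation per key."""
    return [(key, sum(values))]

def group_by_key(pairs):
    """Group (k,v) pairs into (k, list(v)), sorted by key."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return sorted(groups.items())

def run_job(splits):
    shuffled = []
    for split in splits:  # one map task per input split
        mapped = [kv for k, v in split for kv in map_fn(k, v)]
        for k, vs in group_by_key(mapped):  # combine each map task's output
            shuffled.extend(combine_fn(k, vs))
    output = []
    for k, vs in group_by_key(shuffled):  # shuffle and merge, then reduce
        output.extend(reduce_fn(k, vs))
    return output

# Three toy splits, each holding (line number, line text) records.
splits = [[(0, "a b a")], [(1, "b c")], [(2, "a c c")]]
result = run_job(splits)
print(result)
```

The combiner here only shrinks the data sent to the shuffle; the reducer output is the same with or without it.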

Motivation: MapReduce Issues


MapReduce
     Parallel data-processing framework
     Open-source implementation (Hadoop)
     Simple programming environment

MapReduce: “simplicity over performance”
Limited choice of execution strategies:
     Mappers checkpoint after every split
     Map outputs are sorted and written to file
     Reducers read statically predetermined partitions



Solutions to MapReduce Issues




MapReduce-inspired alternatives
     Dryad (Microsoft)
     Spark (UC Berkeley)
     Hyracks (UC Irvine)
     Nephele (TU Berlin)
These systems offer more choices in runtime execution




Our Solution: Adaptive MapReduce


Make MapReduce (Hadoop) more flexible
    Leverage existing investment in:
           Framework (Hadoop)
           Query processing systems (Jaql, Pig, Hive)
    Techniques for:
           Dynamic checkpoint intervals (Map)
           Best-effort hash-based aggregation (Combine)
           Dynamic, sample-based, partitioning (Reduce)
    Performance tuning:
           Cardinality and cost estimation are hard (due to UDFs)
           Adaptive to runtime environment




Problem Statement: Adaptive MapReduce



Goals
Improve MapReduce (Hadoop) performance by:
    New runtime options
    Adaptive to runtime environment

Preserve Hadoop’s
    Fault-tolerance
    Scalability
    Programmability




Situation-Aware Mappers




Main idea
    Make MapReduce more dynamic
    Mappers:
           Aware of the global state of the job
           Communicate through a distributed meta-data store
           Break assumption: isolation
    Situation-Aware Mappers




Adaptive MapReduce

Distributed Meta-Data Store (DMDS)
    Distributed read/write
    Transactional
    e.g., ZooKeeper

[Figure: MAP tasks read from the DFS and REDUCE tasks write to the DFS. The map side is annotated with the AM, AC, AS, and AP components, which are connected to the DMDS.]

Adaptive Techniques
    AM: Adaptive Mappers
    AC: Adaptive Combiners
    AS: Adaptive Sampling
    AP: Adaptive Partitioning

Adaptive Mappers Motivation

    Input data is divided into splits
    One-to-one correspondence between mappers and splits (regular mappers)
    AM decouples the number of splits from the number of mappers

  Large splits
    Small startup cost
    Imbalanced workload
  Small splits
    Large startup cost
    Balanced workload
  Adaptive Mappers
    Small startup cost
    Balanced workload

[Figure legend: startup cost (e.g., scheduling, loading reference data) and split processing cost.]
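
The trade-off above can be made concrete with a small scheduling simulation. This is only a sketch under simplifying assumptions: tasks are placed greedily on the first free slot, the split costs and the startup cost of 2 are made-up numbers, and the function names are hypothetical.

```python
import heapq

def makespan_regular(split_costs, slots, startup):
    """One mapper task per split; every task pays the startup cost."""
    finish = [0.0] * slots  # time at which each slot becomes free
    heapq.heapify(finish)
    for cost in split_costs:  # each task goes to the first free slot
        free_at = heapq.heappop(finish)
        heapq.heappush(finish, free_at + startup + cost)
    return max(finish)

def makespan_adaptive(split_costs, slots, startup):
    """A fixed set of mappers pays startup once, then pulls splits on demand."""
    finish = [float(startup)] * slots
    heapq.heapify(finish)
    for cost in split_costs:  # the first idle mapper takes the next split
        free_at = heapq.heappop(finish)
        heapq.heappush(finish, free_at + cost)
    return max(finish)

# Skewed processing costs for 8 small splits (assumed values).
small_splits = [6, 3, 2, 1, 1, 1, 1, 1]
# The same data grouped into 2 large splits.
large_splits = [sum(small_splits[:4]), sum(small_splits[4:])]

regular_large = makespan_regular(large_splits, slots=2, startup=2)
regular_small = makespan_regular(small_splits, slots=2, startup=2)
adaptive = makespan_adaptive(small_splits, slots=2, startup=2)
print(regular_large, regular_small, adaptive)
```

In this toy setting the large splits suffer from the skewed first split, the small splits pay the startup cost eight times, and the adaptive variant pays it once per mapper while still balancing the work.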


Adaptive Mappers Algorithm

    Store meta-data in ZooKeeper
    Implemented as a new InputFormat

ZooKeeper layout used by the job:

Root
  JobID
    locations
      Host1
        [Split1, Split2, ...]
      Host2
        ...
    assigned
      Split1{Map2}

[Figure: steps 1 to 5 of the algorithm. (1) links the MapReduce Client to ZooKeeper; (2) the map tasks (Map1, Map2, ...) on Host1, Host2, ... run Init; (3) points at the split list under locations; (4) points at the assigned node, with an OK/Fail result; (5) Map2 processes Split1.]

Adaptive Mappers Algorithm




Additional Features
    Process local splits first, then remote splits
    Fault tolerance
           Restarted task unlocks splits
           Split reprocessing is shared
    Scheduler aware (FIFO, FAIR, and FLEX)
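
A minimal sketch of how mappers might claim splits through a shared meta-data store is given below. It assumes an in-memory stand-in (the hypothetical InMemoryMetaStore class) rather than a real ZooKeeper client, and the path layout and function names are illustrative only. Fault tolerance and scheduler awareness are not modeled.

```python
import threading

class InMemoryMetaStore:
    """Hypothetical stand-in for the DMDS: atomic create-if-absent on paths."""

    def __init__(self):
        self._nodes = {}
        self._lock = threading.Lock()

    def create(self, path, data):
        """Create a node; return True on success, False if it already exists."""
        with self._lock:
            if path in self._nodes:
                return False
            self._nodes[path] = data
            return True

    def get(self, path):
        with self._lock:
            return self._nodes.get(path)

def publish_locations(store, job_id, locations):
    """Client side: publish the list of splits stored on each host."""
    for host, splits in locations.items():
        store.create(f"/{job_id}/locations/{host}", list(splits))

def claim_splits(store, job_id, mapper_id, host, all_hosts):
    """Mapper side: yield the splits this mapper wins, local splits first."""
    ordered_hosts = [host] + [h for h in all_hosts if h != host]
    for h in ordered_hosts:
        for split in store.get(f"/{job_id}/locations/{h}") or []:
            # The create succeeds for exactly one mapper per split (OK);
            # every other mapper gets False (Fail) and moves on.
            if store.create(f"/{job_id}/assigned/{split}", mapper_id):
                yield split

store = InMemoryMetaStore()
locations = {"Host1": ["Split1", "Split2"], "Host2": ["Split3"]}
publish_locations(store, "job1", locations)
hosts = list(locations)

# Interleave two mappers by stepping their generators by hand.
g1 = claim_splits(store, "job1", "Map1", "Host1", hosts)
g2 = claim_splits(store, "job1", "Map2", "Host2", hosts)
first1 = next(g1)
first2 = next(g2)
rest1 = list(g1)
rest2 = list(g2)
print(first1, first2, rest1, rest2)
```

Each split ends up assigned to exactly one mapper, and each mapper starts with the splits stored on its own host.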




Experimental Setting

Hardware
    40-node IBM System x iDataPlex dx340
    Two quad-core Intel Xeon E5540 64-bit 2.83GHz
    32GB RAM
    Four SATA disks
    160 map and 160 reduce slots

Software
    Ubuntu Linux, kernel 2.6.32-24 64-bit server edition
    Java 1.6 64-bit server edition
    Hadoop 0.20.2
    ZooKeeper 3.3.1


Start-up Cost vs. ZooKeeper Overhead


[Figure: bar chart of time (seconds) against number of splits (20, 200, 2000) for Regular Mappers and Adaptive Mappers.]

    2000 1-byte records
    Sleep 1s/record
    5 nodes, 20 map slots
    20-2000 Reg. Mappers
    20 Adaptive Mappers
    Small ZooKeeper overhead
    Large Map startup cost ∼2s/map
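
The setup above admits a back-of-envelope estimate. The sketch below assumes that tasks run in full waves over the 20 map slots, that every task pays the approximate 2 s startup cost, and that ZooKeeper overhead and scheduling delays are zero. The function name is hypothetical and the numbers it produces are model estimates, not measurements from the experiment.

```python
import math

def estimated_job_time(records, splits, slots, sec_per_record, startup_sec):
    """Rough estimate: tasks run in waves, each task pays one startup cost."""
    records_per_split = records / splits
    waves = math.ceil(splits / slots)
    return waves * (startup_sec + records_per_split * sec_per_record)

# Regular mappers: one task per split, for 20, 200, and 2000 splits.
regular = {
    splits: estimated_job_time(2000, splits, slots=20,
                               sec_per_record=1, startup_sec=2)
    for splits in (20, 200, 2000)
}

# Adaptive mappers: always 20 tasks, so startup is paid once per slot
# no matter how finely the input is split.
adaptive = estimated_job_time(2000, 20, slots=20,
                              sec_per_record=1, startup_sec=2)
print(regular, adaptive)
```

Under these assumptions the regular-mapper estimate grows with the number of splits because the startup cost is paid once per split, while the adaptive estimate stays flat.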
Adaptive Mappers Workloads



 1     Set-Similarity Join [Vernica et al., 2010]
              Publication datasets
              DBLP: 1.2M records, 310MB
              CITESEERX: 1.3M records, 1,750MB
              Increased to ×10 and ×100
 2     JOIN
              Single dataset (“fact” table), Sort Benchmark data generator
              Fan-out coefficient (“dimension” table)
              average join fan-out 1 : 30
              TERASORT: 1B records, 93GB




Adaptive Mappers Experiments - Set-Similarity Join


[Figure: bar chart of time (seconds) against split size (MB) for Regular Mappers, compared with Adaptive Mappers (AM).]

    Stage 3: One-Phase Record Join
    Broadcast join equivalent
    DBLP and CITESEERX ×10
    Single wave of AM
    ×3 speedup over default Hadoop split size (64MB)
    Optimal with no tuning
Adaptive Mappers Experiments - JOIN



[Figure: bar chart of time (seconds) against split size (MB) for Regular Mappers, compared with Adaptive Mappers (AM).]

    Map-only job
    1B TERASORT records
    Models a skewed join
    Single wave of AM
    Regular Mappers:
        Large split: data skew
        Small split: scheduling and start-up overhead
    Optimal with no tuning
Adaptive Combiners

Main idea
    Replace sort with hashing
    Reduce serialization, sort, and IO

[Figure: Regular Combiners pipeline: Map, Sort Buffer, Sort, Combine, Merge. Adaptive Combiners add a Hash-group and Combine step. The legend distinguishes user code from data.]

Adaptive Combiners Details



    “Best-effort” aggregation
    Never spill to disk
    Hash-table replacement policies:
           No-Replacement (NR)
           Least-Recently-Used (LRU)
    Implemented as:
           Library for Hadoop
           Optimization choice for Jaql
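
The two replacement policies can be sketched as a bounded in-memory hash table. This is an illustrative sketch, not the library's code: the BestEffortCombiner class, its parameters, and the way misses are counted (every lookup that does not find its key in the table) are assumptions.

```python
from collections import OrderedDict

class BestEffortCombiner:
    """Bounded in-memory hash aggregation that never spills to disk."""

    def __init__(self, capacity, combine, policy="NR"):
        if policy not in ("NR", "LRU"):
            raise ValueError("policy must be 'NR' or 'LRU'")
        self.capacity = capacity
        self.combine = combine      # e.g. lambda a, b: a + b
        self.policy = policy
        self.table = OrderedDict()  # key -> partial aggregate
        self.emitted = []           # records passed downstream
        self.misses = 0

    def add(self, key, value):
        if key in self.table:                # hit: aggregate in place
            self.table[key] = self.combine(self.table[key], value)
            if self.policy == "LRU":
                self.table.move_to_end(key)  # mark as most recently used
            return
        self.misses += 1
        if len(self.table) < self.capacity:  # room left: cache the key
            self.table[key] = value
        elif self.policy == "NR":            # full: pass the record through
            self.emitted.append((key, value))
        else:                                # LRU: evict the oldest entry
            self.emitted.append(self.table.popitem(last=False))
            self.table[key] = value

    def flush(self):
        """Emit all cached partial aggregates at the end of the map task."""
        self.emitted.extend(self.table.items())
        self.table.clear()
        return self.emitted

records = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("c", 1), ("a", 1)]

nr = BestEffortCombiner(capacity=2, combine=lambda x, y: x + y, policy="NR")
lru = BestEffortCombiner(capacity=2, combine=lambda x, y: x + y, policy="LRU")
for key, value in records:
    nr.add(key, value)
    lru.add(key, value)
nr_out = nr.flush()
lru_out = lru.flush()
print(nr_out, nr.misses)
print(lru_out, lru.misses)
```

Either policy may emit several partial aggregates for the same key, which is why the aggregation is only "best-effort": the reducers still produce the final result.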




Adaptive Combiners Experiments

GROUP-BY
    Synthetic dataset with 3 dimensions (A1, A2, and A3) and 1 fact
    Group records and apply aggregation function
    TWL: 10B records, 120GB

[Figure, left panel: GROUP-BY on A1. Time (seconds) for the Reg., AM, AC, and AM,AC configurations, with Regular Combiners, Adaptive Combiners NR, and Adaptive Combiners LRU. ×2.5 speedup.]

[Figure, right panel: GROUP-BY on A1 and A2. Time (seconds) and miss ratio (%) against cache size (K) for Regular Combiners, Adaptive Combiners NR, and Adaptive Combiners LRU, with Miss Ratio NR and Miss Ratio LRU. ×3 speedup.]
Adaptive Sampling and Partitioning

Step 1 Compute and publish local histogram
Step 2 Collect local histograms and compute partitioning function
Step 3 Broadcast partitioning function

[Figure: MAP tasks feeding REDUCE tasks, with the DMDS used to exchange the local histograms and the partitioning function.]
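
The three steps can be sketched as follows. This is a simplified illustration under stated assumptions: each mapper counts all of its keys exactly instead of sampling them, the partitioning function is a list of range boundaries, and the function names are hypothetical. The exchange through the DMDS is replaced by plain Python lists.

```python
from bisect import bisect_left
from collections import Counter

def local_histogram(keys):
    """Step 1: each mapper counts the keys it has seen."""
    return Counter(keys)

def compute_boundaries(histograms, num_reducers):
    """Step 2: merge the local histograms and choose key boundaries
    that split the total count into roughly equal ranges."""
    merged = Counter()
    for h in histograms:
        merged.update(h)
    total = sum(merged.values())
    boundaries, running, next_cut = [], 0, 1
    for key in sorted(merged):
        running += merged[key]
        # Close the current range once it holds its share of the records.
        while next_cut < num_reducers and running >= total * next_cut / num_reducers:
            boundaries.append(key)
            next_cut += 1
    return boundaries

def partition(key, boundaries):
    """Step 3: every mapper applies the same broadcast function to a key."""
    return bisect_left(boundaries, key)

mapper_keys = [[1, 1, 1, 2, 3], [1, 1, 4, 5, 6], [2, 7, 8, 9, 9]]
histograms = [local_histogram(keys) for keys in mapper_keys]
boundaries = compute_boundaries(histograms, num_reducers=2)
loads = Counter(partition(k, boundaries) for keys in mapper_keys for k in keys)
print(boundaries, dict(loads))
```

Because the boundaries are derived from the observed key distribution, the frequent key 1 does not overload one reducer the way a static partitioning could.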
Summary


    Adaptive runtime techniques for MapReduce
    Situation-Aware Mappers
    Make MapReduce more dynamic


    Up to ×3 speedup for well-tuned jobs
    Orders of magnitude speedup for badly tuned jobs
    Never hurt performance
    Configure themselves
    Part of IBM InfoSphere BigInsights




Vernica, R., Carey, M., and Li, C. (2010).
 Efficient parallel set-similarity joins using MapReduce.
 In SIGMOD Conference.




Rares Vernica (HP Labs)   Adaptive MapReduce               EDBT 2012   25 / 25

Contenu connexe

Similaire à Adaptive MapReduce using Situation-Aware Mappers

Similaire à Adaptive MapReduce using Situation-Aware Mappers (20)

Map Reduce
Map ReduceMap Reduce
Map Reduce
 
MapReduce and NoSQL
MapReduce and NoSQLMapReduce and NoSQL
MapReduce and NoSQL
 
Lecture 3: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 3: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 3: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 3: Data-Intensive Computing for Text Analysis (Fall 2011)
 
Introduction to Spark on Hadoop
Introduction to Spark on HadoopIntroduction to Spark on Hadoop
Introduction to Spark on Hadoop
 
MAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxMAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptx
 
Map reduce
Map reduceMap reduce
Map reduce
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce
 
02 Map Reduce
02 Map Reduce02 Map Reduce
02 Map Reduce
 
Hadoop
HadoopHadoop
Hadoop
 
Meethadoop
MeethadoopMeethadoop
Meethadoop
 
Large Scale Data Processing & Storage
Large Scale Data Processing & StorageLarge Scale Data Processing & Storage
Large Scale Data Processing & Storage
 
The Powerful Marriage of Hadoop and R (David Champagne)
The Powerful Marriage of Hadoop and R (David Champagne)The Powerful Marriage of Hadoop and R (David Champagne)
The Powerful Marriage of Hadoop and R (David Champagne)
 
Hadoop World 2011: The Powerful Marriage of R and Hadoop - David Champagne, R...
Hadoop World 2011: The Powerful Marriage of R and Hadoop - David Champagne, R...Hadoop World 2011: The Powerful Marriage of R and Hadoop - David Champagne, R...
Hadoop World 2011: The Powerful Marriage of R and Hadoop - David Champagne, R...
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
 
Stratosphere with big_data_analytics
Stratosphere with big_data_analyticsStratosphere with big_data_analytics
Stratosphere with big_data_analytics
 
Hive
HiveHive
Hive
 
MapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisMapReduce for scientific simulation analysis
MapReduce for scientific simulation analysis
 
Map reduce team and yarn
Map reduce team and yarnMap reduce team and yarn
Map reduce team and yarn
 
Hadoop
HadoopHadoop
Hadoop
 

Adaptive MapReduce using Situation-Aware Mappers

  • 1. Adaptive MapReduce using Situation-Aware Mappers Rares Vernica1 (HP Labs), Andrey Balmin, Kevin S. Beyer, Vuk Ercegovac (IBM Research) 1 Work done at IBM Research. 15th International Conference on Extending Database Technology, March 26-30 2012 Rares Vernica (HP Labs) Adaptive MapReduce EDBT 2012 1 / 25
  • 2. Outline 1 Motivation 2 Problem Statement 3 Situation-Aware Mappers Adaptive Mappers Adaptive Combiners Adaptive Sampling and Partitioning 4 Summary Rares Vernica (HP Labs) Adaptive MapReduce EDBT 2012 2 / 25
  • 3. MapReduce Review map (k,v) → list(k,v); reduce (k,list(v)) → list(k,v). Input: Output: (k,v) list(k,v) Input: Output: DFS MAP (k, list(v)) list(k,v) DFS INPUT 1/3 REDUCE OUTPUT 1/2 INPUT 2/3 MAP OUTPUT 2/2 INPUT 3/3 REDUCE MAP MERGE SHUFFLE combine (k,list(v)) → list(k,v). Rares Vernica (HP Labs) Adaptive MapReduce EDBT 2012 3 / 25
  • 4. MapReduce Review map (k,v) → list(k,v); reduce (k,list(v)) → list(k,v). Input: Output: (k,v) list(k,v) Input: Output: DFS MAP (k, list(v)) list(k,v) DFS INPUT 1/3 REDUCE OUTPUT 1/2 INPUT 2/3 MAP OUTPUT 2/2 INPUT 3/3 REDUCE MAP MERGE SHUFFLE combine (k,list(v)) → list(k,v). Rares Vernica (HP Labs) Adaptive MapReduce EDBT 2012 3 / 25
  • 5. Motivation: MapReduce Issues. MapReduce: parallel data-processing framework; open-source implementation (Hadoop); simple programming environment. MapReduce favors “simplicity over performance”, with a limited choice of execution strategies: mappers checkpoint after every split; map outputs are sorted and written to file; reducers read statically predetermined partitions.
  • 6. Solutions to MapReduce Issues: MapReduce-inspired alternatives, namely Dryad (Microsoft), Spark (UC Berkeley), Hyracks (UC Irvine), and Nephele (TU Berlin), have more choices in runtime execution.
  • 7. Our Solution: Adaptive MapReduce. Make MapReduce (Hadoop) more flexible. Leverage existing investment in the framework (Hadoop) and in query processing systems (Jaql, Pig, Hive). Techniques for: dynamic checkpoint intervals (Map); best-effort hash-based aggregation (Combine); dynamic, sample-based partitioning (Reduce). Performance tuning: cardinality and cost estimation (due to UDFs); adaptive to runtime environment.
  • 8. Problem Statement: Adaptive MapReduce Goals. Improve MapReduce (Hadoop) performance by: new runtime options; adapting to the runtime environment. Preserve Hadoop's fault-tolerance, scalability, and programmability.
  • 14. Situation-Aware Mappers: Main idea. Make MapReduce more dynamic. Mappers are aware of the global state of the job and communicate through a distributed meta-data store. This breaks the isolation assumption: Situation-Aware Mappers.
  • 16. Adaptive MapReduce: Distributed Meta-Data Store (DMDS). Distributed read/write; transactional; e.g., ZooKeeper. [Figure: MapReduce data flow (DFS, MAP, REDUCE, DFS) with the DMDS added.]
  • 17. Adaptive MapReduce: Adaptive Techniques. AM: Adaptive Mappers; AC: Adaptive Combiners; AS: Adaptive Sampling; AP: Adaptive Partitioning. [Figure: MapReduce data flow with MAP tasks annotated AM, AC, AP, and AS, connected to the DMDS.]
  • 19. Adaptive Mappers: Motivation. Input data is divided into splits, with a one-to-one correspondence between mappers and splits. AM decouple the number of splits from the number of mappers. Large splits: small startup cost, imbalanced workload. Small splits: large startup cost, balanced workload. Adaptive Mappers: small startup cost, balanced workload. [Figure legend: startup cost (e.g., scheduling, loading reference data) and split processing cost.]
  • 24. Adaptive Mappers: Algorithm. Meta-data is stored in ZooKeeper; implemented as a new InputFormat. [Figure, steps 1 to 5: (1) the MapReduce client writes split locations to ZooKeeper under Root / JobID / locations, one entry per host, e.g., Host1: [Split1, Split2, ...]; (2) mappers Map1 and Map2 initialize on Host1; (3) a mapper picks a split, e.g., Split1; (4) the split is recorded under assigned, e.g., Split1{Map2}; (5) the result is OK/Fail.]
  • 25. Adaptive Mappers: Algorithm, Additional Features. Process local splits first, then remote splits. Fault tolerance: a restarted task unlocks its splits; split reprocessing is shared. Scheduler aware (FIFO, FAIR, and FLEX).
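The split-claiming loop described on the two algorithm slides can be sketched as follows. This is a minimal single-process illustration, not the authors' InputFormat: InMemoryMetaStore is a hypothetical stand-in for ZooKeeper that models only an atomic "claim if unassigned" operation, and the names try_claim and adaptive_mapper are assumptions. Fault tolerance (unlocking the splits of a restarted task) is not modeled.

```python
import threading


class InMemoryMetaStore:
    """Hypothetical stand-in for the DMDS (e.g., ZooKeeper)."""

    def __init__(self, locations):
        self.locations = locations   # host -> list of split ids
        self.assigned = {}           # split id -> mapper id
        self._lock = threading.Lock()

    def try_claim(self, split, mapper):
        """Atomically assign a split; return False if already taken."""
        with self._lock:
            if split in self.assigned:
                return False         # corresponds to "Fail" in the figure
            self.assigned[split] = mapper
            return True              # corresponds to "OK" in the figure


def adaptive_mapper(store, mapper_id, host, process):
    """Claim and process splits until none are left: local first, then remote."""
    local = list(store.locations.get(host, []))
    remote = [s for h, splits in store.locations.items()
              if h != host for s in splits]
    done = []
    for split in local + remote:
        if store.try_claim(split, mapper_id):
            process(split)
            done.append(split)
    return done
```

Because each mapper keeps claiming splits until the store has none left, the number of splits no longer has to match the number of mappers: a fast mapper simply ends up processing more splits than a slow one.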
  • 26. Experimental Setting. Hardware: 40-node IBM System x iDataPlex dx340; two quad-core Intel Xeon E5540 64-bit 2.83GHz; 32GB RAM; four SATA disks; 160 map and 160 reduce slots. Software: Ubuntu Linux, kernel 2.6.32-24 64-bit server edition; Java 1.6 64-bit server edition; Hadoop 0.20.2; ZooKeeper 3.3.1.
  • 27. Start-up Cost vs. ZooKeeper Overhead. Setup: 2000 1-byte records, sleep 1 s/record; 5 nodes, 20 map slots; 20-2000 Regular Mappers vs. 20 Adaptive Mappers. Findings: small ZooKeeper overhead; large map startup cost, ∼2 s/map. [Figure: time (seconds) vs. number of splits (20, 200, 2000) for Regular Mappers and Adaptive Mappers.]
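A back-of-the-envelope model shows why start-up cost dominates when splits are small. It is an estimate built from the setup numbers on the slide (2000 records, 1 s/record, 20 map slots, roughly 2 s start-up per map), not a reproduction of the measured results; the function name and the assumptions of perfectly even splits and full waves are mine.

```python
import math


def estimated_map_phase_seconds(records, seconds_per_record, splits,
                                slots, startup_seconds):
    """Estimate map-phase time when every split gets its own mapper.

    Assumes equal-sized splits and mappers scheduled in full waves
    of `slots` concurrent tasks; ignores all other overheads.
    """
    records_per_split = records / splits
    waves = math.ceil(splits / slots)
    per_mapper = startup_seconds + records_per_split * seconds_per_record
    return waves * per_mapper


# Regular mappers: one mapper (and one start-up) per split.
for splits in (20, 200, 2000):
    print(splits, estimated_map_phase_seconds(2000, 1, splits, 20, 2))
```

Under these assumptions the estimate is 102 s for 20 splits, 120 s for 200 splits, and 300 s for 2000 splits. With 20 adaptive mappers the start-up cost is paid once per mapper regardless of the number of splits, so the model stays near the 20-split figure plus whatever the meta-data store adds.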
  • 28. Adaptive Mappers: Workloads. (1) Set-Similarity Join [Vernica et al., 2010] on publication datasets: DBLP (1.2M records, 310MB) and CITESEERX (1.3M records, 1,750MB), increased to ×10 and ×100. (2) JOIN: single dataset (“fact” table) from the Sort Benchmark data generator; fan-out coefficient (“dimension” table); average join fan-out 1 : 30; TERASORT: 1B records, 93GB.
  • 29. Adaptive Mappers: Experiments, Set-Similarity Join. Stage 3: One-Phase Record Join (broadcast join equivalent); DBLP and CITESEERX ×10; single wave of AM. Results: ×3 speedup over the default Hadoop split size (64MB); optimal with no tuning. [Figure: time (seconds) vs. split size (2048MB down to 32MB, and AM) for Regular Mappers and Adaptive Mappers.]
  • 30. Adaptive Mappers: Experiments, JOIN. Map-only job; 1B TERASORT records; models a skewed join; single wave of AM. Regular Mappers: large splits suffer from data skew; small splits suffer from scheduling and start-up overhead. AM: optimal with no tuning. [Figure: time (seconds) vs. split size (1024MB down to 8MB, and AM) for Regular Mappers and Adaptive Mappers.]
  • 37. Adaptive Combiners: Main idea. Replace sort with hashing; reduce serialization, sort, and IO. [Figure: Regular Combiners pipeline (Map, Sort Buffer, Sort, Combine, Merge) compared with Adaptive Combiners (Hash-group and Combine); the legend distinguishes user code from data.]
  • 38. Adaptive Combiners: Details. “Best-effort” aggregation: never spill to disk. Hash-table replacement policies: No-Replacement (NR); Least-Recently-Used (LRU). Implemented as: a library for Hadoop; an optimization choice for Jaql.
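A best-effort hash combiner with the two replacement policies can be sketched as below. The slides name the policies but do not spell out their behavior, so the semantics here are an assumption: under NR a new key that does not fit is passed through unaggregated, and under LRU the least recently used entry is evicted and emitted to make room. The class name and its methods are hypothetical and unrelated to the authors' Hadoop library.

```python
from collections import OrderedDict


class BestEffortCombiner:
    """Fixed-capacity in-memory aggregation that never spills to disk."""

    def __init__(self, capacity, policy="LRU", combine=lambda a, b: a + b):
        if policy not in ("NR", "LRU"):
            raise ValueError("policy must be 'NR' or 'LRU'")
        self.capacity = capacity
        self.policy = policy
        self.combine = combine
        self.table = OrderedDict()   # key -> partial aggregate
        self.emitted = []            # records handed on to the next stage

    def add(self, key, value):
        if key in self.table:
            # Cache hit: aggregate in place.
            self.table[key] = self.combine(self.table[key], value)
            if self.policy == "LRU":
                self.table.move_to_end(key)
        elif len(self.table) < self.capacity:
            self.table[key] = value
        elif self.policy == "NR":
            # Table full, no replacement: pass the record through as is.
            self.emitted.append((key, value))
        else:
            # Table full, LRU: evict and emit the oldest entry.
            self.emitted.append(self.table.popitem(last=False))
            self.table[key] = value

    def flush(self):
        """Emit whatever is still cached at the end of the map task."""
        self.emitted.extend(self.table.items())
        self.table.clear()
        return self.emitted
```

Either policy may emit several partial aggregates for the same key, which is why the aggregation is only "best-effort": the reducer still performs the final aggregation, so a cache miss costs extra shuffle volume rather than correctness.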
  • 39. Adaptive Combiners: Experiments. GROUP-BY workload: synthetic dataset with 3 dimensions (A1, A2, and A3) and 1 fact; group records and apply an aggregation function. TWL: 10B records, 120GB. [Figures: time (seconds) for Regular Combiners and Adaptive Combiners (NR, LRU), with and without AM; time (seconds) and miss ratio (%) vs. cache size (K) for NR and LRU. Panel captions: GROUP-BY on A1; GROUP-BY on A1 and A2. Reported speedups: ×2.5 and ×3.]
  • 44. Adaptive Sampling and Partitioning. Step 1: compute and publish local histograms to the DMDS. Step 2: collect the local histograms and compute the partitioning function. Step 3: broadcast the partitioning function. [Figure: MAP tasks exchange histograms through the DMDS before sending data to REDUCE tasks.]
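The three steps can be sketched as below, assuming a range partitioning over sortable keys; the slides do not say which kind of partitioning function is computed, so that choice, along with the function names, is an assumption. Publishing to and reading from the DMDS is replaced by passing plain Python objects around.

```python
from bisect import bisect_left
from collections import Counter


def local_histogram(keys):
    """Step 1: each mapper counts the keys it has seen so far."""
    return Counter(keys)


def compute_boundaries(histograms, num_reducers):
    """Step 2: merge local histograms and pick range boundaries.

    Returns at most num_reducers - 1 keys; partition i receives the keys
    that are <= boundaries[i] and greater than the previous boundary.
    """
    merged = Counter()
    for histogram in histograms:
        merged.update(histogram)
    total = sum(merged.values())
    boundaries = []
    running = 0
    for key in sorted(merged):
        running += merged[key]
        target = total * (len(boundaries) + 1) / num_reducers
        # Close the current partition once it holds its cumulative share.
        if len(boundaries) < num_reducers - 1 and running >= target:
            boundaries.append(key)
    return boundaries


def partition(key, boundaries):
    """Step 3: every mapper applies the broadcast boundaries."""
    return bisect_left(boundaries, key)
```

For instance, merging the histograms of two mappers that saw keys [1, 1, 1, 2] and [2, 3, 4, 5] with two reducers yields the single boundary 2, so the frequent keys 1 and 2 go to the first reducer and keys 3 to 5 go to the second. A single very frequent key can still overload one reducer, since a range boundary cannot split a key.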
  • 45. Summary: Adaptive runtime techniques for MapReduce. Situation-Aware Mappers make MapReduce more dynamic: up to ×3 speedup for well-tuned jobs; orders of magnitude speedup for badly tuned jobs; never hurt performance; configure themselves. Part of IBM InfoSphere BigInsights.
  • 46. Reference: Vernica, R., Carey, M., and Li, C. (2010). Efficient parallel set-similarity joins using MapReduce. In SIGMOD Conference.