Scheduling MapReduce Jobs in HPC Clusters

Marcelo Neves, Tiago Ferreto, Cesar De Rose
marcelo.neves@acad.pucrs.br

Faculty of Informatics, PUCRS
Porto Alegre, Brazil

August 30, 2012
  
Outline

•  Introduction
•  HPC Clusters and MapReduce
•  MapReduce Job Adaptor
•  Evaluation
•  Conclusion
  
Introduction

•  MapReduce (MR)
    –  A parallel programming model
    –  Simplicity, efficiency and high scalability
    –  It has become a de facto standard for large-scale data analysis

•  MR has also attracted the attention of the HPC community
    –  Simpler approach to address the parallelization problem
    –  Highly visible cases where MR has been successfully used by companies like Google, Facebook and Yahoo!
  
HPC Clusters and MapReduce

•  HPC Clusters
    –  Shared among multiple users/organizations
    –  Resource Management System (RMS), such as PBS/Torque
    –  Applications are submitted as batch jobs
    –  Users have to explicitly allocate the resources, specifying the number of nodes and amount of time

•  MR Implementations (e.g. Hadoop)
    –  Have their own complete job management system
    –  Users do not have to explicitly allocate resources
    –  Require a dedicated cluster
  
Problem

•  Two distinct clusters are required

          How to run MapReduce jobs in an existing
          HPC cluster along with regular HPC jobs?
  
Current solutions

•  Hadoop on Demand (HOD) and MyHadoop
    –  Create on-demand MR installations as RMS jobs
    –  Not transparent: users still must specify the number of nodes and amount of time to be allocated

•  MESOS
    –  Shares a cluster between multiple different frameworks
    –  Creates another level of resource management
    –  Management is taken away from the cluster's RMS
  
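MapReduce Job Adaptor

[Figure: architecture diagram. The HPC user submits an HPC job (# of nodes, time) directly to the Resource Management System. The MR user submits an MR job (# of map tasks, # of reduce tasks, job profile) to the MR Job Adaptor, which passes it to the Resource Management System as an MR job (# of nodes, time). The Resource Management System runs both kinds of jobs on the cluster.]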
  
MapReduce Job Adaptor

•  The adaptor has three main goals:
    –  Facilitate the execution of MR jobs in HPC clusters
    –  Minimize the average turnaround time of the jobs
    –  Exploit unused resources in the cluster (the result of the various shapes of HPC job requests)
  
Completion time estimation

•  MR performance model by Verma et al.¹
    –  Job profile with performance invariants
    –  Estimate upper/lower bounds of job completion

        •  N_M^J = number of map tasks
        •  N_R^J = number of reduce tasks
        •  S_M^J = number of map slots
        •  S_R^J = number of reduce slots

1. Verma et al.: ARIA: automatic resource inference and allocation for MapReduce environments (2011)
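The formula on this slide did not survive extraction, so the following is only a sketch of the kind of bound the cited model uses, not a transcription of the slide. It assumes the standard makespan bounds for greedy assignment of n tasks to k slots (at least n·avg/k, at most (n−1)·avg/k + max) and, as a simplification, sums a map stage and a reduce stage with the shuffle folded into the reduce profile. The names `StageProfile`, `stage_bounds` and `job_bounds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageProfile:
    """Per-task duration statistics for one stage, in seconds (hypothetical shape)."""
    avg: float
    max: float


def stage_bounds(n_tasks, n_slots, profile):
    """Lower and upper makespan bounds for n_tasks greedily assigned to n_slots."""
    if n_tasks == 0:
        return 0.0, 0.0
    lower = n_tasks * profile.avg / n_slots
    upper = (n_tasks - 1) * profile.avg / n_slots + profile.max
    return lower, upper


def job_bounds(n_map, n_reduce, s_map, s_reduce, map_profile, reduce_profile):
    """Sum the stage bounds; assumes the shuffle cost is part of reduce_profile."""
    m_lo, m_up = stage_bounds(n_map, s_map, map_profile)
    r_lo, r_up = stage_bounds(n_reduce, s_reduce, reduce_profile)
    return m_lo + r_lo, m_up + r_up
```

Here `n_map`, `n_reduce`, `s_map` and `s_reduce` play the roles of N_M^J, N_R^J, S_M^J and S_R^J. Because the slot counts depend on how many nodes are allocated, evaluating `job_bounds` for different allocations gives a runtime estimate per candidate node count.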
  
Algorithm

[Figure: adaptor algorithm listing, not preserved in the text extraction]
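Since the listing itself is not available here, the sketch below shows one plausible way an adaptor could meet the stated goals: try each candidate node count, estimate the runtime, ask the RMS when a job of that shape could start, and keep the request that finishes earliest. This is an illustration under those assumptions, not the authors' algorithm. `choose_allocation`, `estimate_runtime` and `earliest_start` are hypothetical names; the two callbacks stand in for the performance model and for a query against the RMS schedule.

```python
from typing import Callable, Tuple


def choose_allocation(
    max_nodes: int,
    estimate_runtime: Callable[[int], float],
    earliest_start: Callable[[int, float], float],
) -> Tuple[int, float, float]:
    """Return (nodes, walltime, finish_time) with the earliest estimated finish."""
    if max_nodes < 1:
        raise ValueError("max_nodes must be at least 1")
    best = None
    for nodes in range(1, max_nodes + 1):
        walltime = estimate_runtime(nodes)              # from the job profile
        finish = earliest_start(nodes, walltime) + walltime
        if best is None or finish < best[2]:            # ties keep fewer nodes
            best = (nodes, walltime, finish)
    return best


# Toy stand-ins (assumptions): perfectly divisible work of 1200 node-seconds,
# and a cluster where at most 4 nodes are free before t = 600 s.
def toy_runtime(nodes: int) -> float:
    return 1200 / nodes


def toy_start(nodes: int, walltime: float) -> float:
    return 0.0 if nodes <= 4 else 600.0
```

With the toy functions, `choose_allocation(8, toy_runtime, toy_start)` picks 4 nodes: asking for more would shorten the run but force the job to wait for busy nodes, which is the trade-off between job shape and unused resources described in the adaptor's goals.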
  
Evaluation

•  Simulated environment (using the SimGrid toolkit)
    –  Cluster composed of 128 nodes with 2 cores each
    –  RMS based on the Conservative Backfilling (CBF) algorithm
    –  Stream of job submissions
•  HPC workload
    –  Synthetic workload based on the model by Lublin et al.¹
    –  Real-world HPC traces from the Parallel Workloads Archive (SDSC SP2)
•  MR workload
    –  Synthetic workload derived from the Facebook workloads described by Zaharia et al.²

1. Lublin et al.: The workload on parallel supercomputers: Modeling the characteristics of rigid jobs (2003)
2. Zaharia et al.: Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling (2010)
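For readers unfamiliar with conservative backfilling, its core step can be sketched as follows: every queued job holds a reservation, and a new job is placed at the earliest time where enough nodes are free for its whole duration, without moving any existing reservation. This is a simplified illustration, not the simulator used in the evaluation; the function names and the `(start, end, nodes)` reservation tuples are assumptions.

```python
def used_at(reservations, t):
    """Nodes in use at time t; each reservation occupies [start, end)."""
    return sum(n for start, end, n in reservations if start <= t < end)


def earliest_start(reservations, total_nodes, nodes, duration, now=0.0):
    """Earliest time >= now at which `nodes` are free for `duration`.

    Existing reservations are never delayed, which is what makes the
    backfilling conservative.
    """
    if nodes > total_nodes:
        raise ValueError("request exceeds cluster size")
    # Free capacity only grows when a reservation ends, so the earliest
    # feasible start is either `now` or one of the reservation end times.
    candidates = sorted({now} | {end for _, end, _ in reservations if end > now})
    for t in candidates:
        # Usage inside the window can only rise at reservation start times.
        checkpoints = [t] + [s for s, _, _ in reservations if t < s < t + duration]
        if all(used_at(reservations, c) + nodes <= total_nodes for c in checkpoints):
            return t
    raise RuntimeError("unreachable: the cluster is empty after the last reservation")
```

For example, on an 8-node cluster with reservations `(0, 100, 4)` and `(50, 200, 4)`, a 2-node job lasting 40 time units can start at 0 because it finishes before the second reservation begins, while the same job lasting 60 units has to wait until 100.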
  
Turnaround Time and System Utilization

•  Workload:
    –  HPC: "peak hour" of Lublin's model
    –  MR: hour of Facebook-like job submissions

[Figure: turnaround time and system utilization charts, annotated with ≈ 40% and ≈ 15%]

•  The adaptor obtained shorter turnaround times and better cluster utilization in all cases
    –  MR-only: turnaround was reduced by ≈ 40%
    –  HPC+MR: overall turnaround was reduced by ≈ 15%
    –  HPC+MR: turnaround of MR jobs was reduced by ≈ 73%
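As a reminder of what these percentages measure, the sketch below computes the average turnaround time (finish time minus submission time) and the relative reduction of one scheduler against a baseline. It assumes the usual definition of turnaround; the function names are hypothetical and no numbers from the evaluation are reproduced.

```python
def mean_turnaround(jobs):
    """jobs: list of (submit_time, finish_time) pairs in the same time unit."""
    times = [finish - submit for submit, finish in jobs]
    return sum(times) / len(times)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline` (0.4 means 40%)."""
    return (baseline - improved) / baseline
```

A reduction of ≈ 40% therefore means that the adaptor's mean turnaround is about 0.6 times that of the baseline it is compared with on the same workload.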
  
Influence of the Job Size

•  Shorter turnaround regardless of the job size
•  Better results for bins with smaller jobs

Job sizes in Facebook workload (based on Zaharia et al.)

Bin   # Map Tasks   # Reduce Tasks   % Jobs at Facebook
1     1             0                39%
2     2             0                16%
3     10            3                14%
4     50            0                9%
5     100           0                6%
6     200           50               6%
7     400           0                4%
8     800           180              4%
9     2400          0                3%

[Figure: average turnaround time (minutes) per bin, Naive vs. Adaptor]

Influence of System Load

[Figure: two panels comparing the Adaptor and Naive algorithms. Left: average turnaround time (minutes) vs. mean HPC job inter-arrival time (seconds). Right: average turnaround time (minutes) vs. mean MR job inter-arrival time (seconds).]
  
Real-world Workload

•  Workload:
    –  HPC: a day-long trace from SDSC SP2
    –  MR: 1000 Facebook-like MR jobs

[Figure: results chart, annotated with ≈ 54% and ≈ 80%]

•  The adaptor's algorithm performed better in all cases
  
Conclusion

•  Although MR has gained the attention of the HPC community, there is still a question of how to run MR jobs along with regular HPC jobs in an HPC cluster
•  MR Job Adaptor
    –  Allows transparent MR job submission on HPC clusters
    –  Minimizes the average turnaround time
    –  Improves the overall utilization by exploiting unused resources in the cluster
  
Thank you!
  

Contenu connexe

Tendances

MapReduce Using Perl and Gearman
MapReduce Using Perl and GearmanMapReduce Using Perl and Gearman
MapReduce Using Perl and GearmanJamie Pitts
 
Bft mr-clouds-of-clouds-discco2012 - navtalk
Bft mr-clouds-of-clouds-discco2012 - navtalkBft mr-clouds-of-clouds-discco2012 - navtalk
Bft mr-clouds-of-clouds-discco2012 - navtalkPedro (A. R. S.) Costa
 
Semi-supervised concept detection by learning the structure of similarity graphs
Semi-supervised concept detection by learning the structure of similarity graphsSemi-supervised concept detection by learning the structure of similarity graphs
Semi-supervised concept detection by learning the structure of similarity graphsSymeon Papadopoulos
 
Application of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingApplication of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingMohammad Mustaqeem
 
An Introduction to Hadoop
An Introduction to HadoopAn Introduction to Hadoop
An Introduction to HadoopDan Harvey
 
Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...
Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...
Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...IRJET Journal
 
Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...
Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...
Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...Rosdiadee Nordin
 
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...Fisnik Kraja
 

Tendances (9)

MapReduce Using Perl and Gearman
MapReduce Using Perl and GearmanMapReduce Using Perl and Gearman
MapReduce Using Perl and Gearman
 
Bft mr-clouds-of-clouds-discco2012 - navtalk
Bft mr-clouds-of-clouds-discco2012 - navtalkBft mr-clouds-of-clouds-discco2012 - navtalk
Bft mr-clouds-of-clouds-discco2012 - navtalk
 
Semi-supervised concept detection by learning the structure of similarity graphs
Semi-supervised concept detection by learning the structure of similarity graphsSemi-supervised concept detection by learning the structure of similarity graphs
Semi-supervised concept detection by learning the structure of similarity graphs
 
Application of MapReduce in Cloud Computing
Application of MapReduce in Cloud ComputingApplication of MapReduce in Cloud Computing
Application of MapReduce in Cloud Computing
 
An Introduction to Hadoop
An Introduction to HadoopAn Introduction to Hadoop
An Introduction to Hadoop
 
Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...
Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...
Novel Scheduling Algorithms for Efficient Deployment of Map Reduce Applicatio...
 
Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...
Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...
Impact of Spatial Correlation towards the Performance of MIMO Downlink Transm...
 
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...
 
50a volumes
50a volumes50a volumes
50a volumes
 

En vedette

Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsDavid Gleich
 
Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Milind Bhandarkar
 
Frequent Itemset Mining on BigData
Frequent Itemset Mining on BigDataFrequent Itemset Mining on BigData
Frequent Itemset Mining on BigDataRaju Gupta
 
AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...
AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...
AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...Chris Whelan
 
Functional programming
Functional programmingFunctional programming
Functional programmingedusmildo
 
Determining the k in k-means with MapReduce
Determining the k in k-means with MapReduceDetermining the k in k-means with MapReduce
Determining the k in k-means with MapReduceThibault Debatty
 
Graphs in the Database: Rdbms In The Social Networks Age
Graphs in the Database: Rdbms In The Social Networks AgeGraphs in the Database: Rdbms In The Social Networks Age
Graphs in the Database: Rdbms In The Social Networks AgeLorenzo Alberton
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learningjoshwills
 

En vedette (9)

Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
 
Graphs
GraphsGraphs
Graphs
 
Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011
 
Frequent Itemset Mining on BigData
Frequent Itemset Mining on BigDataFrequent Itemset Mining on BigData
Frequent Itemset Mining on BigData
 
AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...
AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...
AMP Lab presentation -- Cloudbreak: A MapReduce Algorithm for Detecting Genom...
 
Functional programming
Functional programmingFunctional programming
Functional programming
 
Determining the k in k-means with MapReduce
Determining the k in k-means with MapReduceDetermining the k in k-means with MapReduce
Determining the k in k-means with MapReduce
 
Graphs in the Database: Rdbms In The Social Networks Age
Graphs in the Database: Rdbms In The Social Networks AgeGraphs in the Database: Rdbms In The Social Networks Age
Graphs in the Database: Rdbms In The Social Networks Age
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learning
 

Similaire à Scheduling MapReduce Jobs in HPC Clusters

On Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsOn Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsYu Liu
 
Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014
Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014
Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014cdmaxime
 
Hanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221aHanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221aSchubert Zhang
 
Big Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfBig Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfWasyihunSema2
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5RojaT4
 
Apache Spark - San Diego Big Data Meetup Jan 14th 2015
Apache Spark - San Diego Big Data Meetup Jan 14th 2015Apache Spark - San Diego Big Data Meetup Jan 14th 2015
Apache Spark - San Diego Big Data Meetup Jan 14th 2015cdmaxime
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopHortonworks
 
Scientific Applications of The Data Distribution Service
Scientific Applications of The Data Distribution ServiceScientific Applications of The Data Distribution Service
Scientific Applications of The Data Distribution ServiceAngelo Corsaro
 
PEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCPEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCHimanshu Bedi
 
Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Lu Wei
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce ParadigmDilip Reddy
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce ParadigmDilip Reddy
 
High Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of ViewHigh Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of Viewaragozin
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfTSANKARARAO
 

Similaire à Scheduling MapReduce Jobs in HPC Clusters (20)

On Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsOn Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and Experiments
 
MapReduce basics
MapReduce basicsMapReduce basics
MapReduce basics
 
Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014
Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014
Apache Spark - Santa Barbara Scala Meetup Dec 18th 2014
 
Hanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221aHanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221a
 
Hadoop map reduce v2
Hadoop map reduce v2Hadoop map reduce v2
Hadoop map reduce v2
 
Yarn
YarnYarn
Yarn
 
MapReduce
MapReduceMapReduce
MapReduce
 
Big Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfBig Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdf
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5
 
Hadoop
HadoopHadoop
Hadoop
 
Apache Spark - San Diego Big Data Meetup Jan 14th 2015
Apache Spark - San Diego Big Data Meetup Jan 14th 2015Apache Spark - San Diego Big Data Meetup Jan 14th 2015
Apache Spark - San Diego Big Data Meetup Jan 14th 2015
 
Apache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with HadoopApache Hadoop YARN - The Future of Data Processing with Hadoop
Apache Hadoop YARN - The Future of Data Processing with Hadoop
 
Scientific Applications of The Data Distribution Service
Scientific Applications of The Data Distribution ServiceScientific Applications of The Data Distribution Service
Scientific Applications of The Data Distribution Service
 
PEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCPEARC 17: Spark On the ARC
PEARC 17: Spark On the ARC
 
Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
High Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of ViewHigh Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of View
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
 
Big Data.pptx
Big Data.pptxBig Data.pptx
Big Data.pptx
 

Scheduling MapReduce Jobs in HPC Clusters

  • 1. Scheduling MapReduce Jobs in HPC Clusters
      Marcelo Neves, Tiago Ferreto, Cesar De Rose
      marcelo.neves@acad.pucrs.br
      Faculty of Informatics, PUCRS, Porto Alegre, Brazil
      August 30, 2012
  • 2. Outline
      •  Introduction
      •  HPC Clusters and MapReduce
      •  MapReduce Job Adaptor
      •  Evaluation
      •  Conclusion
  • 3. Introduction
      •  MapReduce (MR)
          –  A parallel programming model
          –  Simplicity, efficiency and high scalability
          –  It has become a de facto standard for large-scale data analysis
      •  MR has also attracted the attention of the HPC community
          –  Simpler approach to address the parallelization problem
          –  Highly visible cases where MR has been successfully used by companies like Google, Facebook and Yahoo!
  • 4. HPC Clusters and MapReduce
      •  HPC Clusters
          –  Shared among multiple users/organizations
          –  Resource Management System (RMS), such as PBS/Torque
          –  Applications are submitted as batch jobs
          –  Users have to explicitly allocate the resources, specifying the number of nodes and amount of time
      •  MR Implementations (e.g. Hadoop)
          –  Have their own complete job management system
          –  Users do not have to explicitly allocate resources
          –  Require a dedicated cluster
  • 5. Problem
      •  Two distinct clusters are required
      How to run MapReduce jobs in an existing HPC cluster along with regular HPC jobs?
  • 6. Current solutions
      •  Hadoop on Demand (HOD) and MyHadoop
          –  Create on-demand MR installations as RMS jobs
          –  Not transparent: users must still specify the number of nodes and amount of time to be allocated
      •  Mesos
          –  Shares a cluster between multiple different frameworks
          –  Creates another level of resource management
          –  Management is taken away from the cluster's RMS
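  • 7. MapReduce Job Adaptor
      [Figure: an HPC user submits an HPC job (# of nodes, time) directly to the Resource Management System. An MR user submits an MR job (# of map tasks, # of reduce tasks, job profile) to the MR Job Adaptor, which passes it to the Resource Management System as an MR job (# of nodes, time). The Resource Management System runs both kinds of job on the cluster.]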
  • 8. MapReduce Job Adaptor
      •  The adaptor has three main goals:
          –  Facilitate the execution of MR jobs in HPC clusters
          –  Minimize the average turnaround time of the jobs
          –  Exploit unused resources in the cluster (the result of the various shapes of HPC job requests)
  • 9. Completion time estimation
      •  MR performance model by Verma et al. [1]
          –  Job profile with performance invariants
          –  Estimates upper and lower bounds on job completion time
      •  $N_M^J$ = number of map tasks
      •  $N_R^J$ = number of reduce tasks
      •  $S_M^J$ = number of map slots
      •  $S_R^J$ = number of reduce slots
      [1] Verma et al.: ARIA: automatic resource inference and allocation for MapReduce environments (2011)
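      The slide names the model but its formula did not survive in the transcript. As a rough illustration only, the sketch below uses the makespan bounds that the ARIA paper builds on: for n tasks greedily assigned to k slots, with average task duration avg and maximum duration max, the stage takes at least n·avg/k and at most (n−1)·avg/k + max. The names `StageProfile`, `stage_bounds` and `job_bounds` are hypothetical, and the shuffle phase is assumed to be folded into the reduce profile, which is a simplification of the full model.

```python
from dataclasses import dataclass


@dataclass
class StageProfile:
    """Average and maximum task duration (seconds) for one stage (hypothetical type)."""
    avg: float
    max: float


def stage_bounds(num_tasks: int, num_slots: int, profile: StageProfile) -> tuple[float, float]:
    """Lower and upper makespan bounds for tasks greedily assigned to slots."""
    if num_tasks <= 0:
        return 0.0, 0.0  # a stage with no tasks takes no time
    if num_slots <= 0:
        raise ValueError("num_slots must be positive when there are tasks to run")
    lower = num_tasks * profile.avg / num_slots
    upper = (num_tasks - 1) * profile.avg / num_slots + profile.max
    return lower, upper


def job_bounds(n_map: int, n_reduce: int, s_map: int, s_reduce: int,
               map_profile: StageProfile, reduce_profile: StageProfile) -> tuple[float, float]:
    """Sum the map and reduce stage bounds (assumption: shuffle is part of reduce)."""
    m_lo, m_up = stage_bounds(n_map, s_map, map_profile)
    r_lo, r_up = stage_bounds(n_reduce, s_reduce, reduce_profile)
    return m_lo + r_lo, m_up + r_up


if __name__ == "__main__":
    low, up = job_bounds(10, 4, 2, 2, StageProfile(4.0, 6.0), StageProfile(5.0, 8.0))
    print(f"estimated completion time between {low:.1f}s and {up:.1f}s")
```

      An adaptor could, for instance, request the upper bound as the job's time limit, although the transcript does not say which value the authors actually use.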
  • 10. Algorithm
  • 11. Evaluation
      •  Simulated environment (using the SimGrid toolkit)
          –  Cluster composed of 128 nodes with 2 cores each
          –  RMS based on the Conservative Backfilling (CBF) algorithm
          –  Stream of job submissions
      •  HPC workload
          –  Synthetic workload based on the model by Lublin et al. [1]
          –  Real-world HPC traces from the Parallel Workloads Archive (SDSC SP2)
      •  MR workload
          –  Synthetic workload derived from Facebook workloads described by Zaharia et al. [2]
      [1] Lublin et al.: The workload on parallel supercomputers: Modeling the characteristics of rigid jobs (2003)
      [2] Zaharia et al.: Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling (2010)
  • 12. Turnaround Time and System Utilization
      •  Workload:
          –  HPC: "peak hour" of Lublin's model
          –  MR: an hour of Facebook-like job submissions
      •  The adaptor obtained shorter turnaround times and better cluster utilization in all cases
          –  MR-only: turnaround was reduced by ≈ 40%
          –  HPC+MR: overall turnaround was reduced by ≈ 15%
          –  HPC+MR: turnaround of MR jobs was reduced by ≈ 73%
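      The slides do not define the two metrics, so the sketch below assumes the usual definitions: turnaround time is completion time minus submission time, and utilization is the node-time actually used divided by the node-time available over the observed span. `JobRecord`, `average_turnaround` and `utilization` are hypothetical names, not taken from the authors' simulator.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """One finished job in a trace (hypothetical record layout, times in seconds)."""
    submit: float
    start: float
    end: float
    nodes: int


def average_turnaround(jobs: list[JobRecord]) -> float:
    """Mean of (completion time - submission time) over all jobs."""
    return sum(j.end - j.submit for j in jobs) / len(jobs)


def utilization(jobs: list[JobRecord], total_nodes: int) -> float:
    """Fraction of available node-time used between first submission and last completion."""
    span = max(j.end for j in jobs) - min(j.submit for j in jobs)
    busy = sum((j.end - j.start) * j.nodes for j in jobs)
    return busy / (span * total_nodes)


if __name__ == "__main__":
    trace = [JobRecord(0, 0, 10, 2), JobRecord(0, 10, 20, 4)]
    print(average_turnaround(trace), utilization(trace, total_nodes=4))
```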
  • 13. Influence of the Job Size
      •  Shorter turnaround regardless of job size
      •  Better results for bins with smaller jobs
      [Figure: average turnaround time (minutes) per bin (1 to 9), Naive vs. Adaptor]
      Job sizes in the Facebook workload (based on Zaharia et al.):
      Bin   # Map Tasks   # Reduce Tasks   % Jobs at Facebook
      1     1             0                39%
      2     2             0                16%
      3     10            3                14%
      4     50            0                9%
      5     100           0                6%
      6     200           50               6%
      7     400           0                4%
      8     800           180              4%
      9     2400          0                3%
  • 14. Influence of System Load
      [Figure: average turnaround time (minutes) versus mean HPC job inter-arrival time (seconds) and versus mean MR job inter-arrival time (seconds), comparing the Naive and Adaptor algorithms]
  • 15. Real-world Workload
      •  Workload:
          –  HPC: a day-long trace from SDSC SP2
          –  MR: 1000 Facebook-like MR jobs
      [Figure annotations: ≈ 54% and ≈ 80%]
      •  The adaptor's algorithm performed better in all cases
  • 16. Conclusion
      •  Although MR has gained attention from the HPC community, there is still the question of how to run MR jobs along with regular HPC jobs in an HPC cluster
      •  MR Job Adaptor
          –  Allows transparent MR job submission on HPC clusters
          –  Minimizes the average turnaround time
          –  Improves the overall utilization by exploiting unused resources in the cluster