SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
A Game-Theoretic Approach for Runtime
Capacity Allocation in MapReduce
Eugenio Gianniti *, Danilo Ardagna *, Michele Ciavotta *,
Mauro Passacantando * *
*Politecnico di Milano
**Università di Pisa
POLITECNICO DI MILANOSlide 1
Goals
POLITECNICO DI MILANO
Cost-effective deployment configuration for MapReduce systems hosted on
private Clouds
Efficient solution algorithm to support run-time cluster management
Distributed approach leveraging on local knowledge of problem parameters
Slide 2
Reference System
POLITECNICO DI MILANO
Hadoop YARN
Slide 3
MM
R
R
M
Preliminary Formulation
POLITECNICO DI MILANO
min
r,h,sM ,sR
NX
i=1
¯⇢ri +
NX
i=1
Pi (hi)
NX
i=1
ri  R
Hlow
i  hi  Hup
i , 8i 2 A
Aihi
sM
i
+
Bihi
sR
i
+ Ei  0, 8i 2 A
sM
i
cM
i
+
sR
i
cR
i
 ri, 8i 2 A
ri 2 N0, 8i 2 A
sM
i 2 N0, 8i 2 A
sR
i 2 N0, 8i 2 A
hi 2 N0, 8i 2 A
subject to:
Integer Nonlinear Problem
Non-convex constraints
DECISION VARIABLES
hi — concurrency level
ri — virtual machines
si
M — Map slots
si
R — Reduce slots
Continuous relaxation
hi à ( 𝜓i)-1
Slide 4
Centralized Problem
POLITECNICO DI MILANO
Continuous Nonlinear Problem
Need to centralize information
Convex constraints
KKT conditions are necessary and
sufficient for optimality
DECISION VARIABLES
𝜓i — reciprocal concurrency level
ri — virtual machines
si
M — Map slots
si
R — Reduce slots
Slide 5
min
r, ,sM ,sR
NX
i=1
¯⇢ri +
NX
i=1
(↵i i i)
subject to:
NX
i=1
ri  R
low
i  i  up
i , 8i 2 A
Ai
sM
i i
+
Bi
sR
i i
+ Ei  0, 8i 2 A
sM
i
cM
i
+
sR
i
cR
i
 ri, 8i 2 A
ri 2 R+, 8i 2 A
sM
i 2 R+, 8i 2 A
sR
i 2 R+, 8i 2 A
i 2 R+, 8i 2 A
Optimal Configuration Formulae
POLITECNICO DI MILANO
In the optimal configuration it holds:
sM
i =
cM
i
1 +
r
Bi
Ai
cM
i
cR
i
ri, 8i 2 A
sR
i =
cR
i
1 +
r
Ai
Bi
cR
i
cM
i
ri, 8i 2 A
i =
⇣q
Ai
cM
i
+
q
Bi
cR
i
⌘2
Ei
r 1
i , 8i 2 A
PROPOSITION
rlow
i =
⇣q
Ai
cM
i
+
q
Bi
cR
i
⌘2
Ei
up
i
, 8i 2 A
rup
i =
⇣q
Ai
cM
i
+
q
Bi
cR
i
⌘2
Ei
low
i
, 8i 2 A
Exact bounds on resources
requirement
Full knowledge of the
problem parameters
Slide 6
¯⇢  ⇢a
i  ⇢up
i
low
i  i  up
i
Ai
sM
i i
+
Bi
sR
i i
+ Ei  0
sM
i
cM
i
+
sR
i
cR
i
 ri
sM
i 2 R+
sR
i 2 R+
i 2 R+
⇢a
i 2 R+
min
i,⇢a
i ,sM
i ,sR
i
↵i i i
subject to:
Application Masters Problems
POLITECNICO DI MILANO
One instance per AM
Continuous Nonlinear Problem
Only application-specific
information
DECISION VARIABLES
𝜓i — reciprocal concurrency level
𝜌i
a — bid for VMs
si
M — Map slots
si
R — Reduce slots
Slide 7
max
r,y,⇢
NX
i=1
(⇢ ¯⇢) ˜ri
NX
i=1
pi (rup
i ri)
subject to:
NX
i=1
ri  R
ri rlow
i , 8i 2 A
ri  rup
i rlow
i yi + rlow
i , 8i 2 A
¯⇢  ⇢  ˜⇢
⇢ ⇢a
i  M (1 yi) , 8i 2 A
⇢a
i ⇢  Myi, 8i 2 A
ri 2 R+, 8i 2 A
yi 2 {0, 1}, 8i 2 A
⇢ 2 R+
Resource Manager Problem
POLITECNICO DI MILANO
DECISION VARIABLES
𝜌 — price of VMs
ri — virtual machines
yi — AM i offers more than price
One instance for the whole cluster
Mixed Integer Nonlinear Problem
Takes care of resource
management only
Two alternatives:
and
˜ri = ri
˜ri = ri rlow
i
Slide 8
Generalized Nash Equilibrium Problems
POLITECNICO DI MILANO
Not jointly convex
Lack of theoretical guarantees
NEP
JC-NEP
GNEP
Slide 9
A set of players, N, each with utility function Θi
Solution concept:
¯x 2 X is equilibrium , 8i 2 N, 8xi 2 Xi (¯x i) , ⇥i (¯x)  ⇥i (xi, ¯x i)
Feasible set of player i: Xi = Xi (x i)
Optimal Configuration Formulae
POLITECNICO DI MILANO
sM
i =
cM
i
1 +
r
Bi
Ai
cM
i
cR
i
ri
sR
i =
cR
i
1 +
r
Ai
Bi
cR
i
cM
i
ri
i =
⇣q
Ai
cM
i
+
q
Bi
cR
i
⌘2
Ei
r 1
i
The optimal configuration for every AM, given an amount of resources ri, is given
by the following relations:
PROPOSITION
Each AM problem reduces to a
quick algebraic update
Local knowledge of application-
specific parameters
Slide 10
Best Reply Algorithm
POLITECNICO DI MILANO
Each iteration is performed in
parallel by the RM and AMs
Continuous equilibrium
configuration
1: ri rlow
i , 8i 2 A
2: sM
i sM, low
i , 8i 2 A
3: sR
i sR, low
i , 8i 2 A
4: i
up
i , 8i 2 A
5: ⇢a
i ¯⇢, 8i 2 A
6: repeat
7: rold
i ri, 8i 2 A
8: solve RM problem
9: for all i 2 A do
10: solve AM i problem
11: if i > low
i then
12: ⇢a
i max {⇢a
i , ⇢} + ⇢up
i
13: end if
14: end for
15: "
PN
i=1
|ri rold
i |
rold
i
16: until " < ¯"
Slide 11
Experimental Overview
POLITECNICO DI MILANO
PRELIMINARY ANALYSIS
Ensure the model behavior is consistent with intuition when applied to
simple problems
SCALABILITY ANALYSIS
Verify the feasibility of solving problem instances at a realistic scale in
production environments
STOPPING CRITERION TOLERANCE ANALYSIS
Check the sensitivity of the distributed algorithm with respect to the
tolerance on the relative increment
VALIDATION WITH YARN SLS
Compare model solutions and timings measured on the official simulator
Slide 12
Preliminary Analysis
POLITECNICO DI MILANO
100 AMs 1,000 AMs
˜ri = ri rlow
iDecreasing cluster capacity experiment,
Slide 13
Alternative Virtual Gain Terms
POLITECNICO DI MILANO
˜ri = ri rlow
i ˜ri = ri
Increasing concurrency level experiment, 10 AMs
Slide 14
Scalability Analysis
POLITECNICO DI MILANO
Execution time [s]
˜ri = ri rlow
i ˜ri = ri
Slide 15
Stopping Criterion Tolerance Analysis
POLITECNICO DI MILANO
Relative error with respect to centralized solutions
Slide 16
Validation with YARN SLS
POLITECNICO DI MILANO
Relative error absolute values average: 16.999 %
Users R Di [s] S [s] ⌘ [%]
(20, 10, 10) 256 3090 2487.54 24.2191
(20, 10, 10) 512 1694 1142.3 48.2977
(20, 10, 20) 256 3380 2948.78 14.6237
(20, 10, 20) 512 1835 1299.59 41.1987
(20, 15, 10) 256 3390 2849.85 18.9536
(20, 15, 10) 512 1845 1409.88 30.8625
(20, 20, 10) 256 3700 3339.28 10.8023
(20, 20, 10) 512 1995 1512 31.9444
(20, 20, 15) 256 3845 3694.03 4.08677
(20, 20, 15) 512 2063 1745.9 18.1626
(20, 20, 20) 256 3987 4041.44 -1.34704
(20, 20, 20) 512 2140 1877.4 13.9874
(25, 15, 25) 256 4300 3893.71 10.4346
(25, 15, 25) 512 2290 1773.53 29.1213
(25, 25, 25) 512 2597 2653.64 -2.13455
Users R Di [s] S [s] ⌘ [%]
(10, 15, 20) 256 2740 2767.58 -0.996658
(10, 15, 20) 512 1508 1447.12 4.20698
(10, 20, 15) 256 2900 2708.3 7.07837
(10, 20, 15) 512 1589 1379.58 15.18
(15, 10, 10) 256 2575 2257.81 14.0487
(15, 10, 10) 512 1456 980.087 48.5583
(15, 15, 15) 256 3070 3183.02 -3.55061
(15, 15, 15) 512 1678 1456.93 15.174
(15, 15, 20) 256 3210 3088.88 3.92116
(15, 15, 20) 512 1748 1547.92 12.9255
(15, 20, 10) 256 3220 3070.87 4.85639
(15, 20, 10) 512 1755 1328.58 32.0956
(15, 20, 20) 256 3512 3951.58 -11.1242
(15, 20, 20) 512 1899 1574.1 20.6404
(15, 25, 10) 256 3520 3276.66 7.42646
(15, 25, 10) 512 1906 1524.83 24.9975
Slide 17
Conclusions and Future Work
POLITECNICO DI MILANO
The distributed approach yields accurate approximations of optimal
solutions, even without theoretical guarantees
Extend the formulation to Apache Tez and Spark (based on DAGs)
Couple the proposed algorithm with a local search method based on Colored
Petri Nets simulations
Slide 18
Thanks for your attention…
POLITECNICO DI MILANO
…any questions?
Slide 19
Solution Rounding Algorithm
POLITECNICO DI MILANO
1: sort A according to increasing ↵i
2: ri dˆrie , 8i 2 A
3: for all j 2 A do
4: if
PN
i=1 ri > R then
5: rj rj 1
6: end if
7: end for
8: sM
i
⌃
ˆsM
i
⌥
, 8i 2 A
9: sR
i
⌃
ˆsR
i
⌥
, 8i 2 A
10: for all j 2 A do
11: while sM
j /cM
j + sR
j /cR
j > rj do
12: sR
j sR
j 1
13: if sM
j /cM
j + sR
j /cR
j > rj then
14: sM
j sM
j 1
15: end if
16: end while
17: end for
The RM runs an O(N) loop
Each AM runs an O(1) loop,
concurrently
Sorting is O(N log N), but can be
done once and cached
Slide 20
Preliminary Analysis
POLITECNICO DI MILANO
˜ri = ri rlow
iDecreasing deadlines experiment,
100 AMs 1,000 AMs
Slide 21
Scalability Analysis
POLITECNICO DI MILANO
Total cost and penalties
˜ri = ri rlow
i ˜ri = ri
Slide 22
Obtained Concurrency Levels
POLITECNICO DI MILANO
Centralized model Closed form model
Decreasing cluster capacity experiment
Slide 23

Contenu connexe

Tendances

Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016
Daniel Omunting
 
Lesson 1 Feb 10 2010
Lesson 1 Feb 10 2010Lesson 1 Feb 10 2010
Lesson 1 Feb 10 2010
ingroy
 
April 1, 2014
April 1, 2014April 1, 2014
April 1, 2014
khyps13
 

Tendances (12)

4.5 sec and csc worked 3rd
4.5   sec and csc worked 3rd4.5   sec and csc worked 3rd
4.5 sec and csc worked 3rd
 
Time series analysis 101
Time series analysis 101Time series analysis 101
Time series analysis 101
 
Intel 8085 largest number in a data array
Intel   8085  largest number in a data arrayIntel   8085  largest number in a data array
Intel 8085 largest number in a data array
 
Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016
 
Lesson 1 Feb 10 2010
Lesson 1 Feb 10 2010Lesson 1 Feb 10 2010
Lesson 1 Feb 10 2010
 
Đề Thi HK2 Toán 6 - THCS Cầu Kiều
Đề Thi HK2 Toán 6 - THCS Cầu KiềuĐề Thi HK2 Toán 6 - THCS Cầu Kiều
Đề Thi HK2 Toán 6 - THCS Cầu Kiều
 
Intel 8085 - Smallest number in a data array
Intel 8085 -  Smallest number in a data arrayIntel 8085 -  Smallest number in a data array
Intel 8085 - Smallest number in a data array
 
Tall-and-skinny Matrix Computations in MapReduce (ICME colloquium)
Tall-and-skinny Matrix Computations in MapReduce (ICME colloquium)Tall-and-skinny Matrix Computations in MapReduce (ICME colloquium)
Tall-and-skinny Matrix Computations in MapReduce (ICME colloquium)
 
Javier dominguez 20800945 actividad 1_estructuras discretas
Javier dominguez 20800945 actividad 1_estructuras discretasJavier dominguez 20800945 actividad 1_estructuras discretas
Javier dominguez 20800945 actividad 1_estructuras discretas
 
April 1, 2014
April 1, 2014April 1, 2014
April 1, 2014
 
Invited presentation at SIAM PP 18: Communication Hiding Through Pipelining i...
Invited presentation at SIAM PP 18: Communication Hiding Through Pipelining i...Invited presentation at SIAM PP 18: Communication Hiding Through Pipelining i...
Invited presentation at SIAM PP 18: Communication Hiding Through Pipelining i...
 
Figures
FiguresFigures
Figures
 

Similaire à A game theoretic approach for runtime capacity allocation in map-reduce (WACC2017)

Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Riccardo Tommasini
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
grssieee
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
grssieee
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
grssieee
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
grssieee
 
Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...
Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...
Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...
Hudhaib Al-Allatti
 

Similaire à A game theoretic approach for runtime capacity allocation in map-reduce (WACC2017) (20)

An accurate retrieval through R-MAC+ descriptors for landmark recognition
An accurate retrieval through R-MAC+ descriptors for landmark recognitionAn accurate retrieval through R-MAC+ descriptors for landmark recognition
An accurate retrieval through R-MAC+ descriptors for landmark recognition
 
Model-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical ConstraintsModel-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical Constraints
 
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
 
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 Co-Learning: Consensus-based Learning for Multi-Agent Systems Co-Learning: Consensus-based Learning for Multi-Agent Systems
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 
SSII2018企画: センシングデバイスの多様化と空間モデリングの未来
SSII2018企画: センシングデバイスの多様化と空間モデリングの未来SSII2018企画: センシングデバイスの多様化と空間モデリングの未来
SSII2018企画: センシングデバイスの多様化と空間モデリングの未来
 
Project seminar ppt_steelcasting
Project seminar ppt_steelcastingProject seminar ppt_steelcasting
Project seminar ppt_steelcasting
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming Graphs
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
 
FV_IGARSS11.ppt
FV_IGARSS11.pptFV_IGARSS11.ppt
FV_IGARSS11.ppt
 
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applications
 
Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...
Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...
Prof. Sameh Saad - A Digital Model for 3D Characterization of Groundwater Qua...
 
Robust PID Controller Design for Non-Minimum Phase Systems using Magnitude Op...
Robust PID Controller Design for Non-Minimum Phase Systems using Magnitude Op...Robust PID Controller Design for Non-Minimum Phase Systems using Magnitude Op...
Robust PID Controller Design for Non-Minimum Phase Systems using Magnitude Op...
 
A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...
 
Hand gesture recognition using discrete wavelet transform and hidden Markov m...
Hand gesture recognition using discrete wavelet transform and hidden Markov m...Hand gesture recognition using discrete wavelet transform and hidden Markov m...
Hand gesture recognition using discrete wavelet transform and hidden Markov m...
 
Real-time and Long-time Together
Real-time and Long-time TogetherReal-time and Long-time Together
Real-time and Long-time Together
 
Qicheng yu
Qicheng yuQicheng yu
Qicheng yu
 
ICRA Nathan Piasco
ICRA Nathan PiascoICRA Nathan Piasco
ICRA Nathan Piasco
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 

A game theoretic approach for runtime capacity allocation in map-reduce (WACC2017)

  • 1. A Game-Theoretic Approach for Runtime Capacity Allocation in MapReduce Eugenio Gianniti *, Danilo Ardagna *, Michele Ciavotta *, Mauro Passacantando * * *Politecnico di Milano **Università di Pisa POLITECNICO DI MILANOSlide 1
  • 2. Goals POLITECNICO DI MILANO Cost-effective deployment configuration for MapReduce systems hosted on private Clouds Efficient solution algorithm to support run-time cluster management Distributed approach leveraging on local knowledge of problem parameters Slide 2
  • 3. Reference System POLITECNICO DI MILANO Hadoop YARN Slide 3 MM R R M
  • 4. Preliminary Formulation POLITECNICO DI MILANO min r,h,sM ,sR NX i=1 ¯⇢ri + NX i=1 Pi (hi) NX i=1 ri  R Hlow i  hi  Hup i , 8i 2 A Aihi sM i + Bihi sR i + Ei  0, 8i 2 A sM i cM i + sR i cR i  ri, 8i 2 A ri 2 N0, 8i 2 A sM i 2 N0, 8i 2 A sR i 2 N0, 8i 2 A hi 2 N0, 8i 2 A subject to: Integer Nonlinear Problem Non-convex constraints DECISION VARIABLES hi — concurrency level ri — virtual machines si M — Map slots si R — Reduce slots Continuous relaxation hi à ( 𝜓i)-1 Slide 4
  • 5. Centralized Problem POLITECNICO DI MILANO Continuous Nonlinear Problem Need to centralize information Convex constraints KKT conditions are necessary and sufficient for optimality DECISION VARIABLES 𝜓i — reciprocal concurrency level ri — virtual machines si M — Map slots si R — Reduce slots Slide 5 min r, ,sM ,sR NX i=1 ¯⇢ri + NX i=1 (↵i i i) subject to: NX i=1 ri  R low i  i  up i , 8i 2 A Ai sM i i + Bi sR i i + Ei  0, 8i 2 A sM i cM i + sR i cR i  ri, 8i 2 A ri 2 R+, 8i 2 A sM i 2 R+, 8i 2 A sR i 2 R+, 8i 2 A i 2 R+, 8i 2 A
  • 6. Optimal Configuration Formulae POLITECNICO DI MILANO In the optimal configuration it holds: sM i = cM i 1 + r Bi Ai cM i cR i ri, 8i 2 A sR i = cR i 1 + r Ai Bi cR i cM i ri, 8i 2 A i = ⇣q Ai cM i + q Bi cR i ⌘2 Ei r 1 i , 8i 2 A PROPOSITION rlow i = ⇣q Ai cM i + q Bi cR i ⌘2 Ei up i , 8i 2 A rup i = ⇣q Ai cM i + q Bi cR i ⌘2 Ei low i , 8i 2 A Exact bounds on resources requirement Full knowledge of the problem parameters Slide 6
  • 7. ¯⇢  ⇢a i  ⇢up i low i  i  up i Ai sM i i + Bi sR i i + Ei  0 sM i cM i + sR i cR i  ri sM i 2 R+ sR i 2 R+ i 2 R+ ⇢a i 2 R+ min i,⇢a i ,sM i ,sR i ↵i i i subject to: Application Masters Problems POLITECNICO DI MILANO One instance per AM Continuous Nonlinear Problem Only application-specific information DECISION VARIABLES 𝜓i — reciprocal concurrency level 𝜌i a — bid for VMs si M — Map slots si R — Reduce slots Slide 7
  • 8. max r,y,⇢ NX i=1 (⇢ ¯⇢) ˜ri NX i=1 pi (rup i ri) subject to: NX i=1 ri  R ri rlow i , 8i 2 A ri  rup i rlow i yi + rlow i , 8i 2 A ¯⇢  ⇢  ˜⇢ ⇢ ⇢a i  M (1 yi) , 8i 2 A ⇢a i ⇢  Myi, 8i 2 A ri 2 R+, 8i 2 A yi 2 {0, 1}, 8i 2 A ⇢ 2 R+ Resource Manager Problem POLITECNICO DI MILANO DECISION VARIABLES 𝜌 — price of VMs ri — virtual machines yi — AM i offers more than price One instance for the whole cluster Mixed Integer Nonlinear Problem Takes care of resource management only Two alternatives: and ˜ri = ri ˜ri = ri rlow i Slide 8
  • 9. Generalized Nash Equilibrium Problems POLITECNICO DI MILANO Not jointly convex Lack of theoretical guarantees NEP JC-NEP GNEP Slide 9 A set of players, N, each with utility function Θi Solution concept: ¯x 2 X is equilibrium , 8i 2 N, 8xi 2 Xi (¯x i) , ⇥i (¯x)  ⇥i (xi, ¯x i) Feasible set of player i: Xi = Xi (x i)
  • 10. Optimal Configuration Formulae POLITECNICO DI MILANO sM i = cM i 1 + r Bi Ai cM i cR i ri sR i = cR i 1 + r Ai Bi cR i cM i ri i = ⇣q Ai cM i + q Bi cR i ⌘2 Ei r 1 i The optimal configuration for every AM, given an amount of resources ri, is given by the following relations: PROPOSITION Each AM problem reduces to a quick algebraic update Local knowledge of application- specific parameters Slide 10
  • 11. Best Reply Algorithm POLITECNICO DI MILANO Each iteration is performed in parallel by the RM and AMs Continuous equilibrium configuration 1: ri rlow i , 8i 2 A 2: sM i sM, low i , 8i 2 A 3: sR i sR, low i , 8i 2 A 4: i up i , 8i 2 A 5: ⇢a i ¯⇢, 8i 2 A 6: repeat 7: rold i ri, 8i 2 A 8: solve RM problem 9: for all i 2 A do 10: solve AM i problem 11: if i > low i then 12: ⇢a i max {⇢a i , ⇢} + ⇢up i 13: end if 14: end for 15: " PN i=1 |ri rold i | rold i 16: until " < ¯" Slide 11
  • 12. Experimental Overview POLITECNICO DI MILANO PRELIMINARY ANALYSIS Ensure the model behavior is consistent with intuition when applied to simple problems SCALABILITY ANALYSIS Verify the feasibility of solving problem instances at a realistic scale in production environments STOPPING CRITERION TOLERANCE ANALYSIS Check the sensitivity of the distributed algorithm with respect to the tolerance on the relative increment VALIDATION WITH YARN SLS Compare model solutions and timings measured on the official simulator Slide 12
  • 13. Preliminary Analysis POLITECNICO DI MILANO 100 AMs 1,000 AMs ˜ri = ri rlow iDecreasing cluster capacity experiment, Slide 13
  • 14. Alternative Virtual Gain Terms POLITECNICO DI MILANO ˜ri = ri rlow i ˜ri = ri Increasing concurrency level experiment, 10 AMs Slide 14
  • 15. Scalability Analysis POLITECNICO DI MILANO Execution time [s] ˜ri = ri rlow i ˜ri = ri Slide 15
  • 16. Stopping Criterion Tolerance Analysis POLITECNICO DI MILANO Relative error with respect to centralized solutions Slide 16
  • 17. Validation with YARN SLS POLITECNICO DI MILANO Relative error absolute values average: 16.999 % Users R Di [s] S [s] ⌘ [%] (20, 10, 10) 256 3090 2487.54 24.2191 (20, 10, 10) 512 1694 1142.3 48.2977 (20, 10, 20) 256 3380 2948.78 14.6237 (20, 10, 20) 512 1835 1299.59 41.1987 (20, 15, 10) 256 3390 2849.85 18.9536 (20, 15, 10) 512 1845 1409.88 30.8625 (20, 20, 10) 256 3700 3339.28 10.8023 (20, 20, 10) 512 1995 1512 31.9444 (20, 20, 15) 256 3845 3694.03 4.08677 (20, 20, 15) 512 2063 1745.9 18.1626 (20, 20, 20) 256 3987 4041.44 -1.34704 (20, 20, 20) 512 2140 1877.4 13.9874 (25, 15, 25) 256 4300 3893.71 10.4346 (25, 15, 25) 512 2290 1773.53 29.1213 (25, 25, 25) 512 2597 2653.64 -2.13455 Users R Di [s] S [s] ⌘ [%] (10, 15, 20) 256 2740 2767.58 -0.996658 (10, 15, 20) 512 1508 1447.12 4.20698 (10, 20, 15) 256 2900 2708.3 7.07837 (10, 20, 15) 512 1589 1379.58 15.18 (15, 10, 10) 256 2575 2257.81 14.0487 (15, 10, 10) 512 1456 980.087 48.5583 (15, 15, 15) 256 3070 3183.02 -3.55061 (15, 15, 15) 512 1678 1456.93 15.174 (15, 15, 20) 256 3210 3088.88 3.92116 (15, 15, 20) 512 1748 1547.92 12.9255 (15, 20, 10) 256 3220 3070.87 4.85639 (15, 20, 10) 512 1755 1328.58 32.0956 (15, 20, 20) 256 3512 3951.58 -11.1242 (15, 20, 20) 512 1899 1574.1 20.6404 (15, 25, 10) 256 3520 3276.66 7.42646 (15, 25, 10) 512 1906 1524.83 24.9975 Slide 17
  • 18. Conclusions and Future Work POLITECNICO DI MILANO The distributed approach yields accurate approximations of optimal solutions, even without theoretical guarantees Extend the formulation to Apache Tez and Spark (based on DAGs) Couple the proposed algorithm with a local search method based on Colored Petri Nets simulations Slide 18
  • 19. Thanks for your attention… POLITECNICO DI MILANO …any questions? Slide 19
  • 20. Solution Rounding Algorithm POLITECNICO DI MILANO 1: sort A according to increasing ↵i 2: ri dˆrie , 8i 2 A 3: for all j 2 A do 4: if PN i=1 ri > R then 5: rj rj 1 6: end if 7: end for 8: sM i ⌃ ˆsM i ⌥ , 8i 2 A 9: sR i ⌃ ˆsR i ⌥ , 8i 2 A 10: for all j 2 A do 11: while sM j /cM j + sR j /cR j > rj do 12: sR j sR j 1 13: if sM j /cM j + sR j /cR j > rj then 14: sM j sM j 1 15: end if 16: end while 17: end for The RM runs an O(N) loop Each AM runs an O(1) loop, concurrently Sorting is O(N log N), but can be done once and cached Slide 20
  • 21. Preliminary Analysis POLITECNICO DI MILANO ˜ri = ri rlow iDecreasing deadlines experiment, 100 AMs 1,000 AMs Slide 21
  • 22. Scalability Analysis POLITECNICO DI MILANO Total cost and penalties ˜ri = ri rlow i ˜ri = ri Slide 22
  • 23. Obtained Concurrency Levels POLITECNICO DI MILANO Centralized model Closed form model Decreasing cluster capacity experiment Slide 23