SlideShare une entreprise Scribd logo
1  sur  28
Télécharger pour lire hors ligne
Approximate Dynamic Programming using
 Fluid and Diffusion Approximations
                                    with Applications to Power Management


Speaker: Dayu Huang

Wei Chen, Dayu Huang, Ankur A. Kulkarni,1Jayakrishnan Unnikrishnan, Quanyan Zhu,
Prashant Mehta, Sean Meyn, and Adam Wierman 2

 Coordinated Science Laboratory, UIUC
 Dept. of IESE, UIUC 1
 Dept. of CS, California Inst. of Tech. 2
                                                             4                                                   120



                                                             3                                                   100



                                                                                                                     80
                                                                                                                                              J
                                                             2



                                                             1                                                       60




National Science Foundation (ECS-0523620 and CCF-0830511),   0                                                       40




ITMANET DARPA RK 2006-07284, and Microsoft Research
                                                             −1                                                      20



                                                             −2                                              n       0                                                      x
                                                                  0   1   2   3   4   5   6   7   8   9   10 x 104        0   2   4   6   8   10   12   14   16   18   20
Introduction
MDP model               Control


                                  i.i.d
Cost
Minimize average cost
Introduction
MDP model               Control


                                  i.i.d
Cost
Minimize average cost

Generator
Introduction
MDP model                                Control


                                                   i.i.d
Cost
Minimize average cost

 Average Cost Optimality Equation (ACOE)




                             Generator


        Relative value function

  Solve ACOE and Find
TD Learning
 The “curse of dimensionality”:
   Complexity of solving ACOE grows exponentially with
   the dimension of the state space.

 Approximate        within a nite-dimensional function class


 Criterion: minimize the mean-squre error


                               solved by stochastic approximation algorithms
TD Learning
 The “curse of dimensionality”:
    Complexity of solving ACOE grows exponentially with
    the dimension of the state space.

 Approximate         within a nite-dimensional function class


 Criterion: minimize the mean-squre error


                                solved by stochastic approximation algorithms

Problem: How to select the basis functions                    ?

                               key to the success of TD learning
Total cost for
                                                               an associated deterministic model

    is a tight approximation to
      120



      100


                    Fluid value function
       80
                    Relative value function
       60



       40



       20



            0   2    4      6      8       10   12   14   16      18    20


can be used as a part of the basis
Related Work
Multiclass queueing network

                                   Meyn 1997, Meyn 1997b



                 optimal control   Chen and Meyn 1999

                   simulation      Hendersen et.al. 2003

             network scheduling    Veatch 2004
                    and routing    Moallemi, Kumar and Van Roy 2006

                                   Meyn 2007       Control Techniques for
                                                    Complex Networks


               other approaches     Tsitsiklis and Van Roy 1997
                                    Mannor, Menache and Shimkin 2005
Related Work
Multiclass queueing network

                                   Meyn 1997, Meyn 1997b



                 optimal control   Chen and Meyn 1999

                   simulation      Hendersen et.al. 2003

             network scheduling    Veatch 2004
                    and routing    Moallemi, Kumar and Van Roy 2006

                                   Meyn 2007       Control Techniques for
                                                    Complex Networks


               other approaches     Tsitsiklis and Van Roy 1997
                                    Mannor, Menache and Shimkin 2005
  Taylor series approximation      this work
Power Management via Speed Scaling
                                                           Bansal, Kimbrel and Pruhs 2007
                                                            Wierman, Andrew and Tang 2009

Single processor
                                          job arrivals

                             processing rate
                             determined by the current power

Control the processing speed to balance delay and energy costs



                                                         Kaxiras and Martonosi 2008
Processor design: polynomial cost                        Wierman, Andrew and Tang 2009
                                          This talk
We also consider
for wireless communication applications
Fluid Model                       MDP


Fluid model:




Total Cost



Total Cost Optimality Equation (TCOE) for the uid model:
Why Fluid Model?                   MDP




 First order Taylor series approximation
Why Fluid Model?                   MDP




 First order Taylor series approximation




  TCOE



  ACOE



         almost solves the ACOE            Simple but
                                           powerful idea!
Policy

         180

         160           Stochastic optimal policy
         140                  myopic policy
         120           Di erence
         100

          80

          60

          40

          20

           0

         −20
               0   2    4      6      8       10   12   14   16   18   20   x
Value Iteration

     250



     200



     150               Initialization:   V0     0
                       Initialization:   V0 =
     100



      50



      0
              5   10                15                  20   n




                                                    (See also [Chen and Meyn 1999])
Approximation of the Cost Function

 Error Analysis
                                constant?
Approximation of the Cost Function

 Error Analysis
                                 constant?


 Surrogate cost




                  approximates

     Bounds on           ?
Structure Results on the Fluid Solution
Lower Bound




              Convexity of
Lower Bound




              Convexity of
Upper Bound
Upper Bound
Upper Bound
Approach Based on Fluid and Di usion Models
                                                                             this talk: uid model

                                                                 Total cost for
Value function of the uid model                                  an associated deterministic model

      is a tight approximation to
        120



        100


                      Fluid value function
         80
                      Relative value function
         60



         40



         20



              0   2    4      6      8       10   12   14   16     18   20


  can be used as a part of the basis
TD Learning Experiment


         Basis functions:


4                                                     120



3                                                     100              Approximate relative value function
                                                                       Fluid value function
2                                                         80
                                                                       Relative value function

1                                                         60



0                                                         40



−1                                                        20



−2                                                        0
     0   1   2   3   4   5   6    7   8    9   10 x 104        0   2    4      6      8       10   12        14   16   18   20



 Estimates of Coe cients for the case of quadratic cost
TD Learning with Policy Improvement




   3       Average cost at stage




   2


       0        5            10     15         20            25




  Nearly optimal after just a few iterations
                                                    Need the value of the optimal policy
Conclusions

   The uid value function can be used as a part of the basis for TD-learning.


   Motivated by analysis using Taylor series expansion:
     The uid value function almost solves ACOE. In particular,
     it solves the ACOE for a slightly di erent cost function; and
     the error term can be estimated.


   TD learning with policy improvement gives a near optimal policy
   in a few iterations, as shown by experiments.


   Application in power management for processors.
References

  [1] W. Chen, D. Huang, A. Kulkarni, J. Unnikrishnan, Q. Zhu, P. Mehta, S. Meyn, and A. Wierman.
      Approximate dynamic programming using fluid and diffusion approximations with applications to power
      management. Accepted for inclusion in the 48th IEEE Conference on Decision and Control, December
      16-18 2009.

  [1] P. Mehta and S. Meyn. Q-learning and Pontryagin’s Minimum Principle. To appear in Proceedings of
      the 48th IEEE Conference on Decision and Control, December 16-18 2009.

  [1] R.-R. Chen and S. P. Meyn. Value iteration and optimization of multiclass queueing networks. Queueing
      Syst. Theory Appl., 32(1-3):65–97, 1999.

  [1] S. G. Henderson, S. P. Meyn, and V. B. Tadi´. Performance evaluation and policy selection in multiclass
                                                   c
      networks. Discrete Event Dynamic Systems: Theory and Applications, 13(1-2):149–189, 2003. Special
      issue on learning, optimization and decision making (invited).

  [1] S. P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general
      state space. IEEE Trans. Automat. Control, 42(12):1663–1680, 1997.

  [1] S. P. Meyn. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, 2007.

  [1] C. Moallemi, S. Kumar, and B. Van Roy. Approximate and data-driven dynamic programming for
      queueing networks. Preprint available at http://moallemi.com/ciamac/research-interests.php, 2008.

Contenu connexe

Similaire à Approximate dynamic programming using fluid and diffusion approximations with applications to power management

Computational Intelligence Approach for Predicting the Hardness Performances ...
Computational Intelligence Approach for Predicting the Hardness Performances ...Computational Intelligence Approach for Predicting the Hardness Performances ...
Computational Intelligence Approach for Predicting the Hardness Performances ...
Waqas Tariq
 
CAE REPORT
CAE REPORTCAE REPORT
CAE REPORT
dayahisa
 
Cfd fem-09 compatibility-and_accuracy_of_mesh_valeo
Cfd fem-09 compatibility-and_accuracy_of_mesh_valeoCfd fem-09 compatibility-and_accuracy_of_mesh_valeo
Cfd fem-09 compatibility-and_accuracy_of_mesh_valeo
Anand Kumar Chinni
 
MEMS Extraction & Verification
MEMS Extraction & VerificationMEMS Extraction & Verification
MEMS Extraction & Verification
intellisense
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..
butest
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..
butest
 
Why Deep Learning Works: Dec 13, 2018 at ICSI, UC Berkeley
Why Deep Learning Works: Dec 13, 2018 at ICSI, UC BerkeleyWhy Deep Learning Works: Dec 13, 2018 at ICSI, UC Berkeley
Why Deep Learning Works: Dec 13, 2018 at ICSI, UC Berkeley
Charles Martin
 

Similaire à Approximate dynamic programming using fluid and diffusion approximations with applications to power management (20)

FEA Analysis & Re-Design of a Bicycle Crank Arm
FEA Analysis & Re-Design of a Bicycle Crank ArmFEA Analysis & Re-Design of a Bicycle Crank Arm
FEA Analysis & Re-Design of a Bicycle Crank Arm
 
Final report Review
Final report ReviewFinal report Review
Final report Review
 
Acm Tech Talk - Decomposition Paradigms for Large Scale Systems
Acm Tech Talk - Decomposition Paradigms for Large Scale SystemsAcm Tech Talk - Decomposition Paradigms for Large Scale Systems
Acm Tech Talk - Decomposition Paradigms for Large Scale Systems
 
Programming with Relaxed Synchronization
Programming with Relaxed SynchronizationProgramming with Relaxed Synchronization
Programming with Relaxed Synchronization
 
Solution to ELD problem
Solution to ELD problemSolution to ELD problem
Solution to ELD problem
 
Genetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale ModelingGenetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale Modeling
 
Computational Intelligence Approach for Predicting the Hardness Performances ...
Computational Intelligence Approach for Predicting the Hardness Performances ...Computational Intelligence Approach for Predicting the Hardness Performances ...
Computational Intelligence Approach for Predicting the Hardness Performances ...
 
CAE REPORT
CAE REPORTCAE REPORT
CAE REPORT
 
Cfd fem-09 compatibility-and_accuracy_of_mesh_valeo
Cfd fem-09 compatibility-and_accuracy_of_mesh_valeoCfd fem-09 compatibility-and_accuracy_of_mesh_valeo
Cfd fem-09 compatibility-and_accuracy_of_mesh_valeo
 
MEMS Extraction & Verification
MEMS Extraction & VerificationMEMS Extraction & Verification
MEMS Extraction & Verification
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..
 
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...
 
An02 dws
An02 dwsAn02 dws
An02 dws
 
ENS Macrh 2022.pdf
ENS Macrh 2022.pdfENS Macrh 2022.pdf
ENS Macrh 2022.pdf
 
Steady state CFD analysis of C-D nozzle
Steady state CFD analysis of C-D nozzle Steady state CFD analysis of C-D nozzle
Steady state CFD analysis of C-D nozzle
 
Why Deep Learning Works: Dec 13, 2018 at ICSI, UC Berkeley
Why Deep Learning Works: Dec 13, 2018 at ICSI, UC BerkeleyWhy Deep Learning Works: Dec 13, 2018 at ICSI, UC Berkeley
Why Deep Learning Works: Dec 13, 2018 at ICSI, UC Berkeley
 
“Imaging Systems for Applied Reinforcement Learning Control,” a Presentation ...
“Imaging Systems for Applied Reinforcement Learning Control,” a Presentation ...“Imaging Systems for Applied Reinforcement Learning Control,” a Presentation ...
“Imaging Systems for Applied Reinforcement Learning Control,” a Presentation ...
 
Menatel delta qs opi
Menatel delta qs opiMenatel delta qs opi
Menatel delta qs opi
 
CSPA 2008 Presentation
CSPA 2008 PresentationCSPA 2008 Presentation
CSPA 2008 Presentation
 

Plus de Sean Meyn

Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Sean Meyn
 
State Space Collapse in Resource Allocation for Demand Dispatch - May 2019
State Space Collapse in Resource Allocation for Demand Dispatch - May 2019State Space Collapse in Resource Allocation for Demand Dispatch - May 2019
State Space Collapse in Resource Allocation for Demand Dispatch - May 2019
Sean Meyn
 
Distributed Randomized Control for Ancillary Service to the Power Grid
Distributed Randomized Control for Ancillary Service to the Power GridDistributed Randomized Control for Ancillary Service to the Power Grid
Distributed Randomized Control for Ancillary Service to the Power Grid
Sean Meyn
 
2012 Tutorial: Markets for Differentiated Electric Power Products
2012 Tutorial:  Markets for Differentiated Electric Power Products2012 Tutorial:  Markets for Differentiated Electric Power Products
2012 Tutorial: Markets for Differentiated Electric Power Products
Sean Meyn
 

Plus de Sean Meyn (20)

Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
 
DeepLearn2022 1. Goals & AlgorithmDesign.pdf
DeepLearn2022 1. Goals & AlgorithmDesign.pdfDeepLearn2022 1. Goals & AlgorithmDesign.pdf
DeepLearn2022 1. Goals & AlgorithmDesign.pdf
 
DeepLearn2022 3. TD and Q Learning
DeepLearn2022 3. TD and Q LearningDeepLearn2022 3. TD and Q Learning
DeepLearn2022 3. TD and Q Learning
 
DeepLearn2022 2. Variance Matters
DeepLearn2022  2. Variance MattersDeepLearn2022  2. Variance Matters
DeepLearn2022 2. Variance Matters
 
Smart Grid Tutorial - January 2019
Smart Grid Tutorial - January 2019Smart Grid Tutorial - January 2019
Smart Grid Tutorial - January 2019
 
State Space Collapse in Resource Allocation for Demand Dispatch - May 2019
State Space Collapse in Resource Allocation for Demand Dispatch - May 2019State Space Collapse in Resource Allocation for Demand Dispatch - May 2019
State Space Collapse in Resource Allocation for Demand Dispatch - May 2019
 
Irrational Agents and the Power Grid
Irrational Agents and the Power GridIrrational Agents and the Power Grid
Irrational Agents and the Power Grid
 
Zap Q-Learning - ISMP 2018
Zap Q-Learning - ISMP 2018Zap Q-Learning - ISMP 2018
Zap Q-Learning - ISMP 2018
 
Introducing Zap Q-Learning
Introducing Zap Q-Learning   Introducing Zap Q-Learning
Introducing Zap Q-Learning
 
Reinforcement Learning: Hidden Theory and New Super-Fast Algorithms
Reinforcement Learning: Hidden Theory and New Super-Fast AlgorithmsReinforcement Learning: Hidden Theory and New Super-Fast Algorithms
Reinforcement Learning: Hidden Theory and New Super-Fast Algorithms
 
State estimation and Mean-Field Control with application to demand dispatch
State estimation and Mean-Field Control with application to demand dispatchState estimation and Mean-Field Control with application to demand dispatch
State estimation and Mean-Field Control with application to demand dispatch
 
Demand-Side Flexibility for Reliable Ancillary Services
Demand-Side Flexibility for Reliable Ancillary ServicesDemand-Side Flexibility for Reliable Ancillary Services
Demand-Side Flexibility for Reliable Ancillary Services
 
Spectral Decomposition of Demand-Side Flexibility for Reliable Ancillary Serv...
Spectral Decomposition of Demand-Side Flexibility for Reliable Ancillary Serv...Spectral Decomposition of Demand-Side Flexibility for Reliable Ancillary Serv...
Spectral Decomposition of Demand-Side Flexibility for Reliable Ancillary Serv...
 
Demand-Side Flexibility for Reliable Ancillary Services in a Smart Grid: Elim...
Demand-Side Flexibility for Reliable Ancillary Services in a Smart Grid: Elim...Demand-Side Flexibility for Reliable Ancillary Services in a Smart Grid: Elim...
Demand-Side Flexibility for Reliable Ancillary Services in a Smart Grid: Elim...
 
Why Do We Ignore Risk in Power Economics?
Why Do We Ignore Risk in Power Economics?Why Do We Ignore Risk in Power Economics?
Why Do We Ignore Risk in Power Economics?
 
Distributed Randomized Control for Ancillary Service to the Power Grid
Distributed Randomized Control for Ancillary Service to the Power GridDistributed Randomized Control for Ancillary Service to the Power Grid
Distributed Randomized Control for Ancillary Service to the Power Grid
 
Ancillary service to the grid from deferrable loads: the case for intelligent...
Ancillary service to the grid from deferrable loads: the case for intelligent...Ancillary service to the grid from deferrable loads: the case for intelligent...
Ancillary service to the grid from deferrable loads: the case for intelligent...
 
2012 Tutorial: Markets for Differentiated Electric Power Products
2012 Tutorial:  Markets for Differentiated Electric Power Products2012 Tutorial:  Markets for Differentiated Electric Power Products
2012 Tutorial: Markets for Differentiated Electric Power Products
 
Control Techniques for Complex Systems
Control Techniques for Complex SystemsControl Techniques for Complex Systems
Control Techniques for Complex Systems
 
Tutorial for Energy Systems Week - Cambridge 2010
Tutorial for Energy Systems Week - Cambridge 2010Tutorial for Energy Systems Week - Cambridge 2010
Tutorial for Energy Systems Week - Cambridge 2010
 

Dernier

Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...
Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...
Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...
amitlee9823
 
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
amitlee9823
 
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...
amitlee9823
 
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
amitlee9823
 
Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...
Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...
Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...
gajnagarg
 
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...
nirzagarg
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
 
How to Build a Simple Shopify Website
How to Build a Simple Shopify WebsiteHow to Build a Simple Shopify Website
How to Build a Simple Shopify Website
mark11275
 

Dernier (20)

Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...
Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...
Vip Mumbai Call Girls Bandra West Call On 9920725232 With Body to body massag...
 
Hire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls AgencyHire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
 
Just Call Vip call girls Nagpur Escorts ☎️8617370543 Starting From 5K to 25K ...
Just Call Vip call girls Nagpur Escorts ☎️8617370543 Starting From 5K to 25K ...Just Call Vip call girls Nagpur Escorts ☎️8617370543 Starting From 5K to 25K ...
Just Call Vip call girls Nagpur Escorts ☎️8617370543 Starting From 5K to 25K ...
 
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
 
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
 
Gamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad IbrahimGamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad Ibrahim
 
Sweety Planet Packaging Design Process Book.pptx
Sweety Planet Packaging Design Process Book.pptxSweety Planet Packaging Design Process Book.pptx
Sweety Planet Packaging Design Process Book.pptx
 
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
 
8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available
8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available
8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available
 
Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...
Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...
Just Call Vip call girls dharamshala Escorts ☎️9352988975 Two shot with one g...
 
High Profile Escorts Nerul WhatsApp +91-9930687706, Best Service
High Profile Escorts Nerul WhatsApp +91-9930687706, Best ServiceHigh Profile Escorts Nerul WhatsApp +91-9930687706, Best Service
High Profile Escorts Nerul WhatsApp +91-9930687706, Best Service
 
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...
 
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
 
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
 
Q4-W4-SCIENCE-5 power point presentation
Q4-W4-SCIENCE-5 power point presentationQ4-W4-SCIENCE-5 power point presentation
Q4-W4-SCIENCE-5 power point presentation
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
 
How to Build a Simple Shopify Website
How to Build a Simple Shopify WebsiteHow to Build a Simple Shopify Website
How to Build a Simple Shopify Website
 

Approximate dynamic programming using fluid and diffusion approximations with applications to power management

  • 1. Approximate Dynamic Programming using Fluid and Diffusion Approximations with Applications to Power Management Speaker: Dayu Huang Wei Chen, Dayu Huang, Ankur A. Kulkarni,1Jayakrishnan Unnikrishnan, Quanyan Zhu, Prashant Mehta, Sean Meyn, and Adam Wierman 2 Coordinated Science Laboratory, UIUC Dept. of IESE, UIUC 1 Dept. of CS, California Inst. of Tech. 2 4 120 3 100 80 J 2 1 60 National Science Foundation (ECS-0523620 and CCF-0830511), 0 40 ITMANET DARPA RK 2006-07284, and Microsoft Research −1 20 −2 n 0 x 0 1 2 3 4 5 6 7 8 9 10 x 104 0 2 4 6 8 10 12 14 16 18 20
  • 2. Introduction MDP model Control i.i.d Cost Minimize average cost
  • 3. Introduction MDP model Control i.i.d Cost Minimize average cost Generator
  • 4. Introduction MDP model Control i.i.d Cost Minimize average cost Average Cost Optimality Equation (ACOE) Generator Relative value function Solve ACOE and Find
  • 5. TD Learning The “curse of dimensionality”: Complexity of solving ACOE grows exponentially with the dimension of the state space. Approximate within a nite-dimensional function class Criterion: minimize the mean-squre error solved by stochastic approximation algorithms
  • 6. TD Learning The “curse of dimensionality”: Complexity of solving ACOE grows exponentially with the dimension of the state space. Approximate within a nite-dimensional function class Criterion: minimize the mean-squre error solved by stochastic approximation algorithms Problem: How to select the basis functions ? key to the success of TD learning
  • 7. Total cost for an associated deterministic model is a tight approximation to 120 100 Fluid value function 80 Relative value function 60 40 20 0 2 4 6 8 10 12 14 16 18 20 can be used as a part of the basis
  • 8. Related Work Multiclass queueing network Meyn 1997, Meyn 1997b optimal control Chen and Meyn 1999 simulation Hendersen et.al. 2003 network scheduling Veatch 2004 and routing Moallemi, Kumar and Van Roy 2006 Meyn 2007 Control Techniques for Complex Networks other approaches Tsitsiklis and Van Roy 1997 Mannor, Menache and Shimkin 2005
  • 9. Related Work Multiclass queueing network Meyn 1997, Meyn 1997b optimal control Chen and Meyn 1999 simulation Hendersen et.al. 2003 network scheduling Veatch 2004 and routing Moallemi, Kumar and Van Roy 2006 Meyn 2007 Control Techniques for Complex Networks other approaches Tsitsiklis and Van Roy 1997 Mannor, Menache and Shimkin 2005 Taylor series approximation this work
  • 10. Power Management via Speed Scaling Bansal, Kimbrel and Pruhs 2007 Wierman, Andrew and Tang 2009 Single processor job arrivals processing rate determined by the current power Control the processing speed to balance delay and energy costs Kaxiras and Martonosi 2008 Processor design: polynomial cost Wierman, Andrew and Tang 2009 This talk We also consider for wireless communication applications
  • 11. Fluid Model MDP Fluid model: Total Cost Total Cost Optimality Equation (TCOE) for the uid model:
  • 12. Why Fluid Model? MDP First order Taylor series approximation
  • 13. Why Fluid Model? MDP First order Taylor series approximation TCOE ACOE almost solves the ACOE Simple but powerful idea!
  • 14. Policy 180 160 Stochastic optimal policy 140 myopic policy 120 Di erence 100 80 60 40 20 0 −20 0 2 4 6 8 10 12 14 16 18 20 x
  • 15. Value Iteration 250 200 150 Initialization: V0 0 Initialization: V0 = 100 50 0 5 10 15 20 n (See also [Chen and Meyn 1999])
  • 16. Approximation of the Cost Function Error Analysis constant?
  • 17. Approximation of the Cost Function Error Analysis constant? Surrogate cost approximates Bounds on ?
  • 18. Structure Results on the Fluid Solution
  • 19. Lower Bound Convexity of
  • 20. Lower Bound Convexity of
  • 24. Approach Based on Fluid and Di usion Models this talk: uid model Total cost for Value function of the uid model an associated deterministic model is a tight approximation to 120 100 Fluid value function 80 Relative value function 60 40 20 0 2 4 6 8 10 12 14 16 18 20 can be used as a part of the basis
  • 25. TD Learning Experiment Basis functions: 4 120 3 100 Approximate relative value function Fluid value function 2 80 Relative value function 1 60 0 40 −1 20 −2 0 0 1 2 3 4 5 6 7 8 9 10 x 104 0 2 4 6 8 10 12 14 16 18 20 Estimates of Coe cients for the case of quadratic cost
  • 26. TD Learning with Policy Improvement 3 Average cost at stage 2 0 5 10 15 20 25 Nearly optimal after just a few iterations Need the value of the optimal policy
  • 27. Conclusions The uid value function can be used as a part of the basis for TD-learning. Motivated by analysis using Taylor series expansion: The uid value function almost solves ACOE. In particular, it solves the ACOE for a slightly di erent cost function; and the error term can be estimated. TD learning with policy improvement gives a near optimal policy in a few iterations, as shown by experiments. Application in power management for processors.
  • 28. References [1] W. Chen, D. Huang, A. Kulkarni, J. Unnikrishnan, Q. Zhu, P. Mehta, S. Meyn, and A. Wierman. Approximate dynamic programming using fluid and diffusion approximations with applications to power management. Accepted for inclusion in the 48th IEEE Conference on Decision and Control, December 16-18 2009. [1] P. Mehta and S. Meyn. Q-learning and Pontryagin’s Minimum Principle. To appear in Proceedings of the 48th IEEE Conference on Decision and Control, December 16-18 2009. [1] R.-R. Chen and S. P. Meyn. Value iteration and optimization of multiclass queueing networks. Queueing Syst. Theory Appl., 32(1-3):65–97, 1999. [1] S. G. Henderson, S. P. Meyn, and V. B. Tadi´. Performance evaluation and policy selection in multiclass c networks. Discrete Event Dynamic Systems: Theory and Applications, 13(1-2):149–189, 2003. Special issue on learning, optimization and decision making (invited). [1] S. P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Trans. Automat. Control, 42(12):1663–1680, 1997. [1] S. P. Meyn. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, 2007. [1] C. Moallemi, S. Kumar, and B. Van Roy. Approximate and data-driven dynamic programming for queueing networks. Preprint available at http://moallemi.com/ciamac/research-interests.php, 2008.